# LeanRAG: Knowledge-Graph-Based Generation with Semantic Aggregation and Hierarchical Retrieval
LeanRAG is an efficient, open-source framework for Retrieval-Augmented Generation, leveraging knowledge graph structures with semantic aggregation and hierarchical retrieval to generate context-aware, concise, and high-fidelity responses.
Update (API-first refactor): All LLM interactions now use a unified OpenAI-compatible client (`OpenAIChatClient`). Legacy local model orchestration (Ollama / vLLM) and the older `InstanceManager` have been removed. The `CommonKG` folder is retained temporarily as a legacy reference and will be deprecated; prefer the GraphRAG (default) extraction path.
- API-First Architecture: Cloud-based LLMs and embeddings (OpenAI/xAI, Together AI) - no local GPU requirements
- Modern Database Stack: Qdrant for vectors, Neo4j for graphs, SQLite for relational data
- Semantic Aggregation: Clusters entities into semantically coherent summaries and constructs explicit relations to form a navigable aggregation-level knowledge network.
- Hierarchical, Structure-Guided Retrieval: Initiates retrieval from fine-grained entities and traverses up the knowledge graph to gather rich, highly relevant evidence efficiently.
- Reduced Redundancy: Optimizes retrieval paths to significantly reduce redundant informationโLeanRAG achieves ~46% lower retrieval redundancy compared to flat retrieval baselines (based on benchmark evaluations).
- Benchmark Performance: Demonstrates superior performance across multiple QA benchmarks with improved response quality and retrieval efficiency.
LeanRAG's processing pipeline follows these core stages with a modern database architecture:
- Qdrant: Vector storage for efficient similarity search on embeddings
- Neo4j: Graph database for knowledge graph storage and Cypher-based queries
- SQLite: Lightweight relational storage for metadata and intermediate results
- API Services: Cloud-based LLMs and embedding models (no local GPU requirements)
1. **Semantic Aggregation**: Group low-level entities into clusters; generate summary nodes and build adjacency relations among them for efficient navigation.
2. **Knowledge Graph Construction**: Construct a multi-layer graph where nodes represent entities and aggregated summaries, with explicit inter-node relations for graph-based traversal.
3. **Query Processing & Hierarchical Retrieval**: Anchor queries at the most relevant detailed entities ("bottom-up"), then traverse upward through the semantic aggregation graph to collect evidence spans.
4. **Redundancy-Aware Synthesis**: Streamline retrieval paths and avoid overlapping content, ensuring concise evidence aggregation before generating responses.
5. **Generation**: Use retrieved, well-structured evidence as input to an LLM to produce coherent, accurate, and contextually grounded answers.
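The bottom-up retrieval and redundancy-aware steps above can be sketched in plain Python. The graph, node names, and traversal policy here are toy assumptions for illustration, not LeanRAG's actual data structures:

```python
# Toy sketch of bottom-up hierarchical retrieval over a two-level graph.
# Leaf entities link to an aggregation-level summary node; evidence is
# collected once per node so shared summaries are not duplicated.
from collections import OrderedDict

# Each fine-grained entity points to its (hypothetical) summary node.
PARENT = {
    "neural network": "machine learning",
    "gradient descent": "machine learning",
    "knowledge graph": "information retrieval",
}

# Evidence text attached to each node (entity or summary).
EVIDENCE = {
    "neural network": "A neural network is a layered function approximator.",
    "gradient descent": "Gradient descent iteratively minimizes a loss.",
    "machine learning": "Summary: methods that learn patterns from data.",
    "knowledge graph": "A knowledge graph stores entities and relations.",
    "information retrieval": "Summary: finding relevant items in collections.",
}

def retrieve(anchor_entities):
    """Anchor at fine-grained entities, then climb to summary nodes,
    deduplicating nodes reached via multiple paths (redundancy-aware)."""
    seen = OrderedDict()  # preserves order, dedupes revisited nodes
    for entity in anchor_entities:
        node = entity
        while node is not None:
            if node not in seen:
                seen[node] = EVIDENCE[node]
            node = PARENT.get(node)  # traverse upward; None at the top
    return list(seen.values())

# Both anchors share one parent summary, which is collected only once.
print(retrieve(["neural network", "gradient descent"]))
```

Note how the shared "machine learning" summary appears once even though both anchor entities lead to it; this is the path-overlap pruning the pipeline description refers to.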
## Command Line Interface (CLI)

LeanRAG provides a unified CLI tool for easy workflow management:
The CLI is included with LeanRAG. No additional installation required.
```bash
# Check system status
leanrag check

# Chunk documents
leanrag chunk datasets/mix/mix.jsonl --strategy semantic --chunk-size 1024

# Extract triples and entities
leanrag extract output/mix/mix_chunk.json

# Build knowledge graph
leanrag build output/mix/

# Query the knowledge graph
leanrag query "What is machine learning?" output/mix/ --top-k 5

# Run complete pipeline (with guidance)
leanrag pipeline datasets/mix/mix.jsonl --query "What is AI?"
```

### `leanrag check`

Validate environment setup and database connectivity.
### `leanrag chunk`

Chunk documents for processing. Supports individual files or entire directories.
- `--strategy`: `semantic`, `hybrid`, or `fixed_token` (default: `semantic`)
- `--chunk-size`: Maximum tokens per chunk (default: 1024)
- `--overlap`: Token overlap between chunks (default: 128)
- `--output-dir`: Output directory (default: `output`)
Examples:
```bash
# Chunk a single JSONL file
leanrag chunk datasets/mix/mix.jsonl --strategy semantic --chunk-size 1024

# Chunk a single PDF file
leanrag chunk document.pdf --strategy semantic --chunk-size 1024

# Chunk all supported files in a directory (recursively)
leanrag chunk datasets/ --strategy semantic --chunk-size 1024
```

### `leanrag extract`

Extract triples and entities from chunks (GraphRAG path). Supports individual files or entire directories.
- `--output-dir`: Output directory (default: `output`)
Examples:
```bash
# Extract from a single chunk file
leanrag extract output/mix/mix_chunk.json

# Extract from all chunk files in a directory
leanrag extract output/
```

### `leanrag build`

Build or update the knowledge graph from new entities and relationships in the SQLite database.
- `--config`: Configuration file path
Note: Only processes entities/relationships marked as `is_new=1`, then marks them as processed (`is_new=0`).
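The `is_new` bookkeeping described in the note can be sketched with an in-memory SQLite database. The table and column names below are illustrative; the actual `leanrag.db` schema may differ:

```python
# Sketch of the is_new=1 -> is_new=0 workflow using stdlib sqlite3.
# Table and column names are hypothetical, not the real leanrag.db schema.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entities (name TEXT, is_new INTEGER DEFAULT 1)")
conn.executemany(
    "INSERT INTO entities (name, is_new) VALUES (?, ?)",
    [("alan turing", 1), ("enigma", 1), ("bletchley park", 0)],
)

# The 'build' step picks up only unprocessed rows...
new_rows = conn.execute(
    "SELECT name FROM entities WHERE is_new = 1"
).fetchall()
# ...would push them into the graph store here, then marks them processed.
conn.execute("UPDATE entities SET is_new = 0 WHERE is_new = 1")
conn.commit()

print([name for (name,) in new_rows])  # only the two new entities
```

Running `leanrag build` a second time would then find no rows with `is_new = 1` and do no graph work, which is why repeated builds are cheap.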
### `leanrag query`

Query the knowledge graph.
- `--top-k`: Number of top entities to retrieve (default: 10)
- `--chunks-file`: Path to chunks file (auto-detected if not provided)
### `leanrag pipeline`

Run the complete pipeline with guidance.
- `--query`: Optional query to run after building
- `--output-dir`: Output directory
## Getting Started

- Python 3.10+
- Conda for environment management (optional)
- Neo4j (graph database) or Docker
- Qdrant (vector database) or Docker
- API keys for LLM services (OpenAI/xAI, Together AI)
1. Clone the repository:

   ```bash
   git clone https://github.com/RaZzzyz/LeanRAG.git
   cd LeanRAG
   ```

2. Create a virtual environment:

   ```bash
   conda create -n leanrag python=3.11
   conda activate leanrag
   # Or using venv
   python -m venv leanrag-env
   source leanrag-env/bin/activate  # On Windows: leanrag-env\Scripts\activate
   ```

3. Install the required dependencies:

   ```bash
   pip install -r requirements.txt
   ```

   Note: LeanRAG uses API-based models exclusively. There are no local model weights or GPU requirements; all processing happens via cloud APIs.

4. Set up the databases (see the Database Architecture section).

5. Configure environment variables by copying and editing `.env`:

   ```bash
   cp .env.example .env  # If the example exists, otherwise create .env
   # Edit .env with your API keys and database URLs
   ```
## Database Architecture

LeanRAG uses a modern, API-first database architecture optimized for cloud deployment:
### Qdrant

- Purpose: High-performance vector similarity search for embeddings
- Version: 1.15.1+
- Features: Cosine similarity, efficient indexing, scalable
- Configuration: Set `QDRANT_URL`, `QDRANT_API_KEY`, `QDRANT_COLLECTION` in `.env`
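Qdrant handles the similarity search at scale; as a quick illustration of the cosine metric listed among its features, here is a plain-Python version over toy vectors (real embeddings come from the embedding API, and real search goes through Qdrant):

```python
# Plain-Python illustration of cosine-similarity ranking, the metric the
# vector collection uses. The three-dimensional vectors are toy examples.
import math

def cosine(a, b):
    """Cosine similarity: dot product over the product of vector norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

corpus = {
    "chunk-1": [0.9, 0.1, 0.0],
    "chunk-2": [0.1, 0.9, 0.1],
    "chunk-3": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]

# Rank chunks by similarity to the query, best first (top-k with k=2).
top2 = sorted(corpus, key=lambda cid: cosine(query, corpus[cid]), reverse=True)[:2]
print(top2)
```

A vector database replaces this linear scan with an approximate nearest-neighbor index, which is what makes the retrieval stage fast at corpus scale.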
### Neo4j

- Purpose: Native graph storage and traversal for knowledge graphs
- Version: 5.23+ Community Edition
- Features: Real-time graph analytics, ACID transactions, Cypher queries
- Configuration: Set `GRAPH_URI`, `GRAPH_USER`, `GRAPH_PASSWORD` in `.env`
### SQLite

- Purpose: Lightweight storage for metadata and intermediate results
- Benefits: No server setup required, file-based, ACID compliant
- Location: `leanrag.db` in the working directory
### API Services

- LLM: OpenAI-compatible API (xAI, OpenAI, etc.)
- Embeddings: Together AI API for text embeddings
- Benefits: No local GPU requirements, scalable, cost-effective
1. Install Neo4j (for graph operations):

   ```bash
   # Using Docker (recommended)
   docker run -d \
     --name leanrag-neo4j \
     -p 7474:7474 -p 7687:7687 \
     -v ./neo4j_data:/data \
     -v ./neo4j_logs:/logs \
     -e NEO4J_AUTH=neo4j/test123456 \
     neo4j:5.23-community
   ```

2. Install Qdrant (for vector search):

   ```bash
   # Using Docker
   docker run -p 6333:6333 qdrant/qdrant
   ```

3. Configure environment (`.env` file):

   ```bash
   # Qdrant vector database
   QDRANT_URL=http://localhost:6333
   QDRANT_API_KEY=
   QDRANT_COLLECTION=leanrag-vectors

   # Neo4j graph database
   GRAPH_URI=bolt://localhost:7687
   GRAPH_USER=neo4j
   GRAPH_PASSWORD=test123456

   # API services
   TOGETHER_API_KEY=your_key_here
   OPENAI_API_KEY=your_xai_key_here
   OPENAI_BASE_URL=https://api.x.ai/v1
   ```
Here's a typical pipeline flow using the LeanRAG CLI:
```bash
# Chunk a single file
leanrag chunk datasets/mix/mix.jsonl --strategy semantic --chunk-size 1024

# Or chunk an entire directory (processes all .pdf and .jsonl files recursively)
leanrag chunk datasets/ --strategy semantic --chunk-size 1024
```

Options:

- `--strategy`: Choose `semantic` (recommended), `hybrid`, or `fixed_token`
- `--chunk-size`: Maximum tokens per chunk (default: 1024)
- `--overlap`: Token overlap between chunks (default: 128)

Output: `output/<dataset_name>/<dataset_name>_chunk.json` with enhanced chunk metadata
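The `--chunk-size` / `--overlap` semantics can be sketched as a sliding window. For simplicity this sketch treats whitespace-separated words as "tokens"; the real pipeline counts tokens with `tiktoken` when it is installed:

```python
# Sketch of fixed-size chunking with overlap, mirroring the --chunk-size
# and --overlap options. "Tokens" here are whitespace words, an
# approximation of the real tokenizer-based counting.

def chunk_tokens(tokens, chunk_size, overlap):
    """Slide a chunk_size window over tokens, stepping by chunk_size - overlap
    so consecutive chunks share `overlap` tokens of context."""
    assert 0 <= overlap < chunk_size
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reached the end
    return chunks

words = ("LeanRAG chunks documents before extraction so each piece "
         "fits the model context window").split()
for piece in chunk_tokens(words, chunk_size=6, overlap=2):
    print(" ".join(piece))
```

The overlap keeps sentences that straddle a chunk boundary visible in both neighbors, which is why a non-zero `--overlap` default (128) is used.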
GraphRAG Extraction
```bash
# Extract from a single chunk file
leanrag extract output/mix/mix_chunk.json

# Or extract from all chunk files in a directory (recursively)
leanrag extract output/
```

Note: Entities and relationships are stored in the SQLite database with an `is_new` flag.
```bash
# Build graph from new entities and relationships in SQLite
leanrag build output/
```

Note: Only processes entities/relationships marked as `is_new=1`, then marks them as processed.
```bash
leanrag query "What is machine learning?" output/mix/ --top-k 5
```

Returns context-aware answers with evidence from the knowledge graph.
For new users, run the guided pipeline:
```bash
leanrag pipeline datasets/mix/mix.jsonl --query "What is AI?"
```

This provides step-by-step guidance and automatically sets up the workflow.
If you prefer manual control, you can still call core Python scripts:
```bash
# Chunk documents
python file_chunk.py

# Extract triples (configure LLM endpoints first)
python GraphExtraction/chunk.py

# Build graph
python build_graph.py

# Query
python query_graph.py
```

We gratefully acknowledge the use of the following open-source projects in our work:
- nano-graphrag: a simple, easy-to-hack GraphRAG implementation
- HiRAG: a novel hierarchical entity aggregation and optimized retrieval RAG method
If you find LeanRAG useful, please cite our paper:
```bibtex
@misc{zhang2025leanragknowledgegraphbasedgenerationsemantic,
  title={LeanRAG: Knowledge-Graph-Based Generation with Semantic Aggregation and Hierarchical Retrieval},
  author={Yaoze Zhang and Rong Wu and Pinlong Cai and Xiaoman Wang and Guohang Yan and Song Mao and Ding Wang and Botian Shi},
  year={2025},
  eprint={2508.10391},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2508.10391},
}
```
`tqdm` and `tiktoken` are optional. If they are not installed, the code falls back gracefully (token counting becomes heuristic; progress bars are disabled automatically). To disable progress bars explicitly, set:

```bash
export PROGRESS=0
```
The chat client looks for (in order): `OPENAI_API_KEY`, `OPENAI_BASE_URL` (defaults to `https://api.openai.com/v1`), and a model via `MODEL_LLM` or `OPENAI_MODEL`. Set one of the model env vars, e.g.:

```bash
export OPENAI_API_KEY=sk-...
export MODEL_LLM=grok-4-fast-reasoning
```
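The lookup order described above can be sketched as follows. This is a hypothetical mirror of the documented precedence, not the actual `OpenAIChatClient` internals:

```python
# Sketch of the documented env-var resolution order: OPENAI_API_KEY is
# required, OPENAI_BASE_URL falls back to the OpenAI default, and the
# model comes from MODEL_LLM first, then OPENAI_MODEL.
import os

def resolve_chat_config(env=os.environ):
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is required")
    base_url = env.get("OPENAI_BASE_URL", "https://api.openai.com/v1")
    model = env.get("MODEL_LLM") or env.get("OPENAI_MODEL")
    if not model:
        raise RuntimeError("set MODEL_LLM or OPENAI_MODEL")
    return {"api_key": api_key, "base_url": base_url, "model": model}

cfg = resolve_chat_config({"OPENAI_API_KEY": "sk-test",
                           "MODEL_LLM": "grok-4-fast-reasoning"})
print(cfg["base_url"], cfg["model"])
```

Because `OPENAI_BASE_URL` is just a default, pointing it at another OpenAI-compatible endpoint (such as `https://api.x.ai/v1` from the `.env` example) switches providers without code changes.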
Embeddings still use the Together AI API (`TOGETHER_API_KEY`).
