
RAG-R1 : Incentivizing the Search and Reasoning Capabilities of LLMs through Multi-query Parallelism

If you like our project, please give us a star ⭐ on GitHub for the latest update.

✨ News

[2025/07/01] 🔥🔥🔥 We propose RAG-R1, a deep-search training framework that incentivizes the search and reasoning capabilities of LLMs through multi-query parallelism.

💡 Overview

RAG-R1 is a novel training framework that enables LLMs to adaptively leverage internal and external knowledge, significantly enhancing their reasoning capabilities. Its cornerstone is multi-query parallelism, an architectural change that directly addresses the high latency and inherent brittleness of conventional single-query methods. Extensive experiments on seven QA benchmarks demonstrate the effectiveness of our method, which outperforms the strongest baseline by up to 13.7% while decreasing inference time by 11.1%, achieving a superior trade-off between reasoning robustness and inference efficiency.
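Conceptually, multi-query parallelism lets the model emit several search queries in one reasoning turn and serve them concurrently, instead of paying one retrieval round trip per turn. A minimal Python sketch of the idea, where the `retrieve` stub stands in for a real (network-bound) retriever call:

```python
from concurrent.futures import ThreadPoolExecutor

def retrieve(query: str) -> str:
    """Stand-in for a retriever call (a network round trip in practice)."""
    return f"docs for: {query}"

def single_query_rounds(queries):
    # Conventional loop: one query per reasoning turn, results arrive serially.
    return [retrieve(q) for q in queries]

def multi_query_parallel(queries):
    # Multi-query style: the model emits several queries in one turn,
    # and the retriever serves them concurrently.
    with ThreadPoolExecutor(max_workers=len(queries)) as pool:
        return list(pool.map(retrieve, queries))

queries = ["who wrote Hamlet", "when was Hamlet written"]
assert multi_query_parallel(queries) == single_query_rounds(queries)
```

With a real retriever, the parallel version's wall-clock time is roughly that of the slowest single query rather than the sum of all of them.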

🌐 Framework

RAG-R1-framework

Overall framework of RAG-R1.

🏆 Performance

Main Results

RAG-R1-result

Performance comparisons on QA benchmarks under the EM metric. The best and second best results are bold and underlined, respectively.

🛠 Dependencies

We train our model with python==3.10.13, torch==2.4.0, and cuda==12.1.105. To get started, install the required dependencies:

pip install -r requirements.txt

🚀 Quick Start

Train a reasoning-and-search LLM on our dataset, using BGE-large-en-v1.5 as the retriever and the August 2019 Wikipedia corpus released by KILT as the knowledge source.

(1) Download the index and corpus.

You can build the corpus and index using the following commands or directly download the corpus and index we have constructed.

## Process wiki full texts
cd RAG-R1/data/kilt
wget http://dl.fbaipublicfiles.com/KILT/kilt_knowledgesource.json

## Build corpus and index
cd RAG-R1/scripts/build_corpus
python split_kilt_to_100.py
## We divided the corpus into 10 parts and built indexes separately for each to prevent memory overflow.
python build_corpus_embedding.py
python build_corpus_index.py
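The sharding strategy mentioned in the comment above can be sketched in a few lines. This is a toy numpy stand-in, not the repo's scripts: `embed` returns random L2-normalized vectors where the real pipeline would call BGE-large-en-v1.5, and the actual scripts presumably persist proper per-shard indexes.

```python
import numpy as np

def embed(passages):
    # Placeholder for BGE-large-en-v1.5 embeddings: random but L2-normalized,
    # so inner product equals cosine similarity.
    rng = np.random.default_rng(0)
    vecs = rng.standard_normal((len(passages), 8)).astype("float32")
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

def build_sharded_index(passages, n_shards=10):
    # Mirror the repo's strategy: split the corpus into shards and embed
    # each shard separately so peak memory stays bounded.
    shards = np.array_split(np.arange(len(passages)), n_shards)
    return [(ids, embed([passages[i] for i in ids])) for ids in shards]
```

Each shard carries its global passage ids alongside its embedding matrix, so results from different shards can later be compared directly.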

You can also download the pre-built corpus and index chunks here and run the following commands to merge the indexes.

cd RAG-R1/scripts/build_corpus
python build_corpus_index.py
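Merging shard indexes works because inner-product scores do not depend on which shard a vector came from, so shard matrices can simply be concatenated and searched jointly. A hedged numpy sketch of that idea (the actual merge in build_corpus_index.py likely operates on persisted index files rather than in-memory arrays):

```python
import numpy as np

def merge_shards(shards):
    # Concatenate per-shard id lists and embedding matrices into one index.
    ids = np.concatenate([s[0] for s in shards])
    vecs = np.vstack([s[1] for s in shards])
    return ids, vecs

def top_k(query_vec, ids, vecs, k=3):
    # Inner-product search; with L2-normalized vectors this is cosine similarity.
    scores = vecs @ query_vec
    order = np.argsort(-scores)[:k]
    return ids[order]
```

Because scores are computed per vector, searching the merged matrix returns exactly the same ranking as searching each shard and merging the results.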

(2) Launch a local retrieval server.

python start_retrieval.py

## test retrieval server
python RAG-R1/rag_r1/search/retrieval_request.py
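A retrieval request could be issued as a JSON POST. The endpoint path, port, and payload schema below are assumptions for illustration; consult start_retrieval.py and retrieval_request.py for the actual interface.

```python
import json
from urllib import request

# Hypothetical endpoint: the real host, port, and route are defined
# by start_retrieval.py; adjust to match your server.
RETRIEVAL_URL = "http://127.0.0.1:8000/retrieve"

def build_payload(queries, topk=3):
    # In the multi-query setting, a batch of queries goes out in one request.
    return json.dumps({"queries": queries, "topk": topk}).encode()

def search(queries, topk=3):
    req = request.Request(RETRIEVAL_URL, data=build_payload(queries, topk),
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Batching queries into a single request is what lets the server retrieve for several sub-questions concurrently.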

(3) Run SFT training with Qwen2.5-7b-instruct.

The SFT training data are available in the data/SFT directory. We recommend using LLaMA-Factory for SFT training.

(4) Run RL training (PPO) with the SFT model or Qwen2.5-7b-instruct.

## prepare RL training data
cd RAG-R1/scripts/data_process
##  single-query mode RL data
python hotpotqa_search.py
##  multi-query parallelism RL data
python hotpotqa_search_mq.py

cd RAG-R1
## run single-query mode RL training
bash train_ppo.sh
## run multi-query parallelism RL training
bash train_ppo_mq.sh

For single-query mode training, you need to replace the content of rag_r1/llm_agent/generation.py with the content of rag_r1/llm_agent/generation_sq.py.
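The two generation modes differ mainly in how many queries are parsed out of one model turn. As an illustration, here is a minimal parser; the `<search>` tag format and newline-separated queries are assumptions borrowed from Search-R1-style prompting, so see rag_r1/llm_agent/generation.py for the actual parsing logic.

```python
import re

def parse_queries(text):
    # Extract search queries emitted between <search>...</search> tags.
    # In multi-query mode one block may hold several newline-separated
    # queries; in single-query mode it holds exactly one.
    block = re.search(r"<search>(.*?)</search>", text, re.DOTALL)
    if not block:
        return []
    return [q.strip() for q in block.group(1).splitlines() if q.strip()]

single = "<search>who wrote Hamlet</search>"
multi = "<search>\nwho wrote Hamlet\nwhen was Hamlet written\n</search>"
assert parse_queries(single) == ["who wrote Hamlet"]
assert len(parse_queries(multi)) == 2
```

All extracted queries from one turn can then be sent to the retrieval server in a single batched request instead of one round trip each.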

🙏 Acknowledgements

RAG-R1 is inspired by Deepseek-R1 with its implementation based on veRL and Search-r1. We deeply appreciate the contributions of these teams to open-source research and development.

✍️ Citation

Please cite our paper if our work is helpful for your research.

@article{RAG-R1,
  title={RAG-R1 : Incentivizing the Search and Reasoning Capabilities of LLMs through Multi-query Parallelism}, 
  author={Zhiwen Tan and Jiaming Huang and Qintong Wu and Hongxuan Zhang and Chenyi Zhuang and Jinjie Gu},
  journal={arXiv preprint arXiv:2507.02962v5},
  year={2025}
}

📧 Contact

For any question or feedback, please reach out to us at [email protected].

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.