
# Removal of Hallucination on Hallucination: Debate-Augmented RAG

¹The Hong Kong Polytechnic University   ²Sichuan University
\*Corresponding author


## 🌟 Introduction

This repository is the official implementation of DRAG (Debate-Augmented RAG), a novel training-free framework designed to reduce hallucinations in Retrieval-Augmented Generation (RAG) systems.

*Figure: Overview of the DRAG framework.*

Retrieval-Augmented Generation (RAG) is designed to mitigate hallucinations in large language models (LLMs) by retrieving relevant external knowledge to support factual generation. However, biased or erroneous retrieval results can mislead the generation, compounding the hallucination problem rather than solving it. In this work, we refer to this cascading issue as Hallucination on Hallucination: a phenomenon where the model's factual mistakes are not only caused by internal reasoning flaws but are also triggered or worsened by unreliable retrieved content.

To address this, we implement DRAG, a training-free framework that integrates multi-agent debate (MAD) mechanisms into both the retrieval and generation stages. These debates help dynamically refine queries, reduce bias, and promote factually grounded, robust answers.
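The debate mechanism can be sketched as a simple loop in which agents take alternating turns before a judge settles the answer. The sketch below is a minimal illustration of the multi-agent debate pattern; the agent functions, round count, and judging rule are hypothetical stand-ins, not the project's actual API (in DRAG, each agent would be an LLM call conditioned on the debate history):

```python
# Minimal sketch of a multi-agent debate loop.
# All functions below are hypothetical stand-ins for LLM-backed agents.

def proponent(question, history):
    # Stand-in: a real agent would prompt an LLM with the question,
    # the retrieved evidence, and the debate history so far.
    return f"Answer supporting retrieved evidence for: {question}"

def opponent(question, history):
    # Stand-in: a real agent would challenge the current answer.
    return f"Challenge to the current answer for: {question}"

def judge(history):
    # Stand-in: a real judge agent would weigh both sides of the debate.
    # Here we naively return the last proponent turn.
    return history[-2]

def debate(question, rounds=2):
    history = []
    for _ in range(rounds):
        history.append(proponent(question, history))
        history.append(opponent(question, history))
    return judge(history)

print(debate("Who wrote Hamlet?"))
```

In DRAG this pattern is applied twice: once over candidate queries (Retrieval Debate) and once over candidate answers (Response Debate).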

## 🔥 News

- 🔥 [May 24, 2025]: The paper and code were released!
- 🔥 [May 16, 2025]: Our paper was accepted by ACL 2025!

## 🚀 QuickStart

### Installation

Clone this repository, then create a `drag` conda environment and install the required packages.

```bash
# clone the repository
git clone https://github.com/Huenao/Debate-Augmented-RAG.git
cd Debate-Augmented-RAG
# create and activate the conda environment
conda create -n drag
conda activate drag
# install the packages
pip install -r requirements.txt
```

💡 Note: If you encounter any issues when installing the Python packages using the commands above, we recommend following the official installation instructions provided by FlashRAG#Installation instead.

### Dataset

The datasets used in this project follow the same format as those pre-processed by FlashRAG#Datasets. All datasets are available on Hugging Face Datasets.
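As a rough illustration, FlashRAG-style datasets are JSONL files where each line holds an id, a question, and its gold answers. The field names below follow the common FlashRAG layout, but check the downloaded files for the exact schema:

```python
import json

# One illustrative record in FlashRAG-style JSONL
# (field names assume the common FlashRAG schema; verify against your files).
line = '{"id": "0", "question": "What is the capital of France?", "golden_answers": ["Paris"]}'
record = json.loads(line)

print(record["question"])        # the query text
print(record["golden_answers"])  # list of reference answers
```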

After downloading a dataset, please create a `dataset` folder in the project root and place the downloaded data inside. The directory structure should be as follows:

```
Debate-Augmented-RAG
├── assets
├── config
├── dataset
│   ├── 2wiki
│   ├── HotpotQA
│   ├── NQ
│   ├── PopQA
│   ├── StrategyQA
│   └── TriviaQA
├── misc
├── model
├── output
├── wiki_corpus
├── main.py
├── README.md
└── requirements.txt
```

Currently, DRAG supports only the following six datasets: NQ, TriviaQA, PopQA, 2WikiMultihopQA, HotpotQA, and StrategyQA.

💡 Note: If you wish to use a custom dataset path, simply modify the `data_dir` field in `config/base_config.yaml` accordingly.

### Document Corpus & Index

We use the wiki18_100w dataset provided by FlashRAG#index as the document corpus, along with the preprocessed index generated by its e5-base-v2 retriever.

Both the document corpus and the index can be downloaded from the retrieval_corpus folder at ModelScope.

After downloading, please create a wiki_corpus folder in the project root and place both files inside it. The directory structure should look like:

```
Debate-Augmented-RAG
├── assets
├── config
├── dataset
├── misc
├── model
├── output
├── wiki_corpus
│   ├── e5_flat_inner.index
│   └── wiki18_100w.jsonl
├── main.py
├── README.md
└── requirements.txt
```

💡 Note: If you wish to use a custom corpus path, simply modify the `index_path` and `corpus_path` fields in `config/base_config.yaml` accordingly.
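For instance, with the directory layout shown above, the relevant fields in `config/base_config.yaml` would look roughly like this (paths are illustrative; key names as in the note above):

```yaml
# config/base_config.yaml — corpus and index locations (illustrative values)
index_path: wiki_corpus/e5_flat_inner.index
corpus_path: wiki_corpus/wiki18_100w.jsonl
```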

### Base Model

This project supports all LLMs compatible with HuggingFace and vLLM. Please specify the path to your downloaded model using the `model2path` field in `config/base_config.yaml`.
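A `model2path` entry might look like the following; the model name and checkpoint path are placeholders, so point them at your own download:

```yaml
# config/base_config.yaml — map a model name to a local checkpoint
# (path below is a placeholder)
model2path:
  llama3-8B-instruct: /path/to/Meta-Llama-3-8B-Instruct
```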

### Running

```bash
python main.py --method_name "DRAG" \
               --gpu_id "0" \
               --dataset_name "StrategyQA" \
               --generator_model "llama3-8B-instruct"
```

- `--method_name`: The RAG method to use. Supported: DRAG (default), Naive Gen, Naive RAG, FLARE, Iter-RetGen, IRCoT, SuRe, Self-RAG, MAD.
- `--gpu_id`: The GPU device ID to use.
- `--dataset_name`: The dataset to use. Supported: NQ, TriviaQA, PopQA, 2wiki, HotpotQA, StrategyQA.
- `--generator_model`: The generation model to use.
Additionally, when using DRAG, you can customize the number of debate rounds for each phase by setting the `--max_query_debate_rounds` and `--max_answer_debate_rounds` parameters, which control the Retrieval Debate and Response Debate stages, respectively.
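For example, to run DRAG with two debate rounds in each stage (the round counts and dataset choice here are illustrative):

```bash
python main.py --method_name "DRAG" \
               --gpu_id "0" \
               --dataset_name "HotpotQA" \
               --generator_model "llama3-8B-instruct" \
               --max_query_debate_rounds 2 \
               --max_answer_debate_rounds 2
```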

### Visualization

To better visualize and analyze the results, we use HTML4Vision to generate HTML files that visualize the entire debate process.

```bash
python misc/vis_naive_gen.py --file_path output/path-to-results-folder
```

## ✨ Acknowledgments

FlashRAG: A Python toolkit for the reproduction and development of Retrieval Augmented Generation (RAG) research. We thank the authors for their excellent work.

## 🔗 Citation

Thank you for your interest in our work. If this work is useful to you, please cite it as follows:

```bibtex
@inproceedings{hu-etal-2025-removal,
    title = "Removal of Hallucination on Hallucination: Debate-Augmented {RAG}",
    author = "Hu, Wentao  and
      Zhang, Wengyu  and
      Jiang, Yiyang  and
      Zhang, Chen Jason  and
      Wei, Xiaoyong  and
      Li, Qing",
    booktitle = "Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.acl-long.770/",
    pages = "15839--15853",
    ISBN = "979-8-89176-251-0",
}
```
