¹The Hong Kong Polytechnic University   ²Sichuan University
*Corresponding author
This repository is the official implementation of DRAG (Debate-Augmented RAG), a novel training-free framework designed to reduce hallucinations in Retrieval-Augmented Generation (RAG) systems.
Retrieval-Augmented Generation (RAG) is designed to mitigate hallucinations in large language models (LLMs) by retrieving relevant external knowledge to support factual generation. However, biased or erroneous retrieval results can mislead the generation, compounding the hallucination problem rather than solving it. In this work, we refer to this cascading issue as Hallucination on Hallucination: a phenomenon in which the model's factual mistakes are not only caused by internal reasoning flaws but are also triggered or worsened by unreliable retrieved content.
To address this, we implement DRAG, a training-free framework that integrates multi-agent debate (MAD) mechanisms into both the retrieval and generation stages. These debates help dynamically refine queries, reduce bias, and promote factually grounded, robust answers.
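The two-stage debate loop can be sketched in a few lines. This is a minimal toy sketch under our own assumptions, not the repository's implementation: `retrieve`, `refine`, and the `agents` callables stand in for the retriever and the LLM debaters.

```python
from collections import Counter

def retrieval_debate(question, retrieve, refine, max_rounds=2):
    """Retrieval Debate (sketch): iteratively refine the query until it converges,
    then return the evidence retrieved for the final query."""
    query = question
    evidence = retrieve(query)
    for _ in range(max_rounds):
        refined = refine(query, evidence)  # in DRAG, an LLM agent critiques the query
        if refined == query:               # debate converged; stop early
            break
        query = refined
        evidence = retrieve(query)
    return evidence

def response_debate(question, evidence, agents, max_rounds=2):
    """Response Debate (sketch): agents answer, see peers' answers, revise,
    and a simple majority vote decides the final answer."""
    answers = [agent(question, evidence, []) for agent in agents]
    for _ in range(max_rounds):
        answers = [agent(question, evidence, answers) for agent in agents]
    return Counter(answers).most_common(1)[0][0]
```

In the real framework, both the query critique and the answer revisions come from prompted LLM agents, and the judge is an LLM rather than a majority vote.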
🔥 [May 24, 2025]: The paper and code were released!
🔥 [May 16, 2025]: Our paper was accepted by ACL 2025!
Clone this repository, then create a drag conda environment and install the packages.
# clone repository
git clone https://github.com/Huenao/Debate-Augmented-RAG.git
# create conda env
conda create -n drag
conda activate drag
# install packages
pip install -r requirements.txt

💡 Note: If you encounter any issues when installing the Python packages using the commands above, we recommend following the official installation instructions provided by FlashRAG#Installation instead.
The datasets used in this project follow the same format as those pre-processed by FlashRAG#Datasets. All datasets are available at Huggingface datasets.
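FlashRAG-formatted splits are JSON Lines files; a minimal loader is sketched below. The field names (`id`, `question`, `golden_answers`) follow FlashRAG's dataset schema, and the example path is illustrative — verify both against the files you actually download.

```python
import json

def load_flashrag_split(path):
    """Read one dataset split (e.g. dataset/NQ/test.jsonl) into a list of dicts,
    skipping blank lines. Each record is expected to carry at least
    'id', 'question', and 'golden_answers'."""
    samples = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                samples.append(json.loads(line))
    return samples
```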
After downloading the dataset, please create a /dataset folder in the project directory and place the downloaded data inside. The directory structure should be as follows:
Debate-Augmented-RAG
├── assets
├── config
├── dataset
│ ├── 2wiki
│ ├── HotpotQA
│ ├── NQ
│ ├── PopQA
│ ├── StrategyQA
│ └── TriviaQA
├── misc
├── model
├── output
├── wiki_corpus
├── main.py
├── README.md
└── requirements.txt

Currently, DRAG supports only the following six datasets: NQ, TriviaQA, PopQA, 2WikiMultihopQA, HotpotQA, and StrategyQA.
💡 Note: If you wish to use a custom dataset path, simply modify the `data_dir` field in `config/base_config.yaml` accordingly.
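For example, the relevant entry in `config/base_config.yaml` might look like the following (the value shown simply matches the default directory layout above):

```yaml
data_dir: "dataset/"
```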
We use the wiki18_100w dataset provided by FlashRAG#index as the document corpus, along with the preprocessed index generated by its e5-base-v2 retriever.
Both the document corpus and the index can be downloaded from the retrieval_corpus folder at ModelScope.
After downloading, please create a wiki_corpus folder in the project root and place both files inside it. The directory structure should look like:
Debate-Augmented-RAG
├── assets
├── config
├── dataset
├── misc
├── model
├── output
├── wiki_corpus
│ ├── e5_flat_inner.index
│ └── wiki18_100w.jsonl
├── main.py
├── README.md
└── requirements.txt

💡 Note: If you wish to use a custom corpus path, simply modify the `index_path` and `corpus_path` fields in `config/base_config.yaml` accordingly.
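Matching the directory layout above, the corresponding entries in `config/base_config.yaml` would look roughly like this (the exact key layout may differ in your config):

```yaml
index_path: "wiki_corpus/e5_flat_inner.index"
corpus_path: "wiki_corpus/wiki18_100w.jsonl"
```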
This project supports all LLMs compatible with HuggingFace and vLLM. Please specify the path to your downloaded model using the model2path field in config/base_config.yaml.
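For instance, a `model2path` entry might map the model name used on the command line to a local checkpoint directory (the path below is hypothetical):

```yaml
model2path:
  llama3-8B-instruct: "model/Meta-Llama-3-8B-Instruct"  # hypothetical local path
```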
python main.py --method_name "DRAG" \
--gpu_id "0" \
--dataset_name "StrategyQA" \
    --generator_model "llama3-8B-instruct"

- `--method_name`: Specifies the RAG method to use. Supported options: `DRAG` (default), `Naive Gen`, `Naive RAG`, `FLARE`, `Iter-RetGen`, `IRCoT`, `SuRe`, `Self-RAG`, `MAD`.
- `--gpu_id`: Specifies the GPU device ID to use.
- `--dataset_name`: Specifies the dataset to use. Supported options: `NQ`, `TriviaQA`, `PopQA`, `2wiki`, `HotpotQA`, `StrategyQA`.
- `--generator_model`: Specifies the generation model to use.
Additionally, when using DRAG, you can customize the number of debate rounds for each phase by setting the --max_query_debate_rounds and --max_answer_debate_rounds parameters, which control the Retrieval Debate and Response Debate stages, respectively.
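For example, a run with custom round counts might look like this (the round values here are illustrative, not recommended settings):

```
python main.py --method_name "DRAG" \
    --gpu_id "0" \
    --dataset_name "StrategyQA" \
    --generator_model "llama3-8B-instruct" \
    --max_query_debate_rounds 2 \
    --max_answer_debate_rounds 3
```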
To better visualize and analyze the results, we use HTML4Vision to generate HTML files that visualize the entire debate process.
python misc/vis_naive_gen.py --file_path output/path-to-results-folder

FlashRAG: A Python toolkit for the reproduction and development of Retrieval-Augmented Generation (RAG) research. We thank the authors for their excellent work.
Thank you for your interest in our work. If you find this work useful, please cite it as follows:
@inproceedings{hu-etal-2025-removal,
title = "Removal of Hallucination on Hallucination: Debate-Augmented {RAG}",
author = "Hu, Wentao and
Zhang, Wengyu and
Jiang, Yiyang and
Zhang, Chen Jason and
Wei, Xiaoyong and
  Li, Qing",
booktitle = "Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
month = jul,
year = "2025",
address = "Vienna, Austria",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.acl-long.770/",
pages = "15839--15853",
ISBN = "979-8-89176-251-0",
}