😎 Awesome AI Auto-Research

This repository accompanies the survey paper "AI for Auto-Research: Roadmap & User Guide" and tracks papers on AI-assisted and automated scientific research, covering the full research lifecycle.

Background


Phase 1: Creation	Generating novel research ideas, searching and synthesizing literature, running coding experiments, and creating publication-quality tables and figures. This phase spans Idea Generation, Literature Review, Coding & Experiments, and Tables & Figures.
Phase 2: Writing	Drafting, editing, and polishing academic manuscripts. AI assistance ranges from semi-automated grammar and citation tools to fully automated paper generation. This is the most commercially mature yet ethically contested stage.
Phase 3: Validation	Automated peer review generation, reviewer-paper matching, review quality assessment, and AI-assisted author rebuttals. This phase covers Peer Review and Rebuttal & Revision.
Phase 4: Dissemination	Converting papers into slides, posters, videos, websites, and social media content. Each output format targets a different audience and demands its own design logic and AI tool chain.

0. Background
1. Idea Generation
2. Literature Review & Paper Search
3. Coding & Experimentation
4. Tables & Figures
5. Peer Review
6. Rebuttal
7. Paper Writing
8. Dissemination (Paper2X)
9. End-to-End Systems
10. Societal & Critical Perspectives
11. Surveys & Curated Lists
12. Tools & GitHub Repos

1. Idea Generation

LLM Internal Knowledge-Based Generation

Model	Paper	Venue	Website	GitHub

`SciMON`	SciMON: Scientific Inspiration Machines Optimized for Novelty	ACL 2024
`ResearchAgent`	ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models	arXiv 2024	-
`AI Scientist`	The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery	arXiv 2024	-
`Idea Gen Agent`	Can LLMs Generate Novel Research Ideas? A Large Scale Human Study with 100+ NLP Researchers	arXiv 2024	-	-
`Chain of Ideas`	Chain of Ideas: Revolutionizing Research Via Novel Idea Development with LLM Agents	arXiv 2024	-
`Spark`	Spark: A System for Scientifically Creative Idea Generation	ICCC 2025	-	-
`AI Scientist v2`	The AI Scientist v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search	arXiv 2025	-
`AI-Researcher`	AI-Researcher: Autonomous Scientific Innovation	arXiv 2025	-	-
`EvoIdeator`	EvoIdeator: Evolving Scientific Ideas through Checklist-Grounded Reinforcement Learning	arXiv 2026	-	-
`Rubric Rewards`	Training AI Co-Scientists Using Rubric Rewards	arXiv 2025	-	-

External Signal-Driven Generation

Model	Paper	Venue	Website	GitHub

`SciAgents`	SciAgents: Automating Scientific Discovery through Multi-Agent Intelligent Graph Reasoning	arXiv 2024	-
`MOOSE-Chem`	MOOSE-Chem: Large Language Models for Rediscovering Unseen Chemistry Scientific Hypotheses	arXiv 2024	-	-
`Nova`	Nova: An Iterative Planning and Search Approach to Enhance Novelty and Diversity of LLM Generated Ideas	arXiv 2024	-	-
`SciPIP`	SciPIP: An LLM-based Scientific Paper Idea Proposer	arXiv 2024	-
`IdeaSynth`	IdeaSynth: Iterative Research Idea Development Through Evolving and Composing Idea Facets with Literature-Grounded Feedback	CHI 2025		-
`MOOSE-Chem2`	MOOSE-Chem2: Exploring LLM Limits in Fine-Grained Scientific Hypothesis Discovery via Hierarchical Search	arXiv 2025	-	-
`Kosmos`	Kosmos: An AI Scientist for Autonomous Discovery	arXiv 2025	-	-
`Literature-Driven Scientific Theories`	Generating Literature-Driven Scientific Theories at Scale	arXiv 2026	-	-
`FlowPIE`	FlowPIE: Test-Time Scientific Idea Evolution with Flow-Guided Literature Exploration	arXiv 2026	-	-

Multi-Agent Collaborative Generation

Model	Paper	Venue	Website	GitHub

`VirSci (Many Heads)`	Many Heads Are Better Than One: Improved Scientific Idea Generation by an LLM-Based Multi-Agent System	ACL 2025	-
`LLMs for Combinatorial Creativity`	LLMs can Realize Combinatorial Creativity: Generating Creative Ideas via LLMs for Scientific Research	arXiv 2024	-	-
`Multi-Agent Dial.`		SIGDIAL 2025		-
`Deep Ideation`	Deep Ideation: Designing LLM Agents to Generate Novel Research Ideas on Scientific Concept Network	arXiv 2025	-
`Artificial Hivemind`	Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond)	NeurIPS 2025	-	-

Novelty and Feasibility Assessment

Model	Paper	Venue	Website	GitHub

`IdeaBench`		arXiv 2024	-	-
`LiveIdeaBench`	LiveIdeaBench: Evaluating LLMs' Scientific Creativity and Idea Generation with Minimal Context	arXiv 2024	-	-
`ResearchBench`	ResearchBench: Benchmarking LLMs in Scientific Discovery via Inspiration-Based Task Decomposition	ACL 2026	-	-
`AI Idea Bench 2025`	AI Idea Bench 2025: AI Research Idea Generation Benchmark	arXiv 2025	-
`The Ideation-Execution Gap`	The Ideation-Execution Gap: Execution Outcomes of LLM-Generated versus Human Research Ideas	arXiv 2025	-	-
`HeurekaBench`	HeurekaBench: A Benchmarking Framework for AI Co-scientist	ICLR 2026
`Why LLMs Aren't Scientists Yet`	Why LLMs Aren't Scientists Yet	arXiv 2026	-	-
`AI Can Learn Scientific Taste`	AI Can Learn Scientific Taste	arXiv 2026	-	-
`HindSight`	HindSight: Evaluating LLM-Generated Research Ideas via Future Impact	arXiv 2026	-	-
`DeepInnovator`	DeepInnovator: Triggering the Innovative Capabilities of LLMs	arXiv 2026	-

2. Literature Review & Paper Search

Literature Retrieval

Model	Paper	Venue	Website	GitHub

`LitLLM`	LitLLM: A Toolkit for Literature Review with Large Language Models	arXiv 2024		-
`OpenResearcher`		EMNLP 2024		-
`CiteME`	CiteME: Can Language Models Accurately Cite Scientific Claims?	arXiv 2024	-	-
`LitSearch`	LitSearch: A Retrieval Benchmark for Scientific Literature Search	arXiv 2024	-
`PaperQA2`	Language Agents Achieve Superhuman Synthesis of Scientific Knowledge	arXiv 2024	-
`PaSa`	PaSa: An LLM Agent for Comprehensive Academic Paper Search	arXiv 2025	-

Survey & Related Work Generation

Model	Paper	Venue	Website	GitHub

`ChatPaper`	-	GitHub 2023	-
`PaperQA`	PaperQA: Retrieval-Augmented Generative Agent for Scientific Research	arXiv 2023	-
`STORM`	Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models	arXiv 2024	-
`AutoSurvey`	AutoSurvey: Large Language Models Can Automatically Write Surveys	arXiv 2024	-
`GPT Researcher`	-	GitHub 2024	-
`LLMs for Automated Literature Review`	Large Language Models for Automated Literature Review	arXiv 2024	-	-
`SurveyX`	SurveyX: Academic Survey Automation via Large Language Models	arXiv 2025		-
`SurveyForge`	SurveyForge: On the Outline Heuristics, Memory-Driven Generation, and Multi-dimensional Evaluation for Automated Survey Writing	arXiv 2025	-
`CiteGeist`	Citegeist: Automated Generation of Related Work Analysis on the arXiv Corpus	arXiv 2025		-
`InteractiveSurvey`	InteractiveSurvey: An LLM-based Personalized and Interactive Survey Paper Generation System	arXiv 2025	-
`Agentic AutoSurvey`	Agentic AutoSurvey: Let LLMs Survey LLMs	arXiv 2025	-	-
`LiRA`	LiRA: A Multi-Agent Framework for Reliable and Readable Literature Review Generation	arXiv 2025	-	-
`SurveyG`	SurveyG: A Multi-Agent LLM Framework with Hierarchical Citation Graph for Automated Survey Generation	arXiv 2025	-	-
`IterSurvey`	IterSurvey: Deep Literature Survey Automation with an Iterative Workflow	arXiv 2025	-
`CiteLLM`	CiteLLM: An Agentic Platform for Trustworthy Scientific Reference Discovery	arXiv 2026	-	-

Deep Research Agents

Model	Paper	Venue	Website	GitHub

`ASReview`	An Open Source Machine Learning Framework for Efficient and Transparent Systematic Reviews	Nature MI 2021
`CHIME`	CHIME: LLM-Assisted Hierarchical Organization of Scientific Studies for Literature Review Support	arXiv 2024	-	-
`OpenScholar`	OpenScholar: Synthesizing Scientific Literature with Retrieval-Augmented LMs	Nature 2025	-	-
`DeepResearchAgent`	-	GitHub 2025	-
`DeerFlow`	-	GitHub 2025	-
`Tongyi DeepResearch`	-	GitHub 2025	-
`Auto-Deep-Research`	AutoAgent: A Fully-Automated and Zero-Code Framework for LLM Agents	arXiv 2025	-	-
`O-Researcher`	O-Researcher: An Open Ended Deep Research Model via Multi-Agent Distillation and Agentic RL	arXiv 2026	-	-

Retrieval and Synthesis Quality Assessment

Model	Paper	Venue	Website	GitHub

`ReportBench`	ReportBench: Evaluating Deep Research Agents via Academic Survey Tasks	arXiv 2025	-
`DeepScholar-Bench`	DeepScholar-Bench: A Live Benchmark and Automated Evaluation for Generative Research Synthesis	arXiv 2025	-
`IDRBench`	IDRBench: Interactive Deep Research Benchmark	arXiv 2026	-	-
`ScholarGym`	ScholarGym: Benchmarking Large Language Model Capabilities in the Information-Gathering Stage of Deep Research	arXiv 2026	-	-
`SciNetBench`	SciNetBench: A Relation-Aware Benchmark for Scientific Literature Retrieval Agents	arXiv 2026	-	-

3. Coding & Experimentation

Code Generation

Model	Paper	Venue	Website	GitHub

`SWE-Bench`	SWE-bench: Can Language Models Resolve Real-World GitHub Issues?	ICLR 2024
`SWE-agent`		arXiv 2024	-
`OpenHands`	OpenHands: An Open Platform for AI Software Developers as Generalist Agents	ICLR 2025	-
`SWE-Bench Pro`	SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks?	arXiv 2025	-	-
`SWE-EVO`	SWE-EVO: Benchmarking Coding Agents in Long-Horizon Software Evolution Scenarios	arXiv 2025	-	-

Paper-to-Code

Model	Paper	Venue	Website	GitHub

`FunSearch`	Mathematical Discoveries from Program Search with Large Language Models	Nature 2024
`SciCode`	SciCode: A Research Coding Benchmark Curated by Scientists	arXiv 2024
`SciReplicate-Bench`	SciReplicate-Bench: Benchmarking LLMs in Agent-driven Algorithmic Reproduction from Research Papers	arXiv 2025
`PaperBench`	PaperBench: Evaluating AI's Ability to Replicate AI Research	arXiv 2025	-
`PaperCoder`	Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning	arXiv 2025	-
`ResearchCodeBench`	ResearchCodeBench: Benchmarking LLMs on Implementing Novel ML Research Code	arXiv 2025	-	-

Experiment Execution & Orchestration

Model	Paper	Venue	Website	GitHub

`BioPlanner`	BioPlanner: Automatic Evaluation of LLMs on Protocol Planning	arXiv 2023	-
`Coscientist`	Autonomous Chemical Research with Large Language Models	Nature 2023		-
`MLAgentBench`	MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation	arXiv 2024	-
`DS-Agent`	DS-Agent: Automated Data Science by Empowering Large Language Models with Case-Based Reasoning	arXiv 2024	-
`ChemCrow`	ChemCrow: Augmenting Large Language Models with Chemistry Tools	Nature MI 2024		-
`CRISPR-GPT`	CRISPR-GPT for Agentic Automation of Gene-Editing Experiments	arXiv 2024	-	-
`MLR-Copilot`	MLR-Copilot: Autonomous Machine Learning Research based on Large Language Models Agents	arXiv 2024	-	-
`MLE-Bench`	MLE-Bench: Evaluating Machine Learning Agents on Machine Learning Engineering	arXiv 2024	-	-
`autoresearch (Karpathy)`	-	GitHub 2025	-
`Dolphin`	Dolphin: Moving Towards Closed-loop Auto-research through Thinking, Practice, and Feedback	arXiv 2025		-
`AIDE`	AIDE: AI-Driven Exploration in the Space of Code	arXiv 2025	-	-
`Curie`	Curie: Toward Rigorous and Automated Scientific Experimentation with AI Agents	arXiv 2025	-
`MLGym`	MLGym: A New Framework and Benchmark for Advancing AI Research Agents	arXiv 2025	-	-
`CodeScientist`	CodeScientist: End-to-End Semi-Automated Scientific Discovery with Code-based Experimentation	arXiv 2025	-
`AutoReproduce`	AutoReproduce: Automatic AI Experiment Reproduction with Paper Lineage	arXiv 2025	-
`R&D-Agent`	R&D-Agent: An LLM-Agent Framework Towards Autonomous Data Science	arXiv 2025	-
`InternAgent / NovelSeek`	InternAgent: When Agent Becomes the Scientist -- Building Closed-Loop System from Hypothesis to Verification	arXiv 2025
`AlphaEvolve`	AlphaEvolve: A Coding Agent for Scientific and Algorithmic Discovery	arXiv 2025	-	-
`MLR-Bench`	MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research	arXiv 2025	-	-
`Execution-Grounded AI Research`	Towards Execution-Grounded Automated AI Research	arXiv 2026	-	-
`SciNav`	SciNav: A General Agent Framework for Scientific Coding Tasks	arXiv 2026	-	-
`Learn to Discover`	Learning to Discover at Test Time	arXiv 2026	-	-
`FrontierScience`	FrontierScience: Evaluating AI's Ability to Perform Expert-Level Scientific Tasks	arXiv 2026	-	-
`EvoScientist`	EvoScientist: Towards Multi-Agent Evolving AI Scientists for End-to-End Scientific Discovery	arXiv 2026

Code Correctness and Reproducibility Assessment

Model	Paper	Venue	Website	GitHub

`DiscoveryWorld`	DiscoveryWorld: A Virtual Environment for Developing and Evaluating Automated Scientific Discovery Agents	arXiv 2024	-
`DiscoveryBench`	DiscoveryBench: Towards Data-Driven Discovery with Large Language Models	arXiv 2024	-
`Lab-Bench`	Lab-Bench: Measuring Capabilities of Language Models for Biology Research	arXiv 2024
`InfiAgent-DABench`	InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks	arXiv 2024	-	-
`ScienceAgentBench`	ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery	arXiv 2024	-	-
`RE-Bench`	RE-Bench: Evaluating Frontier AI R&D Capabilities of Language Model Agents Against Human Experts	arXiv 2024	-	-
`KernelBench`	KernelBench: Can LLMs Write Efficient GPU Kernels?	arXiv 2025	-
`TritonBench`	TritonBench: Benchmarking Large Language Model Capabilities for Generating Triton Operators	arXiv 2025	-
`AstaBench`	AstaBench: Rigorous Benchmarking of AI Agents with a Scientific Research Suite	arXiv 2025	-
`EXP-Bench`	EXP-Bench: Can AI Conduct AI Research Experiments?	arXiv 2025	-
`ResearchClawBench`	Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows	arXiv 2025	-
`PostTrainBench`	PostTrainBench: Can LLM Agents Automate LLM Post-Training?	arXiv 2026	-

4. Tables & Figures

Scientific Figure Generation

Model	Paper	Venue	Website	GitHub

`ChartGPT`	ChartGPT: Leveraging LLMs to Generate Charts from Abstract Natural Language	arXiv 2023	-	-
`MatPlotAgent`	MatPlotAgent: Method and Evaluation for LLM-Based Agentic Scientific Data Visualization	arXiv 2024	-	-
`DiagramAgent`		CVPR 2025		-
`StarVector`		CVPR 2025		-
`VisCoder`		EMNLP Findings 2025		-
`PlotGen`	PlotGen: Multi-Agent LLM-based Scientific Data Visualization via Multimodal Feedback	arXiv 2025	-	-
`CoDA`	CoDA: Agentic Systems for Collaborative Data Visualization	arXiv 2025	-	-
`VIS-Shepherd`	VIS-Shepherd: Constructing Critic for LLM-based Data Visualization Generation	arXiv 2025	-	-
`PaperBanana`	PaperBanana: Automating Academic Illustration for AI Scientists	arXiv 2026	-	-
`AutoFigure-Edit`	AutoFigure-Edit: Generating Editable Scientific Illustration	arXiv 2026
`AutoFigure`		ICLR 2026	-
`SAIL`	Setting SAIL: Leveraging Scientist-AI-Loops for Rigorous Visualization Tools	arXiv 2026	-	-
-	AI-Generated Figures in Academic Publishing: Policies, Tools, and Practical Guidelines	arXiv 2026	-	-

Table Understanding & Generation

Model	Paper	Venue	Website	GitHub

`Chain-of-Table`	Chain-of-Table: Evolving Tables in Reasoning Chain for Table Understanding	arXiv 2024	-	-
`ArxivDIGESTables`		EMNLP 2024		-
`ShowTable`	ShowTable: Unlocking Creative Table Visualization with Collaborative Reflection and Refinement	arXiv 2025	-	-
`Table2LaTeX-RL`	Table2LaTeX-RL: Converting Table Images to High-Fidelity LaTeX Code Using Reinforced Multimodal Language Models	arXiv 2025	-	-

Mathematical Formulas & TikZ

Model	Paper	Venue	Website	GitHub

`AutomaTikZ`		ICLR 2024		-
`DeTikZify`		NeurIPS 2024		-
`TikZilla`	TikZilla: Scaling Text-to-TikZ with High-Quality Data and Reinforcement Learning	arXiv 2026	-	-

Visual Fidelity and Scientific Accuracy Assessment

Model	Paper	Venue	Website	GitHub

`AbGen`		ACL 2025		-
`PlotCraft`	PlotCraft: Pushing the Limits of LLMs for Complex and Interactive Data Visualization	arXiv 2025	-	-
`TeXpert`	TeXpert: Multi-Level Benchmark for LaTeX Code Generation	arXiv 2025	-	-
`SciFig`	SciFig: Towards Automating Scientific Figure Generation	arXiv 2026	-	-
`SciFlow-Bench`	SciFlow-Bench: Evaluating Structure-Aware Scientific Diagram Generation via Inverse Parsing	arXiv 2026	-	-
`FigureBench`	AutoFigure: Generating and Refining Publication-Ready Scientific Illustrations	ICLR 2026	-

5. Peer Review

Automated Review Generation

Model	Paper	Venue	Website	GitHub

`ChatReviewer`	-	GitHub 2023	-
`MARG`	MARG: Multi-Agent Review Generation for Scientific Papers	arXiv 2024	-	-
`Reviewer2`	Reviewer2: Optimizing Review Generation Through Prompt Generation	arXiv 2024	-	-
`AI-Peer-Review`	-	GitHub 2024	-
`DeepReviewer`	DeepReview: Improving LLM-based Paper Review with Human-like Deep Thinking Process	arXiv 2025		-
`OpenReviewer`	OpenReviewer: A Specialized Large Language Model for Generating Critical Scientific Paper Reviews	arXiv 2025		-
`REMOR`	REMOR: Automated Peer Review Generation with LLM Reasoning and Multi-Objective Reinforcement Learning	arXiv 2025	-	-
`ReviewRL`		EMNLP 2025		-
`ScholarPeer`	ScholarPeer: A Context-Aware Multi-Agent Framework for Automated Peer Review	arXiv 2026	-	-

Meta-Review & Reviewer Matching

Model	Paper	Venue	Website	GitHub

`AgentReview`		EMNLP 2024		-
`LLMs as Meta-Reviewers' Assistants`		NAACL 2025		-
`RATE`	RATE: Reviewer Profiling and Annotation-free Training for Expertise Ranking in Peer Review Systems	arXiv 2026	-	-

Review Quality Assessment

Model	Paper	Venue	Website	GitHub

`ReviewMT`	Peer Review as A Multi-Turn and Long-Context Dialogue with Role-Based Interactions	arXiv 2024	-	-
`ClaimCheck`	ClaimCheck: How Grounded are LLM Critiques of Scientific Papers?	arXiv 2025	-	-
`ReviewAgents`	ReviewAgents: Bridging the Gap Between Human and AI-Generated Paper Reviews	arXiv 2025	-	-
`Stanford Agentic`	-	Web 2025		-
`CycleResearcher`	CycleResearcher: Improving Automated Research via Automated Review	arXiv 2025		-
`ReViewGraph`	Automatic Paper Reviewing with Heterogeneous Graph Reasoning over LLM-Simulated Reviewer-Author Debates	arXiv 2025	-	-
`ReviewerToo`	ReviewerToo: Should AI Join The Program Committee? A Look At The Future of Peer Review	arXiv 2025	-	-
-	A Large-Scale Randomized Study of Large Language Model Feedback in Peer Review	Nature MI 2026		-

Adversarial Attacks & Bias Analysis

Model	Paper	Venue	Website	GitHub

`AI Review Lottery`	The AI Review Lottery: Widespread AI-Assisted Peer Reviews Boost Paper Scores and Acceptance Rates	arXiv 2024	-	-
`Raina et al.`		EMNLP 2024	-	-
`Ye et al.`	Are We There Yet? Revealing the Risks of Utilizing Large Language Models in Scholarly Peer Review	arXiv 2024	-	-
`Breaking the Reviewer`	Breaking the Reviewer: Assessing the Vulnerability of Large Language Models in Automated Peer Review Under Textual Adversarial Attacks	arXiv 2025	-	-
`Sahoo et al.`	When Reject Turns into Accept: Quantifying the Vulnerability of LLM-Based Scientific Reviewers to Indirect Prompt Injection	arXiv 2025	-	-
`Zhou et al.`	"Give a Positive Review Only": An Early Investigation Into In-Paper Prompt Injection Attacks and Defenses for AI Reviewers	arXiv 2025	-	-
`Prompt Injection Attacks on Peer Review`	Prompt Injection Attacks on LLM Generated Reviews of Scientific Publications	arXiv 2025	-	-
`When Your Reviewer is an LLM`	When Your Reviewer is an LLM: Biases, Divergence, and Prompt Injection Risks in Peer Review	arXiv 2025	-	-

Detection & Policy

Model	Paper	Venue	Website	GitHub

`Monitoring AI-Modified Content`	Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews	arXiv 2024	-	-
`AI Text Detection in Peer Review`	Is Your Paper Being Reviewed by an LLM? Benchmarking AI Text Detection in Peer Review	arXiv 2025	-	-
`Detecting LLM-Generated Peer Reviews`	Detecting LLM-Generated Peer Reviews	arXiv 2025	-	-
-	What Happens When Reviewers Receive AI Feedback in Their Reviews?	arXiv 2026	-	-
-	Policies Permitting LLM Use for Polishing Peer Reviews Are Currently Not Enforceable	arXiv 2026	-	-
-	More than Half of Researchers Now Use AI for Peer Review	Nature News 2026		-
`Major Conference Catches Illicit AI Use`	Major Conference Catches Illicit AI Use -- and Rejects Hundreds of Papers	Nature News 2026		-

6. Rebuttal

Reviewer Comment Analysis

Model	Paper	Venue	Website	GitHub

`Insights from ICLR Rebuttal Process`	Insights from ICLR Peer Review and Rebuttal Process	arXiv 2025	-	-

Automated Rebuttal Generation

Model	Paper	Venue	Website	GitHub

`DRPG`	DRPG: An Agentic Framework for Academic Rebuttal	arXiv 2026	-
`Paper2Rebuttal`	Paper2Rebuttal: A Multi-Agent Framework for Transparent Author Response Assistance	arXiv 2026		-
`RebuttalAgent`	RebuttalAgent: Strategic Persuasion in Academic Rebuttal via Theory of Mind	ICLR 2026	-	-
`Author-in-the-Loop`	Author-in-the-Loop Response Generation and Evaluation: Integrating Author Expertise and Intent in Responses to Peer Review	arXiv 2026	-	-

Rebuttal Effectiveness Assessment

Model	Paper	Venue	Website	GitHub

`Re2 Dataset`	Re2: A Consistency-Ensured Dataset for Full-stage Peer Review and Multi-Turn Rebuttal Discussions	arXiv 2025	-	-
`Commitment Checklist`	Commitment Checklist: Auditing Author Commitments in Peer Review	arXiv 2026	-	-

7. Paper Writing

Semi-Automated Writing Assistance

Model	Paper	Venue	Website	GitHub

`CoAuthor`	CoAuthor: Human-AI Collaborative Writing with Language Models	arXiv 2022	-	-
`Script&Shift`		CHI 2025		-
`AI in the Writing Process`		AIED 2025		-
`ScholarCopilot`	ScholarCopilot: Training LLMs for Academic Writing with Integrated Citation	arXiv 2025	-	-
`OpenDraft`	-	GitHub 2025	-
`DraftMarks`	DraftMarks: Enhancing Transparency in Human-AI Co-Writing Through Interactive Skeuomorphic Process Traces	arXiv 2025	-	-
`XtraGPT`	XtraGPT: Context-Aware and Controllable Academic Paper Revision	arXiv 2025	-	-
`PaperDebugger`	PaperDebugger: A Plugin-Based Multi-Agent System for In-Editor Academic Writing, Review, and Editing	arXiv 2025	-
`LimAgents`	Multi-Agent LLMs for Generating Research Limitations	arXiv 2026	-	-

Fully Automated Paper Generation

Model	Paper	Venue	Website	GitHub

`CycleResearcher`	CycleResearcher: Improving Automated Research via Automated Review	ICLR 2025		-
`FutureGen`	FutureGen: A RAG-based Approach to Generate the Future Work of Scientific Article	arXiv 2025	-	-
`APRES`	APRES: An Agentic Paper Revision and Evaluation System	arXiv 2026	-	-

Societal Analysis

Model	Paper	Venue	GitHub

-	Artificial Intelligence Tools Expand Scientists' Impact but Contract Science's Focus	Nature 2026	-
`Nature AI Survey`	More than Half of Researchers Now Use AI for Peer Review	Nature 2026	-

Writing Quality and AI Detection Assessment

Model	Paper	Venue	Website	GitHub

`Mapping LLM Use`	Mapping the Increasing Use of LLMs in Scientific Papers	arXiv 2024	-	-
`CycleReviewer`	CycleResearcher: Improving Automated Research via Automated Review	ICLR 2025		-
`Stanford Agentic`	-	Web 2025		-
`SciIG`	Let's Use ChatGPT To Write Our Paper! Benchmarking LLMs To Write the Introduction of a Research Paper	arXiv 2025	-	-
`Watermarking`	Detecting LLM-Generated Peer Reviews	arXiv 2025	-	-
`PaperWritingBench`	PaperOrchestra: A Multi-Agent Framework for Automated AI Research Paper Writing	arXiv 2026	-	-

8. Dissemination (Paper2X)

Slides & Presentations

Model	Paper	Venue	Website	GitHub

`DOC2PPT`		AAAI 2022		-
`AutoPresent`		CVPR 2025		-
`PASS`	PASS: Presentation Automation for Slide Generation and Speech	arXiv 2025	-	-
`Talk to Your Slides`	Talk to Your Slides: Efficient Slide Editing Agent	arXiv 2025	-	-
`PPTAgent`	PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides	EMNLP 2025
`Auto-Slides`	Auto-Slides: An Interactive Multi-Agent System for Creating and Customizing Research Presentations	arXiv 2025		-
`SlideGen`	SlideGen: Collaborative Multimodal Agents for Scientific Slide Generation	arXiv 2025	-	-
`Paper2Slides`	-	GitHub 2025	-
`SlideTailor`	SlideTailor: Personalized Presentation Slide Generation for Scientific Papers	AAAI 2026
`DeepPresenter`	DeepPresenter: Environment-Grounded Reflection for Agentic Presentation Generation	arXiv 2026	-
`Office Raccoon (SenseTime)`	-	Web 2026		-

Posters

Model	Paper	Venue	Website	GitHub

`P2P (Paper-to-Poster)`	P2P: Automated Paper-to-Poster Generation and Fine-Grained Benchmark	arXiv 2025	-	-
`Paper2Poster`	Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers	arXiv 2025	-
`PosterGen`	PosterGen: Aesthetic-Aware Paper-to-Poster Generation via Multi-Agent LLMs	arXiv 2025		-
`PosterForest`	PosterForest: Hierarchical Multi-Agent Collaboration for Scientific Poster Generation	arXiv 2025	-	-
`APEX`	APEX: Academic Poster Editing Agentic Expert	arXiv 2026	-
`PosterOmni`	PosterOmni: Generalized Artistic Poster Creation via Task Distillation and Unified Reward Feedback	arXiv 2026	-	-

Video & Web

Model	Paper	Venue	Website

`Preacher`	Preacher: Paper-to-Video Agentic System	ICCV 2025
`Paper2Video`	Paper2Video: Automatic Video Generation from Scientific Papers	arXiv 2025
`PresentAgent`	PresentAgent: Multimodal Agent for Presentation Video Generation	arXiv 2025	-
`Paper2Web`	Paper2Web: Let's Make Your Paper Alive!	arXiv 2025	-

Fidelity and Adoption Assessment

Model	Paper	Venue	Website

`PPTEval`	PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides	EMNLP 2025	-
`PresentQuiz`	Paper2Video: Automatic Video Generation from Scientific Papers	arXiv 2025	-
`PresentEval`	PresentAgent: Multimodal Agent for Presentation Video Generation	arXiv 2025	-

9. End-to-End Systems

Fully Automated Research Systems

Model	Paper	Venue	Website	GitHub

`ResearchTown`	ResearchTown: Simulator of Human Research Community	ICML 2025
`Agent Laboratory`	Agent Laboratory: Using LLM Agents as Research Assistants	arXiv 2025	-	-
`AgentRxiv`	AgentRxiv: Towards Collaborative Autonomous Research	arXiv 2025	-	-
`ARIS`	-	GitHub 2025	-
`freephdlabor`	Build Your Personalized Research Group: A Multiagent Framework for Continual and Interactive Science Automation	arXiv 2025	-	-
`SciMaster`	SciMaster: Towards General-Purpose Scientific AI Agents	arXiv 2025	-
-	Towards End-to-End Automation of AI Research	Nature 2026
`Idea2Story`	Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives	arXiv 2026	-	-
`UniScientist`	-	Web 2026		-
`ASI-Evolve`	-	GitHub 2026	-
`FARS`	-	Web 2026		-
`AutoResearchClaw`	-	GitHub 2026	-
`CORAL`	CORAL: Towards Autonomous Multi-Agent Evolution for Open-Ended Discovery	arXiv 2026	-
`AutoSOTA`	AutoSOTA: An End-to-End Automated Research System for State-of-the-Art AI Model Discovery	arXiv 2026	-
`AiScientist-LH`	Toward Autonomous Long-Horizon Engineering for ML Research	arXiv 2026	-	-
`OpenResearcher (2026)`	OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis	arXiv 2026	-
`Aletheia`	Towards Autonomous Mathematics Research	arXiv 2026	-

Domain-Specific Systems

Model	Paper	Venue	Website	GitHub

`AlphaFold 3`	Accurate Structure Prediction of Biomolecular Interactions with AlphaFold 3	Nature 2024		-
`Medical AI Scientist`	Towards a Medical AI Scientist	arXiv 2026	-	-

Evolutionary & Self-Improving Systems

Model	Paper	Venue	Website

`ShinkaEvolve`	ShinkaEvolve: Towards Open-Ended and Sample-Efficient Program Evolution	arXiv 2025	-
`Darwin Godel Machine`	Darwin Godel Machine: Open-Ended Evolution of Self-Improving Agents	arXiv 2025	-

Research Platforms & Infrastructure

Model	Paper	Venue	Website	GitHub

`Towards an AI co-scientist`	Towards an AI co-scientist	arXiv 2025	-	-
`PiFlow`	PiFlow: Principle-aware Scientific Discovery with Multi-Agent Collaboration	arXiv 2025	-	-
`LabClaw`	-	Web 2026		-
-	OpenAI Is Throwing Everything into Building a Fully Automated Researcher	MIT TR 2026		-

10. Societal & Critical Perspectives

Model	Paper	Venue	Website	GitHub

-	Navigating the Jagged Technological Frontier	Org. Sci. 2025		-
-	Reassessing Academic Integrity in the Age of AI	SSH Open 2025		-
`The AI Deskilling Paradox`	The AI Deskilling Paradox	CACM 2025		-
`Hidden Pitfalls of AI Scientist Systems`	The More You Automate, the Less You See: Hidden Pitfalls of AI Scientist Systems	arXiv 2025	-	-
`Rethinking Science in the Age of AI`	Rethinking Science in the Age of Artificial Intelligence	arXiv 2025	-	-
-	Measuring AI Ability to Complete Long Tasks	METR 2025		-
-	Towards a Science of Scaling Agent Systems	arXiv 2025	-	-
-	Artificial Intelligence Tools Expand Scientists' Impact but Contract Science's Focus	Nature 2026		-
-	AI for Scientific Discovery is a Social Problem	Patterns 2026		-
`Research Integrity in the Age of AI`	Research Integrity and Academic Authority in the Age of Artificial Intelligence: From Discovery to Curation?	arXiv 2026	-	-
`SciSciGPT`	SciSciGPT: Advancing Human-AI Collaboration in the Science of Science	Nature CS 2026		-
`SimStep`	SimStep: Chain-of-Abstractions for Incremental Specification and Debugging of AI-Generated Interactive Simulations	arXiv 2025	-	-
`ConvoLearn`	ConvoLearn: A Learning Sciences Grounded Dataset for Fine-Tuning Dialogic AI Tutors	arXiv 2026	-	-
`AFIM: Academic Fraud Inclination Metric`	AFIM: Academic Fraud Inclination Metric	Web 2026		-
-	AI Researchers' Views on Automating AI R&D and Intelligence Explosions	arXiv 2026	-	-
-	AI Scientists Are Changing Research	Nature 2026		-
`Learning by Creating (Talk)`	Learning by Creating: A Human-Centered Vision for AI in Education	Talk 2026		-

11. Surveys & Curated Lists

Model	Paper	Venue	Website	GitHub

`LLM4SR`	LLM4SR: A Survey on Large Language Models for Scientific Research	arXiv 2025	-	-
`From Automation to Autonomy`	From Automation to Autonomy: A Survey on Large Language Models for Scientific Discovery	arXiv 2025	-	-
`AI4Research`	AI4Research: A Survey of Artificial Intelligence for Scientific Research	arXiv 2025	-	-
`A Survey of AI Scientists`	A Survey of AI Scientists	arXiv 2025	-	-
-	Large Language Models for Scientific Idea Generation: A Creativity-Centered Survey	arXiv 2025	-	-
-	Large Language Models for Automated Scholarly Paper Review: A Survey	Inf. Fusion 2025		-

12. Tools & GitHub Repos

Curated Lists

Repository	Stars	Description
Awesome-Deep-Research	~684	Up-to-date collection of agentic deep research resources
Awesome-Scientific-Language-Models	~647	Survey of scientific LLMs (EMNLP'24)
Awesome-LLM-Scientific-Discovery	~319	Three-level autonomy framework (EMNLP'25)
Awesome-AI-Scientist-Papers	~136	Resources on AI Scientist systems
Awesome-Auto-Research-Tools	~129	Automated research tools catalog
awesome-autoresearch	—	Autonomous improvement loops and research agents

Idea Generation

Repository	Stars	Description
Virtual-Scientists	~66	VirSci: multi-agent collaborative idea generation (ACL'25)
ResearchAgent	~29	Iterative idea proposal with reviewing agents

Literature Review

Repository	Stars	Description
paper-qa	~8,300	PaperQA2: superhuman RAG for scientific Q&A
local-deep-research	~4,200	Fully local deep research
researchgpt	~3,500	Conversational interaction with research papers
gpt-researcher	—	Autonomous agent for comprehensive online research
AutoSurvey	~462	Automated comprehensive literature surveys
storm	—	Wikipedia-style article generation (STORM)

Coding & Experiments

Repository	Stars	Description
autoresearch (Karpathy)	~55,400	Autonomous ML experiments, ~12 exp/hour overnight
Paper2Code	~4,300	Multi-agent ML paper to code transformation
RD-Agent	—	Microsoft's LLM framework for autonomous data science
MLAgentBench	~334	13 end-to-end ML experimentation tasks
SWE-bench	—	Real-world GitHub issue resolution benchmark

Peer Review

Repository	Stars	Description
paper-reviewer	~824	arXiv paper reviews + blog posts
ai-peer-review	~123	Multi-LLM reviews + meta-review synthesis
openreviewer	~9	Llama-8B fine-tuned on 79K expert reviews


Last updated: 2026-04-26 · Maintained by WorldBench
