worldbench/awesome-ai-auto-research

😎 Awesome AI Auto-Research

This repository accompanies the survey paper "AI for Auto-Research: Roadmap & User Guide" and tracks papers on AI-assisted and automated scientific research, covering the full research lifecycle.

Background

Phase 1: Creation. Generating novel research ideas, searching and synthesizing literature, running coding experiments, and creating publication-quality tables and figures. This phase spans Idea Generation, Literature Review, Coding & Experiments, and Tables & Figures.
Phase 2: Writing. Drafting, editing, and polishing academic manuscripts. AI assistance ranges from semi-automated grammar and citation tools to fully automated paper generation, the most commercially mature yet ethically contested stage.
Phase 3: Validation. Automated peer review generation, reviewer-paper matching, review quality assessment, and AI-assisted author rebuttals. This phase covers Peer Review and Rebuttal & Revision.
Phase 4: Dissemination. Converting papers into slides, posters, videos, websites, and social media content. Each output format targets a different audience and demands its own design logic and AI tool chain.

1. Idea Generation

LLM Internal Knowledge-Based Generation

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
SciMON arXiv
SciMON: Scientific Inspiration Machines Optimized for Novelty
ACL 2024 Website GitHub
ResearchAgent arXiv
ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models
arXiv 2024 - GitHub
AI Scientist arXiv
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery
arXiv 2024 - GitHub
Idea Gen Agent arXiv
Can LLMs Generate Novel Research Ideas? A Large Scale Human Study with 100+ NLP Researchers
arXiv 2024 - -
Chain of Ideas arXiv
Chain of Ideas: Revolutionizing Research Via Novel Idea Development with LLM Agents
arXiv 2024 - GitHub
Spark arXiv
Spark: A System for Scientifically Creative Idea Generation
ICCC 2025 - -
AI Scientist v2 arXiv
The AI Scientist v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search
arXiv 2025 - GitHub
AI-Researcher arXiv
AI-Researcher: Autonomous Scientific Innovation
arXiv 2025 - -
EvoIdeator arXiv
EvoIdeator: Evolving Scientific Ideas through Checklist-Grounded Reinforcement Learning
arXiv 2026 - -
Rubric Rewards arXiv
Training AI Co-Scientists Using Rubric Rewards
arXiv 2025 - -

External Signal-Driven Generation

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
SciAgents arXiv
SciAgents: Automating Scientific Discovery through Multi-Agent Intelligent Graph Reasoning
arXiv 2024 - GitHub
MOOSE-Chem arXiv
MOOSE-Chem: Large Language Models for Rediscovering Unseen Chemistry Scientific Hypotheses
arXiv 2024 - -
Nova arXiv
Nova: An Iterative Planning and Search Approach to Enhance Novelty and Diversity of LLM Generated Ideas
arXiv 2024 - -
SciPIP arXiv
SciPIP: An LLM-based Scientific Paper Idea Proposer
arXiv 2024 - GitHub
IdeaSynth arXiv
IdeaSynth: Iterative Research Idea Development Through Evolving and Composing Idea Facets with Literature-Grounded Feedback
CHI 2025 Website -
MOOSE-Chem2 arXiv
MOOSE-Chem2: Exploring LLM Limits in Fine-Grained Scientific Hypothesis Discovery via Hierarchical Search
arXiv 2025 - -
Kosmos arXiv
Kosmos: An AI Scientist for Autonomous Discovery
arXiv 2025 - -
Literature-Driven Scientific Theories arXiv
Generating Literature-Driven Scientific Theories at Scale
arXiv 2026 - -
FlowPIE arXiv
FlowPIE: Test-Time Scientific Idea Evolution with Flow-Guided Literature Exploration
arXiv 2026 - -

Multi-Agent Collaborative Generation

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
VirSci (Many Heads) arXiv
Many Heads Are Better Than One: Improved Scientific Idea Generation by an LLM-Based Multi-Agent System
ACL 2025 - GitHub
LLMs for Combinatorial Creativity arXiv
LLMs can Realize Combinatorial Creativity: Generating Creative Ideas via LLMs for Scientific Research
arXiv 2024 - -
Multi-Agent Dial. arXiv SIGDIAL 2025 Website -
Deep Ideation arXiv
Deep Ideation: Designing LLM Agents to Generate Novel Research Ideas on Scientific Concept Network
arXiv 2025 - GitHub
Artificial Hivemind arXiv
Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond)
NeurIPS 2025 - -

Novelty and Feasibility Assessment

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
IdeaBench arXiv arXiv 2024 - -
LiveIdeaBench arXiv
LiveIdeaBench: Evaluating LLMs' Scientific Creativity and Idea Generation with Minimal Context
arXiv 2024 - -
ResearchBench arXiv
ResearchBench: Benchmarking LLMs in Scientific Discovery via Inspiration-Based Task Decomposition
ACL 2026 - -
AI Idea Bench 2025 arXiv
AI Idea Bench 2025: AI Research Idea Generation Benchmark
arXiv 2025 - GitHub
The Ideation-Execution Gap arXiv
The Ideation-Execution Gap: Execution Outcomes of LLM-Generated versus Human Research Ideas
arXiv 2025 - -
HeurekaBench arXiv
HeurekaBench: A Benchmarking Framework for AI Co-scientist
ICLR 2026 Website GitHub
Why LLMs Aren't Scientists Yet arXiv
Why LLMs Aren't Scientists Yet
arXiv 2026 - -
AI Can Learn Scientific Taste arXiv
AI Can Learn Scientific Taste
arXiv 2026 - -
HindSight arXiv
HindSight: Evaluating LLM-Generated Research Ideas via Future Impact
arXiv 2026 - -
DeepInnovator arXiv
DeepInnovator: Triggering the Innovative Capabilities of LLMs
arXiv 2026 - GitHub

2. Literature Review & Paper Search

Literature Retrieval

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
LitLLM arXiv
LitLLM: A Toolkit for Literature Review with Large Language Models
arXiv 2024 Website -
OpenResearcher arXiv EMNLP 2024 Website -
CiteME arXiv
CiteME: Can Language Models Accurately Cite Scientific Claims?
arXiv 2024 - -
LitSearch arXiv
LitSearch: A Retrieval Benchmark for Scientific Literature Search
arXiv 2024 - GitHub
PaperQA2 arXiv
Language Agents Achieve Superhuman Synthesis of Scientific Knowledge
arXiv 2024 - GitHub
PaSa arXiv
PaSa: An LLM Agent for Comprehensive Academic Paper Search
arXiv 2025 - GitHub

Survey & Related Work Generation

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
ChatPaper - GitHub 2023 - GitHub
PaperQA arXiv
PaperQA: Retrieval-Augmented Generative Agent for Scientific Research
arXiv 2023 - GitHub
STORM arXiv
Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models
arXiv 2024 - GitHub
AutoSurvey arXiv
AutoSurvey: Large Language Models Can Automatically Write Surveys
arXiv 2024 - GitHub
GPT Researcher - GitHub 2024 - GitHub
LLMs for Automated Literature Review arXiv
Large Language Models for Automated Literature Review
arXiv 2024 - -
SurveyX arXiv
SurveyX: Academic Survey Automation via Large Language Models
arXiv 2025 Website -
SurveyForge arXiv
SurveyForge: On the Outline Heuristics, Memory-Driven Generation, and Multi-dimensional Evaluation for Automated Survey Writing
arXiv 2025 - GitHub
CiteGeist arXiv
Citegeist: Automated Generation of Related Work Analysis on the arXiv Corpus
arXiv 2025 Website -
InteractiveSurvey arXiv
InteractiveSurvey: An LLM-based Personalized and Interactive Survey Paper Generation System
arXiv 2025 - GitHub
Agentic AutoSurvey arXiv
Agentic AutoSurvey: Let LLMs Survey LLMs
arXiv 2025 - -
LiRA arXiv
LiRA: A Multi-Agent Framework for Reliable and Readable Literature Review Generation
arXiv 2025 - -
SurveyG arXiv
SurveyG: A Multi-Agent LLM Framework with Hierarchical Citation Graph for Automated Survey Generation
arXiv 2025 - -
IterSurvey arXiv
IterSurvey: Deep Literature Survey Automation with an Iterative Workflow
arXiv 2025 - GitHub
CiteLLM arXiv
CiteLLM: An Agentic Platform for Trustworthy Scientific Reference Discovery
arXiv 2026 - -

Deep Research Agents

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
ASReview arXiv
An Open Source Machine Learning Framework for Efficient and Transparent Systematic Reviews
Nature MI 2021 Website GitHub
CHIME arXiv
CHIME: LLM-Assisted Hierarchical Organization of Scientific Studies for Literature Review Support
arXiv 2024 - -
OpenScholar arXiv
OpenScholar: Synthesizing Scientific Literature with Retrieval-Augmented LMs
Nature 2025 - -
DeepResearchAgent - GitHub 2025 - GitHub
DeerFlow - GitHub 2025 - GitHub
Tongyi DeepResearch - GitHub 2025 - GitHub
Auto-Deep-Research arXiv
AutoAgent: A Fully-Automated and Zero-Code Framework for LLM Agents
arXiv 2025 - -
O-Researcher arXiv
O-Researcher: An Open Ended Deep Research Model via Multi-Agent Distillation and Agentic RL
arXiv 2026 - -

Retrieval and Synthesis Quality Assessment

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
ReportBench arXiv
ReportBench: Evaluating Deep Research Agents via Academic Survey Tasks
arXiv 2025 - GitHub
DeepScholar-Bench arXiv
DeepScholar-Bench: A Live Benchmark and Automated Evaluation for Generative Research Synthesis
arXiv 2025 - GitHub
IDRBench arXiv
IDRBench: Interactive Deep Research Benchmark
arXiv 2026 - -
ScholarGym arXiv
ScholarGym: Benchmarking Large Language Model Capabilities in the Information-Gathering Stage of Deep Research
arXiv 2026 - -
SciNetBench arXiv
SciNetBench: A Relation-Aware Benchmark for Scientific Literature Retrieval Agents
arXiv 2026 - -

3. Coding & Experimentation

Code Generation

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
SWE-Bench arXiv
SWE-bench: Can Language Models Resolve Real-World GitHub Issues?
ICLR 2024 Website GitHub
SWE-agent arXiv arXiv 2024 - GitHub
OpenHands arXiv
OpenHands: An Open Platform for AI Software Developers as Generalist Agents
ICLR 2025 - GitHub
SWE-Bench Pro arXiv
SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks?
arXiv 2025 - -
SWE-EVO arXiv
SWE-EVO: Benchmarking Coding Agents in Long-Horizon Software Evolution Scenarios
arXiv 2025 - -

Paper-to-Code

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
FunSearch arXiv
Mathematical Discoveries from Program Search with Large Language Models
Nature 2024 Website GitHub
SciCode arXiv
SciCode: A Research Coding Benchmark Curated by Scientists
arXiv 2024 Website GitHub
SciReplicate-Bench arXiv
SciReplicate-Bench: Benchmarking LLMs in Agent-driven Algorithmic Reproduction from Research Papers
arXiv 2025 Website GitHub
PaperBench arXiv
PaperBench: Evaluating AI's Ability to Replicate AI Research
arXiv 2025 - GitHub
PaperCoder arXiv
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning
arXiv 2025 - GitHub
ResearchCodeBench arXiv
ResearchCodeBench: Benchmarking LLMs on Implementing Novel ML Research Code
arXiv 2025 - -

Experiment Execution & Orchestration

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
BioPlanner arXiv
BioPlanner: Automatic Evaluation of LLMs on Protocol Planning
arXiv 2023 - GitHub
Coscientist arXiv
Autonomous Chemical Research with Large Language Models
Nature 2023 Website -
MLAgentBench arXiv
MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation
arXiv 2024 - GitHub
DS-Agent arXiv
DS-Agent: Automated Data Science by Empowering Large Language Models with Case-Based Reasoning
arXiv 2024 - GitHub
ChemCrow arXiv
ChemCrow: Augmenting Large Language Models with Chemistry Tools
Nature MI 2024 Website -
CRISPR-GPT arXiv
CRISPR-GPT for Agentic Automation of Gene-Editing Experiments
arXiv 2024 - -
MLR-Copilot arXiv
MLR-Copilot: Autonomous Machine Learning Research based on Large Language Models Agents
arXiv 2024 - -
MLE-Bench arXiv
MLE-Bench: Evaluating Machine Learning Agents on Machine Learning Engineering
arXiv 2024 - -
autoresearch (Karpathy) - GitHub 2025 - GitHub
Dolphin arXiv
Dolphin: Moving Towards Closed-loop Auto-research through Thinking, Practice, and Feedback
arXiv 2025 Website -
AIDE arXiv
AIDE: AI-Driven Exploration in the Space of Code
arXiv 2025 - -
Curie arXiv
Curie: Toward Rigorous and Automated Scientific Experimentation with AI Agents
arXiv 2025 - GitHub
MLGym arXiv
MLGym: A New Framework and Benchmark for Advancing AI Research Agents
arXiv 2025 - -
CodeScientist arXiv
CodeScientist: End-to-End Semi-Automated Scientific Discovery with Code-based Experimentation
arXiv 2025 - GitHub
AutoReproduce arXiv
AutoReproduce: Automatic AI Experiment Reproduction with Paper Lineage
arXiv 2025 - GitHub
R&D-Agent arXiv
R&D-Agent: An LLM-Agent Framework Towards Autonomous Data Science
arXiv 2025 - GitHub
InternAgent / NovelSeek arXiv
InternAgent: When Agent Becomes the Scientist -- Building Closed-Loop System from Hypothesis to Verification
arXiv 2025 Website GitHub
AlphaEvolve arXiv
AlphaEvolve: A Coding Agent for Scientific and Algorithmic Discovery
arXiv 2025 - -
MLR-Bench arXiv
MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research
arXiv 2025 - -
Execution-Grounded AI Research arXiv
Towards Execution-Grounded Automated AI Research
arXiv 2026 - -
SciNav arXiv
SciNav: A General Agent Framework for Scientific Coding Tasks
arXiv 2026 - -
Learn to Discover arXiv
Learning to Discover at Test Time
arXiv 2026 - -
FrontierScience arXiv
FrontierScience: Evaluating AI's Ability to Perform Expert-Level Scientific Tasks
arXiv 2026 - -
EvoScientist arXiv
EvoScientist: Towards Multi-Agent Evolving AI Scientists for End-to-End Scientific Discovery
arXiv 2026 Website GitHub

Code Correctness and Reproducibility Assessment

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
DiscoveryWorld arXiv
DiscoveryWorld: A Virtual Environment for Developing and Evaluating Automated Scientific Discovery Agents
arXiv 2024 - GitHub
DiscoveryBench arXiv
DiscoveryBench: Towards Data-Driven Discovery with Large Language Models
arXiv 2024 - GitHub
Lab-Bench arXiv
Lab-Bench: Measuring Capabilities of Language Models for Biology Research
arXiv 2024 Website GitHub
InfiAgent-DABench arXiv
InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks
arXiv 2024 - -
ScienceAgentBench arXiv
ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery
arXiv 2024 - -
RE-Bench arXiv
RE-Bench: Evaluating Frontier AI R&D Capabilities of Language Model Agents Against Human Experts
arXiv 2024 - -
KernelBench arXiv
KernelBench: Can LLMs Write Efficient GPU Kernels?
arXiv 2025 - GitHub
TritonBench arXiv
TritonBench: Benchmarking Large Language Model Capabilities for Generating Triton Operators
arXiv 2025 - GitHub
AstaBench arXiv
AstaBench: Rigorous Benchmarking of AI Agents with a Scientific Research Suite
arXiv 2025 - GitHub
EXP-Bench arXiv
EXP-Bench: Can AI Conduct AI Research Experiments?
arXiv 2025 - GitHub
ResearchClawBench arXiv
Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows
arXiv 2025 - GitHub
PostTrainBench arXiv
PostTrainBench: Can LLM Agents Automate LLM Post-Training?
arXiv 2026 - GitHub

4. Tables & Figures

Scientific Figure Generation

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
ChartGPT arXiv
ChartGPT: Leveraging LLMs to Generate Charts from Abstract Natural Language
arXiv 2023 - -
MatPlotAgent arXiv
MatPlotAgent: Method and Evaluation for LLM-Based Agentic Scientific Data Visualization
arXiv 2024 - -
DiagramAgent arXiv CVPR 2025 Website -
StarVector arXiv CVPR 2025 Website -
VisCoder arXiv EMNLP Findings 2025 Website -
PlotGen arXiv
PlotGen: Multi-Agent LLM-based Scientific Data Visualization via Multimodal Feedback
arXiv 2025 - -
CoDA arXiv
CoDA: Agentic Systems for Collaborative Data Visualization
arXiv 2025 - -
VIS-Shepherd arXiv
VIS-Shepherd: Constructing Critic for LLM-based Data Visualization Generation
arXiv 2025 - -
PaperBanana arXiv
PaperBanana: Automating Academic Illustration for AI Scientists
arXiv 2026 - -
AutoFigure-Edit arXiv
AutoFigure-Edit: Generating Editable Scientific Illustration
arXiv 2026 Website GitHub
AutoFigure arXiv ICLR 2026 - GitHub
SAIL arXiv
Setting SAIL: Leveraging Scientist-AI-Loops for Rigorous Visualization Tools
arXiv 2026 - -
- arXiv
AI-Generated Figures in Academic Publishing: Policies, Tools, and Practical Guidelines
arXiv 2026 - -

Table Understanding & Generation

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
Chain-of-Table arXiv
Chain-of-Table: Evolving Tables in Reasoning Chain for Table Understanding
arXiv 2024 - -
ArxivDIGESTables arXiv EMNLP 2024 Website -
ShowTable arXiv
ShowTable: Unlocking Creative Table Visualization with Collaborative Reflection and Refinement
arXiv 2025 - -
Table2LaTeX-RL arXiv
Table2LaTeX-RL: Converting Table Images to High-Fidelity LaTeX Code Using Reinforced Multimodal Language Models
arXiv 2025 - -

Mathematical Formulas & TikZ

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
AutomaTikZ arXiv ICLR 2024 Website -
DeTikZify arXiv NeurIPS 2024 Website -
TikZilla arXiv
TikZilla: Scaling Text-to-TikZ with High-Quality Data and Reinforcement Learning
arXiv 2026 - -

Visual Fidelity and Scientific Accuracy Assessment

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
AbGen arXiv ACL 2025 Website -
PlotCraft arXiv
PlotCraft: Pushing the Limits of LLMs for Complex and Interactive Data Visualization
arXiv 2025 - -
TeXpert arXiv
TeXpert: Multi-Level Benchmark for LaTeX Code Generation
arXiv 2025 - -
SciFig arXiv
SciFig: Towards Automating Scientific Figure Generation
arXiv 2026 - -
SciFlow-Bench arXiv
SciFlow-Bench: Evaluating Structure-Aware Scientific Diagram Generation via Inverse Parsing
arXiv 2026 - -
FigureBench arXiv
AutoFigure: Generating and Refining Publication-Ready Scientific Illustrations
ICLR 2026 - GitHub

5. Peer Review

Automated Review Generation

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
ChatReviewer - GitHub 2023 - GitHub
MARG arXiv
MARG: Multi-Agent Review Generation for Scientific Papers
arXiv 2024 - -
Reviewer2 arXiv
Reviewer2: Optimizing Review Generation Through Prompt Generation
arXiv 2024 - -
AI-Peer-Review - GitHub 2024 - GitHub
DeepReviewer arXiv
DeepReview: Improving LLM-based Paper Review with Human-like Deep Thinking Process
arXiv 2025 Website -
OpenReviewer arXiv
OpenReviewer: A Specialized Large Language Model for Generating Critical Scientific Paper Reviews
arXiv 2025 Website -
REMOR arXiv
REMOR: Automated Peer Review Generation with LLM Reasoning and Multi-Objective Reinforcement Learning
arXiv 2025 - -
ReviewRL arXiv EMNLP 2025 Website -
ScholarPeer arXiv
ScholarPeer: A Context-Aware Multi-Agent Framework for Automated Peer Review
arXiv 2026 - -

Meta-Review & Reviewer Matching

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
AgentReview arXiv EMNLP 2024 Website -
LLMs as Meta-Reviewers' Assistants arXiv NAACL 2025 Website -
RATE arXiv
RATE: Reviewer Profiling and Annotation-free Training for Expertise Ranking in Peer Review Systems
arXiv 2026 - -

Review Quality Assessment

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
ReviewMT arXiv
Peer Review as A Multi-Turn and Long-Context Dialogue with Role-Based Interactions
arXiv 2024 - -
ClaimCheck arXiv
ClaimCheck: How Grounded are LLM Critiques of Scientific Papers?
arXiv 2025 - -
ReviewAgents arXiv
ReviewAgents: Bridging the Gap Between Human and AI-Generated Paper Reviews
arXiv 2025 - -
Stanford Agentic - Web 2025 Website -
CycleResearcher arXiv
CycleResearcher: Improving Automated Research via Automated Review
arXiv 2025 Website -
ReViewGraph arXiv
Automatic Paper Reviewing with Heterogeneous Graph Reasoning over LLM-Simulated Reviewer-Author Debates
arXiv 2025 - -
ReviewerToo arXiv
ReviewerToo: Should AI Join The Program Committee? A Look At The Future of Peer Review
arXiv 2025 - -
- arXiv
A Large-Scale Randomized Study of Large Language Model Feedback in Peer Review
Nature MI 2026 Website -

Adversarial Attacks & Bias Analysis

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
AI Review Lottery arXiv
The AI Review Lottery: Widespread AI-Assisted Peer Reviews Boost Paper Scores and Acceptance Rates
arXiv 2024 - -
Raina et al. arXiv EMNLP 2024 - -
Ye et al. arXiv
Are We There Yet? Revealing the Risks of Utilizing Large Language Models in Scholarly Peer Review
arXiv 2024 - -
Breaking the Reviewer arXiv
Breaking the Reviewer: Assessing the Vulnerability of Large Language Models in Automated Peer Review Under Textual Adversarial Attacks
arXiv 2025 - -
Sahoo et al. arXiv
When Reject Turns into Accept: Quantifying the Vulnerability of LLM-Based Scientific Reviewers to Indirect Prompt Injection
arXiv 2025 - -
Zhou et al. arXiv
"Give a Positive Review Only": An Early Investigation Into In-Paper Prompt Injection Attacks and Defenses for AI Reviewers
arXiv 2025 - -
Prompt Injection Attacks on Peer Review arXiv
Prompt Injection Attacks on LLM Generated Reviews of Scientific Publications
arXiv 2025 - -
When Your Reviewer is an LLM arXiv
When Your Reviewer is an LLM: Biases, Divergence, and Prompt Injection Risks in Peer Review
arXiv 2025 - -

Detection & Policy

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
Monitoring AI-Modified Content arXiv
Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews
arXiv 2024 - -
AI Text Detection in Peer Review arXiv
Is Your Paper Being Reviewed by an LLM? Benchmarking AI Text Detection in Peer Review
arXiv 2025 - -
Detecting LLM-Generated Peer Reviews arXiv
Detecting LLM-Generated Peer Reviews
arXiv 2025 - -
- arXiv
What Happens When Reviewers Receive AI Feedback in Their Reviews?
arXiv 2026 - -
- arXiv
Policies Permitting LLM Use for Polishing Peer Reviews Are Currently Not Enforceable
arXiv 2026 - -
- arXiv
More than Half of Researchers Now Use AI for Peer Review
Nature News 2026 Website -
Major Conference Catches Illicit AI Use arXiv
Major Conference Catches Illicit AI Use -- and Rejects Hundreds of Papers
Nature News 2026 Website -

6. Rebuttal

Reviewer Comment Analysis

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
Insights from ICLR Rebuttal Process arXiv
Insights from ICLR Peer Review and Rebuttal Process
arXiv 2025 - -

Automated Rebuttal Generation

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
DRPG arXiv
DRPG: An Agentic Framework for Academic Rebuttal
arXiv 2026 - GitHub
Paper2Rebuttal arXiv
Paper2Rebuttal: A Multi-Agent Framework for Transparent Author Response Assistance
arXiv 2026 Website -
RebuttalAgent arXiv
RebuttalAgent: Strategic Persuasion in Academic Rebuttal via Theory of Mind
ICLR 2026 - -
Author-in-the-Loop arXiv
Author-in-the-Loop Response Generation and Evaluation: Integrating Author Expertise and Intent in Responses to Peer Review
arXiv 2026 - -

Rebuttal Effectiveness Assessment

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
Re2 Dataset arXiv
Re2: A Consistency-Ensured Dataset for Full-stage Peer Review and Multi-Turn Rebuttal Discussions
arXiv 2025 - -
Commitment Checklist arXiv
Commitment Checklist: Auditing Author Commitments in Peer Review
arXiv 2026 - -

7. Paper Writing

Semi-Automated Writing Assistance

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
CoAuthor arXiv
CoAuthor: Human-AI Collaborative Writing with Language Models
arXiv 2022 - -
Script&Shift arXiv CHI 2025 Website -
AI in the Writing Process arXiv AIED 2025 Website -
ScholarCopilot arXiv
ScholarCopilot: Training LLMs for Academic Writing with Integrated Citation
arXiv 2025 - -
OpenDraft - GitHub 2025 - GitHub
DraftMarks arXiv
DraftMarks: Enhancing Transparency in Human-AI Co-Writing Through Interactive Skeuomorphic Process Traces
arXiv 2025 - -
XtraGPT arXiv
XtraGPT: Context-Aware and Controllable Academic Paper Revision
arXiv 2025 - -
PaperDebugger arXiv
PaperDebugger: A Plugin-Based Multi-Agent System for In-Editor Academic Writing, Review, and Editing
arXiv 2025 - GitHub
LimAgents arXiv
Multi-Agent LLMs for Generating Research Limitations
arXiv 2026 - -

Fully Automated Paper Generation

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
CycleResearcher arXiv
CycleResearcher: Improving Automated Research via Automated Review
ICLR 2025 Website -
FutureGen arXiv
FutureGen: A RAG-based Approach to Generate the Future Work of Scientific Article
arXiv 2025 - -
APRES arXiv
APRES: An Agentic Paper Revision and Evaluation System
arXiv 2026 - -

Societal Analysis

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
- arXiv
Artificial Intelligence Tools Expand Scientists' Impact but Contract Science's Focus
Nature 2026 Website -
Nature AI Survey arXiv
More than Half of Researchers Now Use AI for Peer Review
Nature 2026 Website -

Writing Quality and AI Detection Assessment

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
Mapping LLM Use arXiv
Mapping the Increasing Use of LLMs in Scientific Papers
arXiv 2024 - -
CycleReviewer arXiv
CycleResearcher: Improving Automated Research via Automated Review
ICLR 2025 Website -
Stanford Agentic - Web 2025 Website -
SciIG arXiv
Let's Use ChatGPT To Write Our Paper! Benchmarking LLMs To Write the Introduction of a Research Paper
arXiv 2025 - -
Watermarking arXiv
Detecting LLM-Generated Peer Reviews
arXiv 2025 - -
PaperWritingBench arXiv
PaperOrchestra: A Multi-Agent Framework for Automated AI Research Paper Writing
arXiv 2026 - -

8. Dissemination (Paper2X)

Slides & Presentations

⏲️ In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
DOC2PPT arXiv AAAI 2022 Website -
AutoPresent arXiv CVPR 2025 Website -
PASS arXiv
PASS: Presentation Automation for Slide Generation and Speech
arXiv 2025 - -
Talk to Your Slides arXiv
Talk to Your Slides: Efficient Slide Editing Agent
arXiv 2025 - -
PPTAgent arXiv
PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides
EMNLP 2025 Website GitHub
Auto-Slides arXiv
Auto-Slides: An Interactive Multi-Agent System for Creating and Customizing Research Presentations
arXiv 2025 Website -
SlideGen arXiv
SlideGen: Collaborative Multimodal Agents for Scientific Slide Generation
arXiv 2025 - -
Paper2Slides - GitHub 2025 - GitHub
SlideTailor arXiv
SlideTailor: Personalized Presentation Slide Generation for Scientific Papers
AAAI 2026 Website GitHub
DeepPresenter arXiv
DeepPresenter: Environment-Grounded Reflection for Agentic Presentation Generation
arXiv 2026 - GitHub
Office Raccoon (SenseTime) - Web 2026 Website -

Posters

⏲️ In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
P2P (Paper-to-Poster) arXiv
P2P: Automated Paper-to-Poster Generation and Fine-Grained Benchmark
arXiv 2025 - -
Paper2Poster arXiv
Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers
arXiv 2025 - GitHub
PosterGen arXiv
PosterGen: Aesthetic-Aware Paper-to-Poster Generation via Multi-Agent LLMs
arXiv 2025 Website -
PosterForest arXiv
PosterForest: Hierarchical Multi-Agent Collaboration for Scientific Poster Generation
arXiv 2025 - -
APEX arXiv
APEX: Academic Poster Editing Agentic Expert
arXiv 2026 - GitHub
PosterOmni arXiv
PosterOmni: Generalized Artistic Poster Creation via Task Distillation and Unified Reward Feedback
arXiv 2026 - -

Video & Web

⏲️ In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
Preacher arXiv
Preacher: Paper-to-Video Agentic System
ICCV 2025 Website GitHub
Paper2Video arXiv
Paper2Video: Automatic Video Generation from Scientific Papers
arXiv 2025 Website GitHub
PresentAgent arXiv
PresentAgent: Multimodal Agent for Presentation Video Generation
arXiv 2025 - GitHub
Paper2Web arXiv
Paper2Web: Let's Make Your Paper Alive!
arXiv 2025 - GitHub

Fidelity and Adoption Assessment

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
PPTEval arXiv
PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides
EMNLP 2025 - GitHub
PresentQuiz arXiv
Paper2Video: Automatic Video Generation from Scientific Papers
arXiv 2025 - GitHub
PresentEval arXiv
PresentAgent: Multimodal Agent for Presentation Video Generation
arXiv 2025 - GitHub

9. End-to-End Systems

Fully Automated Research Systems

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
ResearchTown arXiv
ResearchTown: Simulator of Human Research Community
ICML 2025 Website GitHub
Agent Laboratory arXiv
Agent Laboratory: Using LLM Agents as Research Assistants
arXiv 2025 - -
AgentRxiv arXiv
AgentRxiv: Towards Collaborative Autonomous Research
arXiv 2025 - -
ARIS - GitHub 2025 - GitHub
freephdlabor arXiv
Build Your Personalized Research Group: A Multiagent Framework for Continual and Interactive Science Automation
arXiv 2025 - -
SciMaster arXiv
SciMaster: Towards General-Purpose Scientific AI Agents
arXiv 2025 - GitHub
- arXiv
Towards End-to-End Automation of AI Research
Nature 2026 Website GitHub
Idea2Story arXiv
Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives
arXiv 2026 - -
UniScientist - Web 2026 Website -
ASI-Evolve - GitHub 2026 - GitHub
FARS - Web 2026 Website -
AutoResearchClaw - GitHub 2026 - GitHub
CORAL arXiv
CORAL: Towards Autonomous Multi-Agent Evolution for Open-Ended Discovery
arXiv 2026 - GitHub
AutoSOTA arXiv
AutoSOTA: An End-to-End Automated Research System for State-of-the-Art AI Model Discovery
arXiv 2026 - GitHub
AiScientist-LH arXiv
Toward Autonomous Long-Horizon Engineering for ML Research
arXiv 2026 - -
OpenResearcher (2026) arXiv
OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis
arXiv 2026 - GitHub
Aletheia arXiv
Towards Autonomous Mathematics Research
arXiv 2026 - GitHub

Domain-Specific Systems

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
AlphaFold 3 arXiv
Accurate Structure Prediction of Biomolecular Interactions with AlphaFold 3
Nature 2024 Website -
Medical AI Scientist arXiv
Towards a Medical AI Scientist
arXiv 2026 - -

Evolutionary & Self-Improving Systems

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
ShinkaEvolve arXiv
ShinkaEvolve: Towards Open-Ended and Sample-Efficient Program Evolution
arXiv 2025 - GitHub
Darwin Godel Machine arXiv
Darwin Godel Machine: Open-Ended Evolution of Self-Improving Agents
arXiv 2025 - GitHub

Research Platforms & Infrastructure

In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
Towards an AI co-scientist arXiv
Towards an AI co-scientist
arXiv 2025 - -
PiFlow arXiv
PiFlow: Principle-aware Scientific Discovery with Multi-Agent Collaboration
arXiv 2025 - -
LabClaw - Web 2026 Website -
- arXiv
OpenAI Is Throwing Everything into Building a Fully Automated Researcher
MIT TR 2026 Website -

10. Societal & Critical Perspectives

⏲️ In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
- arXiv
Navigating the Jagged Technological Frontier
Org. Sci. 2025 Website -
- arXiv
Reassessing Academic Integrity in the Age of AI
SSH Open 2025 Website -
The AI Deskilling Paradox arXiv
The AI Deskilling Paradox
CACM 2025 Website -
Hidden Pitfalls of AI Scientist Systems arXiv
The More You Automate, the Less You See: Hidden Pitfalls of AI Scientist Systems
arXiv 2025 - -
Rethinking Science in the Age of AI arXiv
Rethinking Science in the Age of Artificial Intelligence
arXiv 2025 - -
- arXiv
Measuring AI Ability to Complete Long Tasks
METR 2025 Website -
- arXiv
Towards a Science of Scaling Agent Systems
arXiv 2025 - -
- arXiv
Artificial Intelligence Tools Expand Scientists' Impact but Contract Science's Focus
Nature 2026 Website -
- arXiv
AI for Scientific Discovery is a Social Problem
Patterns 2026 Website -
Research Integrity in the Age of AI arXiv
Research Integrity and Academic Authority in the Age of Artificial Intelligence: From Discovery to Curation?
arXiv 2026 - -
SciSciGPT arXiv
SciSciGPT: Advancing Human-AI Collaboration in the Science of Science
Nature CS 2026 Website -
SimStep arXiv
SimStep: Chain-of-Abstractions for Incremental Specification and Debugging of AI-Generated Interactive Simulations
arXiv 2025 - -
ConvoLearn arXiv
ConvoLearn: A Learning Sciences Grounded Dataset for Fine-Tuning Dialogic AI Tutors
arXiv 2026 - -
AFIM: Academic Fraud Inclination Metric arXiv
AFIM: Academic Fraud Inclination Metric
Web 2026 Website -
- arXiv
AI Researchers' Views on Automating AI R&D and Intelligence Explosions
arXiv 2026 - -
- arXiv
AI Scientists Are Changing Research
Nature 2026 Website -
Learning by Creating (Talk) arXiv
Learning by Creating: A Human-Centered Vision for AI in Education
Talk 2026 Website -

11. Surveys & Curated Lists

⏲️ In chronological order, from the earliest to the latest.

Model Paper Venue Website GitHub
LLM4SR arXiv
LLM4SR: A Survey on Large Language Models for Scientific Research
arXiv 2025 - -
From Automation to Autonomy arXiv
From Automation to Autonomy: A Survey on Large Language Models for Scientific Discovery
arXiv 2025 - -
AI4Research arXiv
AI4Research: A Survey of Artificial Intelligence for Scientific Research
arXiv 2025 - -
A Survey of AI Scientists arXiv
A Survey of AI Scientists
arXiv 2025 - -
- arXiv
Large Language Models for Scientific Idea Generation: A Creativity-Centered Survey
arXiv 2025 - -
- arXiv
Large Language Models for Automated Scholarly Paper Review: A Survey
Inf. Fusion 2025 Website -

12. Tools & GitHub Repos

Open-source tools, frameworks, and curated resource lists for AI-assisted research (not directly tied to a single paper).

Curated Lists

Repository Stars Description
Awesome-Deep-Research ~684 Up-to-date collection of agentic deep research resources
Awesome-Scientific-Language-Models ~647 Survey of scientific LLMs (EMNLP'24)
Awesome-LLM-Scientific-Discovery ~319 Three-level autonomy framework (EMNLP'25)
Awesome-AI-Scientist-Papers ~136 Resources on AI Scientist systems
Awesome-Auto-Research-Tools ~129 Automated research tools catalog
awesome-autoresearch - Autonomous improvement loops and research agents

Idea Generation

Repository Stars Description
Virtual-Scientists ~66 VirSci: multi-agent collaborative idea generation (ACL'25)
ResearchAgent ~29 Iterative idea proposal with reviewing agents

Literature Review

Repository Stars Description
paper-qa ~8,300 PaperQA2: superhuman RAG for scientific Q&A
local-deep-research ~4,200 Fully local deep research
researchgpt ~3,500 Conversational interaction with research papers
gpt-researcher - Autonomous agent for comprehensive online research
AutoSurvey ~462 Automated comprehensive literature surveys
storm - Wikipedia-style article generation (STORM)

Coding & Experiments

Repository Stars Description
autoresearch (Karpathy) ~55,400 Autonomous ML experiments, ~12 exp/hour overnight
Paper2Code ~4,300 Multi-agent ML paper to code transformation
RD-Agent - Microsoft's LLM framework for autonomous data science
MLAgentBench ~334 13 end-to-end ML experimentation tasks
SWE-bench - Real-world GitHub issue resolution benchmark

Peer Review

Repository Stars Description
paper-reviewer ~824 arXiv paper reviews + blog posts
ai-peer-review ~123 Multi-LLM reviews + meta-review synthesis
openreviewer ~9 Llama-8B fine-tuned on 79K expert reviews

⬆ Back to Top

Last updated: 2026-04-26 · Maintained by WorldBench
