- [2026-04-11] Code and partial benchmark released. Feel free to try it out!
Note: The full code and complete benchmark will be released upon paper acceptance.
We present MDAgent2, the first end-to-end framework capable of performing both knowledge Q&A and code generation within the Molecular Dynamics (MD) domain. Our key contributions include:
- A domain-specific data-construction pipeline yielding three high-quality datasets spanning MD knowledge, question answering, and code generation.
- A three-stage post-training strategy — continued pre-training (CPT), supervised fine-tuning (SFT), and reinforcement learning (RL) — to train two domain-adapted models: MD-Instruct and MD-Code.
- MD-GRPO, a closed-loop RL method that leverages simulation outcomes as reward signals and recycles low-reward trajectories for continual refinement.
- MDAgent2-RUNTIME, a deployable multi-agent system integrating code generation, execution, evaluation, and self-correction.
- MD-EvalBench, the first benchmark for LAMMPS code generation and question answering.
Our models and system surpass several strong baselines on MD-EvalBench, demonstrating the adaptability and generalization capability of LLMs in industrial simulation tasks.
- CPT: Domain adaptation through continued pre-training on MD-specific corpus
- SFT: Fine-tuning on high-quality instruction-following and code generation datasets
- RL (MD-GRPO): Closed-loop reinforcement learning using simulation feedback as reward signals, with low-reward trajectory recycling
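The closed loop in the RL stage can be sketched roughly as follows. This is an illustrative Python sketch of the idea described above, not the released training code; the function names (`policy.generate`, `simulate`), the group size, and the recycling threshold are all assumptions.

```python
# Hypothetical sketch of one closed-loop step in the spirit of MD-GRPO:
# sample a group of candidate scripts, score each with simulation feedback,
# compute group-relative advantages, and recycle low-reward trajectories.
def grpo_step(prompt, policy, simulate, threshold=0.5, group_size=4):
    trajectories = []
    for _ in range(group_size):
        script = policy.generate(prompt)      # candidate LAMMPS script
        reward = simulate(script)             # simulation outcome as reward signal
        trajectories.append((script, reward))

    rewards = [r for _, r in trajectories]
    baseline = sum(rewards) / len(rewards)    # group-relative baseline (GRPO-style)
    advantages = [r - baseline for r in rewards]

    # Low-reward trajectories are recycled into a refinement queue
    recycled = [s for s, r in trajectories if r < threshold]
    return advantages, recycled
```

The group-relative baseline is what makes this GRPO-like: no learned value model is needed, only the rewards of sibling samples.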
A deployable multi-agent system integrating code generation, execution, evaluation, and self-correction in a closed loop.
MD-EvalBench is the first benchmark for LAMMPS code generation and question answering, evaluating:
- Code Generation: Generating executable LAMMPS scripts from natural language
- Knowledge Q&A: Answering domain-specific questions about molecular dynamics
- Executability: Ensuring generated code runs successfully in simulation environments
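An executability check of this kind can be approximated by feeding the generated script to the `lmp` binary and inspecting the exit status. Below is a minimal sketch; the binary name `lmp`, the timeout, and the pass criterion (exit code 0) are assumptions, and the actual MD-EvalBench harness may differ.

```python
import os
import subprocess
import tempfile

def is_executable(script_text, lmp_binary="lmp", timeout=300):
    """Return True if LAMMPS runs the script without error (illustrative check)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "in.test")
        with open(path, "w") as f:
            f.write(script_text)
        try:
            result = subprocess.run(
                [lmp_binary, "-in", path],
                cwd=tmp, capture_output=True, timeout=timeout,
            )
        except (FileNotFoundError, subprocess.TimeoutExpired):
            return False
        return result.returncode == 0
```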
To prevent benchmark data from being crawled for LLM training, answer fields in MD_Benchmark/ are base64-encoded.
To decode the answer fields for evaluation:
```shell
python decrypt_benchmark.py
```

No password is needed. This restores the `answer` and `answer_text` fields in place. The files remain valid JSON/JSONL throughout.
Note: Please do not publicly redistribute the decoded benchmark data.
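Conceptually, the decoding is a plain base64 round trip over the answer fields. The sketch below shows the idea on a synthetic record; the field names follow the description above, but `decrypt_benchmark.py` remains the supported way to decode the benchmark files.

```python
import base64

def decode_answer_fields(record, fields=("answer", "answer_text")):
    """Base64-decode the listed fields of one benchmark record, in place."""
    for key in fields:
        if key in record and record[key] is not None:
            record[key] = base64.b64decode(record[key]).decode("utf-8")
    return record

# Synthetic example record (not from the real benchmark):
rec = {
    "question": "What does `units metal` set?",
    "answer": base64.b64encode("eV, Angstrom, ps".encode()).decode(),
}
decode_answer_fields(rec)
```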
LangGraph version:

```python
from LammpsAgents_by_langgraph import run_lammps_agents

final_state = run_lammps_agents("Simulate the thermal expansion of copper", is_delete_dir=True)
```

Autogen version:

```python
from LammpsAgents_by_autogen import run_lammps_agents

final_state = run_lammps_agents("Simulate the thermal expansion of copper", is_delete_dir=True)
```
- Python 3.11 is recommended (preferably via conda).

- Install CUDA:

  ```shell
  conda install cudatoolkit cuda-version=11
  ```

- Install LAMMPS:

  - Option A — conda (simple but limited):

    ```shell
    conda install lammps -c conda-forge
    conda install openkim-models -c conda-forge
    ```

  - Option B — Build from source (full control):

    ```shell
    git clone https://github.com/lammps/lammps.git
    cd lammps && mkdir build && cd build
    cmake ../cmake \
      -DCMAKE_BUILD_TYPE=Release \
      -DBUILD_MPI=ON \
      -DBUILD_SHARED_LIBS=ON \
      -DPKG_MISC=ON \
      -DPKG_KSPACE=ON \
      -DPKG_MOLECULE=ON \
      -DPKG_USER-MISC=ON \
      -DPKG_EAM=ON \
      -DPKG_MANYBODY=ON
    make -j$(nproc)
    ```

- Install Python dependencies:

  ```shell
  pip install uv
  uv pip install -r requirements.txt
  ```

- Configure environment variables:

  ```shell
  cp .env-EXAMPLE .env
  # Edit .env and set OPENAI_API_KEY=...
  ```

- (Optional) Install PyTorch for local models:

  ```shell
  pip3 install torch torchvision torchaudio
  ```
- Python 3.11 is recommended (preferably via conda).

- Install CUDA:

  ```shell
  conda install cudatoolkit cuda-version=11
  ```

- Install LAMMPS following the official guide.

- Install Python dependencies:

  ```shell
  pip install uv
  uv pip install -r requirements.txt
  ```

- Configure environment variables:

  ```shell
  cp .env-EXAMPLE .env
  # Edit .env and set OPENAI_API_KEY=...
  ```

- (Optional) Install PyTorch:

  ```shell
  pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
  ```
- Backend: FastAPI — handles agent inference, file management, and visualization
- Frontend: Vue3 + Vite — interactive web interface
```shell
# Start backend (default port 8000)
python app.py

# Start frontend
cd lammps-frontend
npm install
npm run dev
```

Frontend: http://localhost:5173 | Backend API docs: http://localhost:8000/docs
```shell
# Build
docker build -t lammps-grpo:latest .

# Run
docker run -d --restart=always -p 8000:8000 \
  --env-file .env \
  --name lammps-grpo \
  lammps-grpo:latest
```

The default image uses `python:3.11-slim` without GPU/CUDA. Extend the Dockerfile if you need LAMMPS with CUDA inside the container.
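One possible starting point for such an extension is to swap the base image for a CUDA runtime image and build LAMMPS inside the Dockerfile. The fragment below is an untested illustration, not part of the repository; the image tag, package names, and build steps are all assumptions you would need to adapt.

```dockerfile
# Hypothetical extension: CUDA-enabled base instead of python:3.11-slim
FROM nvidia/cuda:12.6.0-runtime-ubuntu22.04

RUN apt-get update && apt-get install -y --no-install-recommends \
    python3.11 python3-pip cmake g++ git && rm -rf /var/lib/apt/lists/*

# Build LAMMPS here (see "Option B — Build from source" above for cmake flags)

COPY . /app
WORKDIR /app
RUN pip3 install -r requirements.txt
CMD ["python3", "app.py"]
```

Running such an image also requires the NVIDIA Container Toolkit on the host (e.g. `docker run --gpus all ...`).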
Input:
Simulate the thermal expansion coefficient of copper at 300K under NPT conditions, and output its volume change data.
Generated LAMMPS Script:
```
units metal
atom_style atomic
lattice fcc 3.615
region box block 0 5 0 5 0 5
create_box 1 box
create_atoms 1 box
mass 1 63.546
pair_style eam
pair_coeff * * potentials/Cu_u3.eam
velocity all create 300.0 12345
log log.lammps
dump 1 all atom 10 dump.lammpstrj
dump_modify 1 sort id
thermo 10
fix 1 all npt temp 300.0 300.0 0.1 iso 0 0 1.0
run 1000
```
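The volume change data the prompt asks for can be post-processed from the thermo output in `log.lammps`. The parser below is an illustrative sketch, not part of the system; it assumes the thermo header contains `Step` and `Volume` columns (which would typically require adding a line like `thermo_style custom step temp vol` to the script).

```python
def read_volume_series(log_text):
    """Extract (step, volume) pairs from LAMMPS thermo output.

    Assumes a thermo header containing both 'Step' and 'Volume' columns,
    e.g. produced by `thermo_style custom step temp vol`.
    """
    series = []
    cols = None
    for line in log_text.splitlines():
        parts = line.split()
        if "Step" in parts and "Volume" in parts:
            cols = (parts.index("Step"), parts.index("Volume"))
            continue
        if cols and parts and parts[0].isdigit():
            try:
                series.append((int(parts[cols[0]]), float(parts[cols[1]])))
            except (ValueError, IndexError):
                cols = None  # left the thermo block
    return series
```

From the resulting series, the thermal expansion coefficient could then be estimated from the slope of volume versus temperature across runs at different setpoints.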
```
├── LammpsAgents_by_langgraph.py   # Agent workflow based on LangGraph
├── app.py                         # FastAPI backend service
├── prompt.py                      # System prompts for LAMMPS generation
├── encrypt_benchmark.py           # Encode benchmark answer fields (base64)
├── decrypt_benchmark.py           # Decode benchmark answer fields
├── potentials/                    # LAMMPS potential files
├── train_dataset/                 # Training datasets (examples only)
│   ├── MD-CodeGen/
│   ├── MD-InstructQA/
│   └── MD-Knowledge/
├── MD_Benchmark/                  # Evaluation benchmark
│   ├── ZH/                        # Chinese version
│   │   ├── Code_Eval/
│   │   └── QA_Eval/
│   └── EN/                        # English version
│       ├── Code_Eval/
│       └── QA_Eval/
├── utils/                         # Utilities and APIs
├── lammps-frontend/               # Vue3 frontend
├── lammps_run_example/            # Example LAMMPS outputs
├── pics/                          # Figures
├── demo_video/                    # Demo video
├── requirements.txt               # Python dependencies
├── requirements_grpo.txt          # GRPO training dependencies
├── Dockerfile                     # Docker configuration
├── .env-EXAMPLE                   # Environment variables template
├── README.md                      # Documentation (English)
├── README_CN.md                   # Documentation (Chinese)
└── LICENSE                        # License
```
If this work is helpful, please cite:
```bibtex
@misc{shi2026mdagent2large,
  title={MDAgent2: Large Language Model for Code Generation and Knowledge Q&A in Molecular Dynamics},
  author={Zhuofan Shi and Hubao A and Yufei Shao and Mengyan Dai and Yadong Yu and Pan Xiang and Dongliang Huang and Hongxu An and Chunxiao Xin and Haiyang Shen and Zhenyu Wang and Yunshan Na and Gang Huang and Xiang Jing},
  year={2026},
  eprint={2601.02075},
  archivePrefix={arXiv},
  primaryClass={cs.CE},
  url={https://arxiv.org/abs/2601.02075}
}
```

This project is licensed under the terms specified in the LICENSE file.
We gratefully acknowledge support from all contributors and institutions involved in this research.




