Source code and model weights for the Nequix foundation model and Phonon Fine-tuning (PFT).
| Model | Dataset | Theory | Reference |
|---|---|---|---|
| nequix-mp-1 | MPtrj | DFT (PBE+U) | Nequix |
| nequix-mp-1-pft | MPtrj, MDR Phonon | DFT (PBE+U) | PFT |
| nequix-omat-1 | OMat24 | DFT (PBE+U, VASP 54) | PFT |
| nequix-oam-1 | OMat24, sAlex, MPtrj | DFT (PBE+U) | PFT |
| nequix-oam-1-pft | OMat24, sAlex, MPtrj, MDR Phonon | DFT (PBE+U) | PFT |
```bash
pip install nequix
```

To use OpenEquivariance kernels:

```bash
pip install nequix[oeq]
# needs to be run after installation:
pip install openequivariance_extjax --no-build-isolation
```

Or, for torch (also with kernels):

```bash
pip install nequix[torch]
```

Using `nequix.calculator.NequixCalculator`, you can perform calculations in ASE with a pre-trained Nequix model.
```python
from nequix.calculator import NequixCalculator

atoms = ...
atoms.calc = NequixCalculator("nequix-mp-1", backend="jax")
```

Or, if you want to use the torch backend:

```python
...
atoms.calc = NequixCalculator("nequix-mp-1", backend="torch")
...
```

These are typically comparable in speed with kernels.
Analytical Hessians can be calculated with (currently only supported for the JAX backend):

```python
calc = NequixCalculator("nequix-mp-1", backend="jax")
calc.get_hessian(atoms)  # np array of shape (n, n, 3, 3)
```
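If you want harmonic vibrational (Gamma-point) energies from this Hessian, here is a minimal sketch, assuming the returned array is indexed as (atom i, atom j, xyz of i, xyz of j) and reusing the mass-weighting and unit conversion that ASE's `Vibrations` class applies:

```python
import numpy as np
from ase import units
from ase.build import bulk
from nequix.calculator import NequixCalculator

atoms = bulk("Si")
calc = NequixCalculator("nequix-mp-1", backend="jax")

n = len(atoms)
hessian = calc.get_hessian(atoms)  # (n, n, 3, 3) in eV/Ang^2; index order assumed (i, j, xyz_i, xyz_j)
H = hessian.transpose(0, 2, 1, 3).reshape(3 * n, 3 * n)

# mass-weight and diagonalize
inv_sqrt_m = np.repeat(1.0 / np.sqrt(atoms.get_masses()), 3)
omega2 = np.linalg.eigvalsh(inv_sqrt_m[:, None] * H * inv_sqrt_m[None, :])

# convert sqrt(eV / (amu Ang^2)) to eV (same factor ASE's Vibrations class uses)
s = units._hbar * 1e10 / np.sqrt(units._e * units._amu)
energies = s * np.sqrt(omega2.astype(complex))
print(energies.real)  # harmonic mode energies in eV
```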
Arguments:

- `model_name` (str, default `"nequix-mp-1"`): Pretrained model alias to load or download.
- `model_path` (str | Path, optional): Path to a local checkpoint; overrides `model_name`.
- `backend` ({"jax", "torch"}, default `"jax"`): Compute backend.
- `capacity_multiplier` (float, default 1.1): JAX-only; padding factor to limit recompiles.
- `use_compile` (bool, default True): Torch-only; on GPU, uses `torch.compile()`.
- `use_kernel` (bool, default True): On GPU, use OpenEquivariance kernels.
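For example, to load a local checkpoint instead of a pretrained alias (the checkpoint path below is a hypothetical placeholder):

```python
from pathlib import Path
from ase.build import bulk
from nequix.calculator import NequixCalculator

atoms = bulk("Si")
atoms.calc = NequixCalculator(
    model_path=Path("checkpoints/my-model.nqx"),  # hypothetical path; overrides model_name
    backend="jax",
    capacity_multiplier=1.2,  # JAX-only: extra padding headroom to reduce recompiles
)
print(atoms.get_potential_energy())
```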
Models are trained with the `nequix_train` command using a single `.yml`
configuration file:

```bash
nequix_train <config>.yml
```

With kernels:

```bash
uv sync --extra oeq
uv pip install openequivariance_extjax --no-build-isolation
nequix_train <config>.yml
```

Or, for Torch:

```bash
# Single GPU
uv sync --extra torch
uv run nequix/torch_impl/train.py <config>.yml

# Multi-GPU
uv run torchrun --nproc_per_node=<gpus> nequix/torch_impl/train.py <config>.yml
```
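The configuration format is defined by the files in `configs/`. As a purely illustrative sketch of the kind of options such a file carries (these key names are hypothetical placeholders, not the actual schema; see `configs/nequix-mp-1.yml` for the real options):

```yaml
# Hypothetical keys for illustration only; consult configs/nequix-mp-1.yml for the real schema.
train_path: data/mptrj-aselmdb/train
val_path: data/mptrj-aselmdb/val
batch_size: 32        # per-device
learning_rate: 1.0e-3
num_epochs: 100
```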
To reproduce the training of Nequix-MP-1, first clone the repo and sync the environment:

```bash
git clone https://github.com/atomicarchitects/nequix.git
cd nequix
uv sync
```

Then download the MPtrj data from https://figshare.com/files/43302033 into `data/`, and run the following to extract it:

```bash
bash data/download_mptrj.sh
```

Preprocess the data into `.aselmdb` files:

```bash
uv run scripts/preprocess_data.py data/mptrj-gga-ggapu data/mptrj-aselmdb
```

Then start the training run:

```bash
nequix_train configs/nequix-mp-1.yml
```

This will take less than 125 hours on a single 4 x A100 node (<25 hours with kernels). The `batch_size` in the
config is per-device, so you should be able to run this on any number of GPUs
(although hyperparameters like the learning rate are often sensitive to the global batch
size, so keep this in mind).
First sync the extra dependencies with:

```bash
uv sync --extra pft
```

We provide pretrained model weights for the co-trained (better alignment with
MPtrj) and non-co-trained models in `models/nequix-mp-1-pft.nqx` and
`nequix-mp-1-pft-nocotrain.nqx` respectively. See `nequix-examples/phonon` for
examples of how to use these models for phonon calculations with both finite
displacements and analytical Hessians.
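As a rough illustration of the finite-displacement route (the actual example scripts live in `nequix-examples/phonon`; this sketch just drives ASE's `Phonons` helper with the PFT model):

```python
from ase.build import bulk
from ase.phonons import Phonons
from nequix.calculator import NequixCalculator

atoms = bulk("Si")
calc = NequixCalculator("nequix-mp-1-pft", backend="jax")

# finite-displacement force constants on a 3x3x3 supercell
ph = Phonons(atoms, calc, supercell=(3, 3, 3), delta=0.03)
ph.run()
ph.read(acoustic=True)  # build force constants, enforce acoustic sum rule

bandpath = atoms.cell.bandpath(npoints=100)
bs = ph.get_band_structure(bandpath)  # phonon band structure along the standard path
```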
Data for the PBE MDR phonon database was originally downloaded and preprocessed with:

```bash
bash data/download_pbe_mdr.sh
uv run data/split_pbe_mdr.py
uv run scripts/preprocess_data_phonopy.py data/pbe-mdr/train data/pbe-mdr/train-aselmdb
uv run scripts/preprocess_data_phonopy.py data/pbe-mdr/val data/pbe-mdr/val-aselmdb
```

However, we provide preprocessed data which can be downloaded with:
```bash
bash data/download_pbe_mdr_preprocessed.sh
```

To run PFT without co-training, run:

```bash
uv run nequix/pft/train.py configs/nequix-mp-1-pft-no-cotrain.yml
```

To run PFT with co-training, run (note this requires the preprocessed `mptrj-aselmdb` data):

```bash
uv run nequix/pft/train.py configs/nequix-mp-1-pft.yml
```

To run PFT on the OAM base model, follow the data download instructions below and then run:

```bash
uv run nequix/pft/train.py configs/nequix-oam-1-pft.yml
```

Both PFT training runs take about 140 hours on a single A100. Note that PFT training is currently only supported with the JAX backend, which is both significantly faster and supported by the kernels. See `nequix-examples/pft`, which contains a small demo for PFT in PyTorch that can be adapted to other models. Feel free to reach out with questions.
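For orientation only, below is a generic, heavily simplified sketch of what a Hessian-aware fine-tuning loss can look like in PyTorch; this is not the loss implemented in `nequix/pft`, and `model` here is a hypothetical callable mapping positions to a total energy:

```python
import torch

def phonon_aware_loss(model, positions, ref_energy, ref_forces, ref_hessian, w_h=0.1):
    # positions: (n_atoms, 3); model(positions) -> scalar energy (hypothetical interface)
    positions = positions.clone().requires_grad_(True)
    energy = model(positions)
    forces = -torch.autograd.grad(energy, positions, create_graph=True)[0]
    # second derivatives of the energy w.r.t. positions, shape (n, 3, n, 3)
    hessian = torch.autograd.functional.hessian(model, positions, create_graph=True)
    return (
        (energy - ref_energy).pow(2)
        + (forces - ref_forces).pow(2).mean()
        + w_h * (hessian - ref_hessian).pow(2).mean()
    )
```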
To reproduce our training runs for the OMat and OAM base models, run the following. First download the OMat and sAlex data:

```bash
./data/download_omat.sh <path to storage location>
```

Then symlink to `./data`:

```bash
ln -s <path to storage location>/omat ./data/omat
ln -s <path to storage location>/salex ./data/salex
ln -s <path to storage location>/mptrj-aselmdb ./data/mptrj-aselmdb
```

To train the OMat model, run:

```bash
uv run torchrun --nproc_per_node=4 nequix/torch_impl/train.py configs/nequix-omat-1.yml
```

This takes roughly 60 hours on a 4 x A100 node. To fine-tune the OAM model, copy
the OMat model to `models/nequix-omat-1.pt` and run:

```bash
uv run torchrun --nproc_per_node=4 nequix/torch_impl/train.py configs/nequix-oam-1.yml
```

This takes roughly 10 hours on a 4 x A100 node.
```bibtex
@article{koker2026pft,
  title={{PFT}: Phonon Fine-tuning for Machine Learned Interatomic Potentials},
  author={Koker, Teddy and Gangan, Abhijeet and Kotak, Mit and Marian, Jaime and Smidt, Tess},
  journal={arXiv preprint arXiv:2601.07742},
  year={2026}
}

@article{koker2025training,
  title={Training a foundation model for materials on a budget},
  author={Koker, Teddy and Kotak, Mit and Smidt, Tess},
  journal={arXiv preprint arXiv:2508.16067},
  year={2025}
}
```