This is the official repository for the paper **AdaMuon: Adaptive Muon Optimizer**.

AdaMuon is an effective optimizer built on Muon. In our experiments it achieves more than 40% higher training efficiency than AdamW.
This repository contains two projects: the GPT-2 experiments and the open-sourced Megatron-LM code, which we include to facilitate large-scale experiments.
To use AdaMuon in your own training pipeline on other architectures and datasets, use the following pseudocode as an example:

```python
from opt_config import configure_optimizers

# Model
model = Model()

# Optimizer
optimizer = configure_optimizers(model.parameters(), weight_decay=0.1, learning_rate=6e-4)

# Training
for epoch in range(epochs):
    for X, Y in data_loader:
        # standard training step
        optimizer.zero_grad()
        logits, loss = model(X, Y)
        loss.backward()
        optimizer.step()
        # ...
```

This repository is licensed under the Apache 2.0 license. See the LICENSE file for more details.
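As a rough illustration of what a `configure_optimizers`-style helper typically does for Muon-family optimizers, the sketch below routes 2-D weight matrices to the Muon-style update and everything else (biases, norms) to AdamW. This is an assumption about the helper's internals based on common Muon practice, not the actual implementation; the function name `split_param_groups` and the group keys are hypothetical.

```python
def split_param_groups(named_params, weight_decay, learning_rate):
    """Hypothetical sketch: partition parameters into Muon-style and AdamW groups.

    `named_params` is an iterable of (name, param) pairs, where each param
    exposes an `ndim` attribute (as PyTorch tensors do).
    """
    muon_params, adamw_params = [], []
    for name, p in named_params:
        # Muon-family optimizers operate on 2-D weight matrices;
        # 1-D tensors (biases, layer-norm weights) fall back to AdamW.
        (muon_params if p.ndim == 2 else adamw_params).append(p)
    return [
        {"params": muon_params, "use_muon": True,
         "lr": learning_rate, "weight_decay": weight_decay},
        {"params": adamw_params, "use_muon": False,
         "lr": learning_rate, "weight_decay": weight_decay},
    ]
```

See `opt_config.py` for the actual grouping logic used in this repository.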
If you find this work useful, please cite:

```bibtex
@article{si2025adamuon,
  title={AdaMuon: Adaptive Muon Optimizer},
  author={Si, Chongjie and Zhang, Debing and Shen, Wei},
  journal={arXiv preprint arXiv:2507.11005},
  year={2025}
}
```

If you have any questions, please raise an issue or contact us at [email protected].




