Scaling Laws for Upcycling Mixture-of-Experts Language Models

This is the official repository for our ICML'25 paper Scaling Laws for Upcycling Mixture-of-Experts Language Models. It contains the code and data needed to reproduce the analyses in the paper.

Structure

data: contains the data obtained from our scaling law experiments.
- data/result_8x.txt: results for training a Mixtral-like MoE from scratch.
- data/result.txt: results for training a dense LLM from scratch.
- data/result_upcycle_8x_topk_2.txt: results for upcycling a dense LLM into a Mixtral-like MoE.
- data/sparsity.csv: experimental data for fitting the sparsity-active parameter scaling law.
- data/ablate*: results for various ablation studies.
analysis.ipynb: contains an example of fitting the joint scaling law for Mixtral-like MoE.
analyze_sparsity.ipynb: contains an example of fitting the sparsity-active parameter scaling law.
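
The notebooks hold the actual fits. For readers new to scaling-law fitting, the following is a minimal sketch of the general idea, not the method used in the paper. It assumes a plain power law, loss = A * tokens^(-alpha), which is simpler than the joint law fitted in analysis.ipynb. The function name `fit_power_law`, the synthetic data points, and the values A = 20 and alpha = 0.3 are all hypothetical and do not come from the files under data.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of y = A * x**(-alpha), done as a line fit in log-log space."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    # Ordinary least squares for log(y) = intercept + slope * log(x)
    sxx = sum((a - mean_x) ** 2 for a in log_x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Map back: A = exp(intercept), alpha = -slope
    return math.exp(intercept), -slope


# Synthetic, noiseless points (hypothetical; not taken from this repository's data).
tokens = [1e9, 2e9, 4e9, 8e9, 1.6e10]
losses = [20.0 * t ** -0.3 for t in tokens]

A, alpha = fit_power_law(tokens, losses)
print(f"A={A:.3f}, alpha={alpha:.3f}")
```

Real loss curves usually include an irreducible-loss offset and depend on more than one variable, so a log-log line fit like this is only a starting point; fits of that kind generally need a nonlinear optimizer.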

License

This implementation is licensed under the Apache License 2.0.

Citation

If you find this work helpful, please consider citing our paper:

@inproceedings{liew2025scaling,
  title = {Scaling Laws for Upcycling Mixture-of-Experts Language Models},
  booktitle = {Forty-Second International Conference on Machine Learning},
  author = {Liew, Seng Pei and Kato, Takuya and Takase, Sho},
  year = {2025}
}
