Mxfp8 and nvfp4 by awni · Pull Request #2700 · ml-explore/mlx

awni · 2025-10-23T19:24:16Z

As title.

* Add quantize/dequantize slow path for mxfp8 and nvfp4
* fast cuda kernel for mx/nv quantization
* fallback for cuda < 12.8 (#2697)
* format (#2700)
* fix (#2701)
* metal kernels
* docs
* fix jit
* add default bits and group sizes
* improve quant docs
* fix output type of mxfp4 matmuls
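For readers unfamiliar with these formats, here is a rough pure-Python sketch of the block-scaled idea behind an nvfp4-style quantize/dequantize round trip. It is not the code in this PR and does not use the MLX API. It assumes the commonly described NVFP4 layout (groups of 16 values, 4-bit E2M1 elements) and, as a simplification, keeps each group's scale as a plain Python float instead of encoding it in FP8. All function names are hypothetical.

```python
# Illustrative only: a simplified block-scaled 4-bit float round trip.
# Assumptions (not taken from the PR): group size 16, E2M1 element grid,
# and a per-group scale stored as an ordinary float rather than FP8.

# Magnitudes representable by a 4-bit E2M1 float (sign is handled separately).
E2M1_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
GROUP_SIZE = 16


def quantize_group(values):
    """Return (scale, codes) for one group; codes are signed grid values."""
    amax = max(abs(v) for v in values)
    # Map the largest magnitude in the group onto the largest grid value.
    scale = amax / E2M1_VALUES[-1] if amax > 0 else 1.0
    codes = []
    for v in values:
        target = abs(v) / scale
        # Round to the nearest representable magnitude.
        nearest = min(E2M1_VALUES, key=lambda g: abs(g - target))
        codes.append(-nearest if v < 0 else nearest)
    return scale, codes


def quantize(values, group_size=GROUP_SIZE):
    """Split a flat list into groups and quantize each one independently."""
    if len(values) % group_size != 0:
        raise ValueError("length must be a multiple of the group size")
    return [
        quantize_group(values[i : i + group_size])
        for i in range(0, len(values), group_size)
    ]


def dequantize(groups):
    """Reconstruct approximate values as scale * code for every element."""
    out = []
    for scale, codes in groups:
        out.extend(scale * c for c in codes)
    return out


if __name__ == "__main__":
    data = [float(i - 8) for i in range(16)]
    approx = dequantize(quantize(data))
    for original, restored in zip(data, approx):
        print(f"{original:6.2f} -> {restored:6.3f}")
```

The real formats differ in how the shared scale is stored and in the group size, which is presumably why the commit list mentions default bits and group sizes per mode.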

format

9432603

awni force-pushed the mxfp8_and_nvfp4 branch from ac80bc9 to 9432603 Compare October 23, 2025 19:24

awni merged commit d7e3ad1 into ml-explore:mxfp8_and_nvfp4 Oct 23, 2025
0 of 2 checks passed

awni pushed a commit that referenced this pull request Oct 27, 2025

format (#2700)

7b34dc3

awni merged 1 commit into ml-explore:mxfp8_and_nvfp4 from awni:mxfp8_and_nvfp4