
fallback for cuda < 12.8 #2697

Merged

awni merged 1 commit into ml-explore:mxfp8_and_nvfp4 from awni:mxfp8_and_nvfp4 on Oct 23, 2025

Conversation

@awni (Member) commented Oct 23, 2025

As title.

@awni awni merged commit 8be324c into ml-explore:mxfp8_and_nvfp4 Oct 23, 2025
1 check was pending
awni pushed a commit that referenced this pull request Oct 27, 2025
awni pushed a commit that referenced this pull request Oct 28, 2025
* Add quantize/dequantize slow path for mxfp8 and nvfp4

* fast cuda kernel for mx/nv quantization

* fallback for cuda < 12.8 (#2697)

* format (#2700)

* fix (#2701)

* metal kernels

* docs

* fix jit

* add default bits and group sizes

* improve quant docs

* fix output type of mxfp4 matmuls
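The commit list above adds a slow (reference) quantize/dequantize path for the mx/nv formats, used as the fallback when the fast CUDA kernels are unavailable (CUDA < 12.8). As a rough illustration of the idea behind MXFP4-style microscaling quantization — groups of 32 elements sharing one power-of-two scale, each element rounded to a 4-bit E2M1 value — here is a hedged NumPy sketch. The function names, group handling, and rounding details are illustrative assumptions, not MLX's actual implementation:

```python
import numpy as np

# Magnitudes representable in FP4 (E2M1): max value is 6.0
FP4_VALUES = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_mxfp4(x):
    """Quantize a 1-D array (length a multiple of 32) into (codes, scales).

    Each group of 32 elements shares one power-of-two scale, chosen so
    the largest magnitude in the group fits within FP4's max value (6.0).
    """
    groups = x.reshape(-1, 32)
    amax = np.abs(groups).max(axis=1, keepdims=True)
    amax = np.maximum(amax, np.finfo(x.dtype).tiny)  # avoid log2(0)
    exp = np.ceil(np.log2(amax / 6.0))               # power-of-two exponent
    scales = 2.0 ** exp
    scaled = groups / scales
    # Round each scaled element to the nearest representable FP4 magnitude,
    # keeping the original sign.
    signs = np.sign(scaled)
    idx = np.abs(np.abs(scaled)[..., None] - FP4_VALUES).argmin(axis=-1)
    codes = signs * FP4_VALUES[idx]
    return codes, scales

def dequantize_mxfp4(codes, scales):
    """Reverse the quantization: multiply each group by its shared scale."""
    return (codes.reshape(-1, 32) * scales).ravel()
```

Such a pure-array path is slow but portable, which is exactly why it works as a fallback on older CUDA toolkits; the fast path would do the same grouping and rounding inside a fused kernel.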


1 participant