
Mxfp8 and nvfp4 #2700

Merged
awni merged 1 commit into ml-explore:mxfp8_and_nvfp4 from awni:mxfp8_and_nvfp4 on Oct 23, 2025



@awni (Member) commented on Oct 23, 2025

As title.

@awni merged commit d7e3ad1 into ml-explore:mxfp8_and_nvfp4 on Oct 23, 2025
0 of 2 checks passed
awni pushed a commit that referenced this pull request Oct 27, 2025
awni pushed a commit that referenced this pull request Oct 28, 2025
* Add quantize/dequantize slow path for mxfp8 and nvfp4

* fast cuda kernel for mx/nv quantization

* fallback for cuda < 12.8 (#2697)

* format (#2700)

* fix (#2701)

* metal kernels

* docs

* fix jit

* add default bits and group sizes

* improve quant docs

* fix output type of mxfp4 matmuls
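
For readers unfamiliar with the formats named in the commit list, here is a minimal pure-Python sketch of block-scaled FP4 quantization in the MX style: each block shares one power-of-two scale, and each element is rounded to the nearest 4-bit E2M1 value. This is an illustration only, not the code added in this PR. The function names `quantize_block` and `dequantize_block` are hypothetical, and the tie-breaking rule (ties go to the smaller magnitude) is a simplification. As far as I understand the formats, mxfp4 uses this scheme with 32-element blocks, mxfp8 swaps in 8-bit elements, and nvfp4 uses 16-element blocks with an FP8 scale instead of a power of two.

```python
import math

# Magnitudes representable by a 4-bit E2M1 float (sign is stored separately).
E2M1_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
E2M1_EMAX = 2  # exponent of the largest E2M1 value: 6.0 == 1.5 * 2**2


def quantize_block(block):
    """Quantize one block; return (scale, elements).

    `scale` is a shared power of two. `elements` holds the decoded E2M1
    values (not bit patterns), so that x is approximately scale * element.
    Hypothetical helper for illustration.
    """
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return 1.0, [0.0] * len(block)
    # Pick the scale so the block's largest magnitude lands near the top
    # of the E2M1 range.
    scale = 2.0 ** (math.floor(math.log2(amax)) - E2M1_EMAX)
    elements = []
    for x in block:
        # Clamp to the largest representable magnitude, then round to the
        # nearest table entry (ties resolve to the smaller magnitude).
        mag = min(abs(x) / scale, E2M1_VALUES[-1])
        nearest = min(E2M1_VALUES, key=lambda v: abs(v - mag))
        elements.append(math.copysign(nearest, x))
    return scale, elements


def dequantize_block(scale, elements):
    """Reconstruct approximate values from a shared scale and elements."""
    return [scale * e for e in elements]
```

For example, `quantize_block([0.1, -0.4, 3.0, 6.5])` picks a scale of 1.0, and dequantizing gives `[0.0, -0.5, 3.0, 6.0]`: small values snap to the coarse FP4 grid and the out-of-range 6.5 is clamped to 6.0.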
