Support FP8 quantize #212

zhiyuan1i · 2025-05-06T09:58:46Z

This PR adds FP8 quantization support to RWKV7. The actual matmul computation is still done in FP16 or BF16, as determined by PyTorch's _scaled_mm kernel. The kernel broadly supports SM75 and above devices, so I think it's reasonable for FP8 to replace the int8 quantization support.

zhiyuan1i marked this pull request as draft May 6, 2025 10:05

zhiyuan1i force-pushed the zhiyuan1i-fp8 branch from ed08817 to 8c2846b Compare May 6, 2025 10:10

zhiyuan1i marked this pull request as ready for review May 6, 2025 10:18

Support FP8 quantize

b629642

zhiyuan1i force-pushed the zhiyuan1i-fp8 branch from 8c2846b to b629642 Compare May 6, 2025 10:32
