Commit f9cd034
Feature: Support non-gated activation in cutlass fused MoE nvfp4 (#2011)
## Description
This PR removes an assertion in the CUTLASS fused MoE bindings to enable
non-gated activations in NVFP4.
It also adds a test for this path using the ReLU2 activation.
## Related Issues
N/A
## Pull Request Checklist
Thank you for contributing to FlashInfer! Before we review your pull
request, please make sure the following items are complete.
### Pre-commit Checks
- [x] I have installed `pre-commit` by running `pip install pre-commit`
(or used your preferred method).
- [x] I have installed the hooks with `pre-commit install`.
- [x] I have run the hooks manually with `pre-commit run --all-files`
and fixed any reported issues.
> If you are unsure about how to set up `pre-commit`, see [the
pre-commit documentation](https://pre-commit.com/).
## Tests
- [x] Tests have been added or updated as needed.
- [x] All tests are passing (`unittest`, etc.).
## Reviewer Notes
N/A
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
  * Enhanced quantized Mixture of Experts models to support configurable
    activation types (Swiglu and ReLU2) in the NVFP4 quantization path.
  * Improved parameter handling to correctly adapt weight shapes and
    quantization settings based on the selected activation type.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
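
To make the gated versus non-gated distinction concrete, here is a minimal Python sketch. It is not FlashInfer code: every name in it (`relu2`, `swiglu`, `fc1_weight_shape`, `GATED_ACTIVATIONS`) is hypothetical, and it assumes the common fused-MoE convention that a gated activation stacks the gate and up projections in the first expert GEMM, doubling its output rows, while a non-gated activation such as ReLU2 needs only one projection. Shapes are logical (unpacked) shapes, ignoring NVFP4 packing and scale factors.

```python
import math

# Hypothetical names for illustration only; these are not FlashInfer APIs.
GATED_ACTIVATIONS = {"swiglu"}


def relu2(x: float) -> float:
    """Non-gated activation: squared ReLU, max(x, 0) ** 2."""
    r = max(x, 0.0)
    return r * r


def swiglu(gate: float, up: float) -> float:
    """Gated activation: SiLU(gate) * up, which consumes two projections."""
    return gate / (1.0 + math.exp(-gate)) * up


def fc1_weight_shape(activation, num_experts, hidden_size, intermediate_size):
    """Logical shape of the first expert GEMM weight (assumed convention).

    Gated activations need gate and up projections, so the row count doubles;
    non-gated activations use a single projection.
    """
    factor = 2 if activation in GATED_ACTIVATIONS else 1
    return (num_experts, factor * intermediate_size, hidden_size)


swiglu_shape = fc1_weight_shape("swiglu", 8, 128, 256)
relu2_shape = fc1_weight_shape("relu2", 8, 128, 256)
```

Under these assumptions the ReLU2 weight has half as many rows as the Swiglu one, which is the kind of shape adaptation the release notes above describe.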
---------
Signed-off-by: Omer Ullman Argov <[email protected]>

1 parent 1181c5d commit f9cd034
File tree

2 files changed, +76 -36 lines changed
- csrc/fused_moe/cutlass_backend
- tests/moe

Lines changed: 36 additions & 19 deletions
Changed hunks (line contents not captured):

| Original file lines | New file lines | Lines removed | Lines added |
|---|---|---|---|
| 361-368 | 361-368 | 2 | 2 |
| 542-549 | 542-549 | 2 | 2 |
| 809-817 | 809-818 | 3 | 4 |
| 1013-1030 | 1014-1047 | 12 | 28 |

Changed hunks in the second file (line contents not captured):

| Original file lines | New file lines | Lines removed | Lines added |
|---|---|---|---|
| 17-22 | 17-23 | 0 | 1 |
| 137-143 | 138-144 | 1 | 1 |
| 147-159 | 148-173 | 4 | 17 |
| 363-368 | 377-387 | 0 | 5 |
| 376-381 | 395-401 | 0 | 1 |
| 391-400 | 411-420 | 3 | 3 |
| 409-416 | 429-436 | 2 | 2 |
| 424-430 | 444-450 | 1 | 1 |
| 469-474 | 489-495 | 0 | 1 |
| 483-489 | 504-510 | 1 | 1 |
| 504-515 | 525-538 | 5 | 7 |