
bugfix: fix the enum/int type mismatch mentioned in #2507 #2508

Merged
yzh119 merged 3 commits into flashinfer-ai:main from yzh119:hotfix-2507
Feb 14, 2026

Conversation

@yzh119
Collaborator

@yzh119 yzh119 commented Feb 6, 2026

📌 Description

As mentioned in #2507, the trtllm_fp8_per_tensor_scale_moe function would fail when passed an integer activation_type.
This PR fixes the type mismatch.
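
To make the failure mode concrete, here is a minimal, self-contained sketch (illustrative only, not FlashInfer's actual code; the enum members and helper names are made up) of how a plain integer passed where an ActivationType enum member is expected can break dispatch, and how normalizing through the enum constructor avoids it:

    from enum import IntEnum

    class ActivationType(IntEnum):  # stand-in for flashinfer's ActivationType
        Swiglu = 0
        Geglu = 1   # illustrative second member

    def _impl(activation_type: ActivationType) -> str:
        # Dispatch that assumes an enum member; a bare int has no .name attribute.
        return f"running MoE with activation {activation_type.name}"

    def moe(activation_type=ActivationType.Swiglu) -> str:
        # Before the fix: forwarding the raw argument fails for integer inputs.
        # After the fix: normalize first, so both ints and enum members work.
        return _impl(ActivationType(activation_type))

    print(moe(0))                     # integer input, as reported in #2507
    print(moe(ActivationType.Geglu))  # enum input still works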

🔍 Related Issues

#2507

🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

✅ Pre-commit Checks

  • I have installed pre-commit by running pip install pre-commit (or used your preferred method).
  • I have installed the hooks with pre-commit install.
  • I have run the hooks manually with pre-commit run --all-files and fixed any reported issues.

If you are unsure about how to set up pre-commit, see the pre-commit documentation.

🧪 Tests

  • Tests have been added or updated as needed.
  • All tests are passing (unittest, etc.).

Reviewer Notes

Summary by CodeRabbit

  • Refactor

    • Public APIs updated: activation_type now accepts integer values (defaults adjusted to numeric activation codes).
    • Call sites and public function signatures aligned to use the numeric activation_type.
  • Tests

    • Test inputs updated to supply numeric activation_type values instead of enum members.

@gemini-code-assist
Contributor

Summary of Changes

Hello @yzh119, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves a reported bug concerning a type mismatch within the trtllm_fp8_per_tensor_scale_moe function. By ensuring the activation_type parameter is correctly cast to its expected enum type, the change prevents runtime failures and enhances the function's reliability when handling different input types.

Highlights

  • Type Mismatch Fix: Corrected an issue in the trtllm_fp8_per_tensor_scale_moe function where passing an integer activation_type would cause a failure. The fix involves explicitly casting the activation_type parameter to ActivationType before passing it to the underlying implementation.


Changelog
  • flashinfer/fused_moe/core.py
    • Modified the call to _trtllm_fp8_per_tensor_scale_moe_impl to explicitly wrap the activation_type argument with ActivationType() to resolve a type mismatch.
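
In code terms, the change described above amounts to roughly the following before/after of that call site (all other arguments stay elided as "..."; this is a sketch, not the exact diff):

    # before: the raw argument (possibly a plain int) was forwarded directly
    #     _trtllm_fp8_per_tensor_scale_moe_impl(..., activation_type)
    # after: the argument is normalized to the enum first
    #     _trtllm_fp8_per_tensor_scale_moe_impl(..., ActivationType(activation_type))
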
Activity
  • The author has confirmed completion of pre-commit checks.
  • Tests have been added or updated as needed, and all tests are passing.

@coderabbitai
Contributor

coderabbitai bot commented Feb 6, 2026

📝 Walkthrough

Replaced enum-typed ActivationType parameters with integer activation_type (using .value) across fused MoE public APIs and call sites in flashinfer/fused_moe/core.py, and updated tests to pass enum .value integers accordingly.

Changes

  • Fused MoE core — flashinfer/fused_moe/core.py: Changed public function/constructor signatures and call sites to accept activation_type: int (wired to ActivationType.Swiglu.value) instead of ActivationType; replaced uses of activation_type.value with activation_type when forwarding to C++/autotuning paths.
  • Tests — tests/moe/test_trtllm_gen_fused_moe.py: Updated parametrization to pass ActivationType enum .value integers instead of enum members, so tests match the new int-based API expectations.
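
As a standalone illustration of that signature style (assumed names only, not the actual FlashInfer code), the public wrapper takes a plain int defaulting to ActivationType.Swiglu.value and forwards it unchanged, so no .value unwrapping is needed on the way to the C++/autotuning path:

    from enum import IntEnum

    class ActivationType(IntEnum):  # stand-in for flashinfer's ActivationType
        Swiglu = 0

    def lowlevel_op(activation_type: int) -> int:
        # Stand-in for the C++/autotuning entry point, which consumes a raw int.
        return activation_type

    def public_moe_api(activation_type: int = ActivationType.Swiglu.value) -> int:
        # Forward the integer as-is; an IntEnum member also passes through,
        # since it behaves like its underlying integer value.
        return lowlevel_op(int(activation_type))

    assert public_moe_api() == ActivationType.Swiglu.value
    assert public_moe_api(ActivationType.Swiglu) == 0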

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested labels

run-ci

Suggested reviewers

  • djmmoss
  • cyx-6
  • jiahanc
  • nv-yunzheq
  • IwakuraRein

Poem

🐰 I hopped through code both day and night,
Replaced the enum with a number light,
Values travel clean, no wrapper in sight,
The fences lowered, the logic tight.

🚥 Pre-merge checks | ✅ 2 passed | ❌ 2 warnings

❌ Failed checks (2 warnings)
  • Docstring Coverage ⚠️ Warning — Docstring coverage is 57.14%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
  • Merge Conflict Detection ⚠️ Warning — Merge conflicts detected (43 files):

⚔️ benchmarks/bench_cute_dsl_blockscaled_gemm.py (content)
⚔️ benchmarks/bench_trtllm_gen_fused_moe_autotuner.py (content)
⚔️ benchmarks/routines/flashinfer_benchmark_utils.py (content)
⚔️ benchmarks/routines/gemm.py (content)
⚔️ csrc/flashinfer_sampling_binding.cu (content)
⚔️ csrc/gdn_prefill_launcher.cu (content)
⚔️ csrc/nv_internal/tensorrt_llm/kernels/communicationKernels/moeAlltoAllKernels.cu (content)
⚔️ csrc/sampling.cu (content)
⚔️ csrc/trtllm_fmha_kernel_launcher.cu (content)
⚔️ docker/Dockerfile.cu126 (content)
⚔️ docker/Dockerfile.cu128 (content)
⚔️ docker/Dockerfile.cu129 (content)
⚔️ docker/Dockerfile.cu130 (content)
⚔️ flashinfer/__init__.py (content)
⚔️ flashinfer/aot.py (content)
⚔️ flashinfer/artifacts.py (content)
⚔️ flashinfer/cute_dsl/__init__.py (content)
⚔️ flashinfer/cute_dsl/blockscaled_gemm.py (content)
⚔️ flashinfer/cute_dsl/utils.py (content)
⚔️ flashinfer/decode.py (content)
⚔️ flashinfer/fused_moe/__init__.py (content)
⚔️ flashinfer/fused_moe/core.py (content)
⚔️ flashinfer/gemm/__init__.py (content)
⚔️ flashinfer/gemm/gemm_base.py (content)
⚔️ flashinfer/jit/__init__.py (content)
⚔️ flashinfer/jit/gemm/__init__.py (content)
⚔️ flashinfer/jit/gemm/core.py (content)
⚔️ flashinfer/mla.py (content)
⚔️ flashinfer/prefill.py (content)
⚔️ flashinfer/sampling.py (content)
⚔️ flashinfer/triton/__init__.py (content)
⚔️ flashinfer/utils.py (content)
⚔️ include/flashinfer/sampling.cuh (content)
⚔️ include/flashinfer/trtllm/fmha/fmhaKernels.cuh (content)
⚔️ include/flashinfer/trtllm/fmha/fmhaRunnerParams.h (content)
⚔️ include/flashinfer/trtllm/fmha/kernelParams.h (content)
⚔️ scripts/authorized_codeowner.txt (content)
⚔️ scripts/task_run_unit_tests.sh (content)
⚔️ scripts/test_utils.sh (content)
⚔️ tests/attention/test_trtllm_gen_attention.py (content)
⚔️ tests/gemm/test_bmm_fp8.py (content)
⚔️ tests/gemm/test_cute_dsl_blockscaled_gemm.py (content)
⚔️ tests/moe/test_trtllm_gen_fused_moe.py (content)

These conflicts must be resolved before merging into main.
Resolve conflicts locally and push changes to this branch.
✅ Passed checks (2 passed)
  • Title check ✅ Passed — The title clearly and concisely identifies the main purpose: fixing an enum/int type mismatch in a specific function, with reference to the related issue #2507.
  • Description check ✅ Passed — The PR description addresses the core issue, provides context from the related issue, and completes most checklist items. However, "All tests are passing" is explicitly unchecked despite test updates being made.


No actionable comments were generated in the recent review. 🎉



Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request addresses a type mismatch issue in the trtllm_fp8_per_tensor_scale_moe function where an integer activation_type was passed to a function expecting an ActivationType enum. The change correctly wraps the integer value with the ActivationType enum constructor, resolving the bug. The fix is concise and accurate. I've reviewed other related functions and they do not appear to have the same issue. The change is approved.

@dbari
Contributor

dbari commented Feb 6, 2026

I adapted the tests to match the function signature and made it consistent with the fp4 functions here:
https://github.com/dbari/flashinfer/tree/hotfix-2507

Feel free to include it in this PR; I didn't want to open a new one for the same thing, to avoid confusion.
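
For reference, a parametrization in that style might look roughly like the sketch below (the import path, test name, and assertion are assumptions for illustration; the real tests in tests/moe/test_trtllm_gen_fused_moe.py exercise the actual kernels):

    import pytest

    from flashinfer.fused_moe import ActivationType  # assumed import path

    @pytest.mark.parametrize(
        "activation_type",
        [ActivationType.Swiglu.value],  # pass the integer value, not the enum member
    )
    def test_moe_accepts_int_activation_type(activation_type):
        # The kernel invocation is elided here; the point is only that the
        # parametrized values are plain ints matching the int-based API.
        assert isinstance(activation_type, int)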

Collaborator

@aleozlx aleozlx left a comment

lgtm

@IwakuraRein
Collaborator

IwakuraRein commented Feb 12, 2026

trtllm_fp4_block_scale_moe_op has activation_type: int = ActivationType.Swiglu.value. Maybe it's better to unify trtllm_fp4_block_scale_moe_op and trtllm_fp8_per_tensor_scale_moe.
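
A follow-up in that direction could simply give both wrappers the same int-typed parameter, roughly as sketched below (other parameters elided; these are not the actual signatures):

    # trtllm_fp4_block_scale_moe_op(..., activation_type: int = ActivationType.Swiglu.value, ...)
    # trtllm_fp8_per_tensor_scale_moe(..., activation_type: int = ActivationType.Swiglu.value, ...)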

@yzh119 yzh119 added the v0.6.4 release blocker label Feb 13, 2026
@yzh119
Collaborator Author

yzh119 commented Feb 13, 2026

Hi @dbari, I have merged your commits.

Maybe it's better to unify trtllm_fp4_block_scale_moe_op and trtllm_fp8_per_tensor_scale_moe

@IwakuraRein Can you work on a follow-up PR? This one might be merged into v0.6.4 as a hotfix.

@yzh119 yzh119 merged commit f4d10a7 into flashinfer-ai:main Feb 14, 2026
28 checks passed

Labels

v0.6.4 release blocker — label for v0.6.4
