fix: blockscale moe routine supports non-DS routing#2476

Merged
bkryu merged 2 commits into flashinfer-ai:main from hypdeb:patch-1 on Feb 3, 2026

Conversation


@hypdeb hypdeb commented Feb 3, 2026

📌 Description

The comment at this location previously stated that only float32 routing logits were supported for blockscale MoE. However, the launcher now appears to support bfloat16 for non-DeepSeek routing.

Summary by CodeRabbit

Bug Fixes

  • Per-method routing precision handling updated: the model now selects routing tensor precision based on the chosen routing method (e.g., DeepSeekV3 uses higher-precision), replacing fixed per-path precision. This improves routing stability and consistency across routing configurations without changing public APIs.

@gemini-code-assist

Summary of Changes

Hello @hypdeb, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses an inaccuracy in the data type handling for Mixture-of-Experts (MoE) routing logits within the fp8_block_scale kernel. The change ensures that the benchmark routine correctly aligns with the launcher's current capabilities, allowing for bfloat16 usage in specific routing scenarios, thereby improving the fidelity of the benchmark to the actual runtime behavior.

Highlights

  • Blockscale MoE Routing Logits Data Type Correction: Updated the create_trtllm_moe_test_data function to accurately reflect the required data types for routing logits in fp8_block_scale MoE kernels. Previously, float32 was assumed to always be necessary, but the routine now correctly uses bfloat16 for non-DeepSeekV3 routing methods, while DeepSeekV3 continues to use float32.


Changelog
  • benchmarks/routines/moe.py
    • Modified the create_trtllm_moe_test_data function to introduce conditional logic for the routing_logits data type when moe_kernel_type is fp8_block_scale.
    • Implemented a check to set routing_logits to float32 if routing_method_type is 2 (DeepSeekV3), and to bfloat16 otherwise.
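
The conditional described in the changelog can be sketched as follows. This is a minimal, hypothetical rendering based only on this thread (routing method code 2 = DeepSeekV3), not the actual implementation in create_trtllm_moe_test_data:

```python
# Hedged sketch of the fp8_block_scale dtype selection described in the
# changelog. The routing-method code (2 == DeepSeekV3) is taken from this
# thread; the real logic lives in benchmarks/routines/moe.py.
def routing_logits_dtype(routing_method_type: int) -> str:
    # DeepSeekV3 routing still requires float32 logits; every other
    # routing method can now use bfloat16 with fp8_block_scale kernels.
    return "float32" if routing_method_type == 2 else "bfloat16"

print(routing_logits_dtype(2))  # float32
print(routing_logits_dtype(1))  # bfloat16
```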



coderabbitai bot commented Feb 3, 2026

📝 Walkthrough

Adjusts MoE benchmark test-data generation to choose the routing_logits dtype based on routing_method_type: for fp8 block-scale, DeepSeekV3 (type 2) uses float32 and all other methods use bfloat16; for fp8 per-tensor the logic is inverted for the DeepSeek method (bfloat16 for type 3, float32 otherwise).

Changes

Cohort / File(s): MOE Test Data Generation — benchmarks/routines/moe.py
Summary: Replaces hard-coded routing_logits dtypes with conditional selection based on routing_method_type across the FP8 block-scale and FP8 per-tensor paths (the DeepSeek-specific branches now set the routing dtype explicitly).
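
Both paths can be sketched together as below. The kernel-type strings and the method codes (2 = DeepSeekV3 for block-scale, 3 = the DeepSeek method for per-tensor) are assumptions read off this thread, not verified constants from the repository:

```python
# Hedged sketch of the per-kernel routing_logits dtype selection summarized
# above; kernel-type strings and routing-method codes are assumptions taken
# from this PR thread rather than from the repository itself.
def routing_logits_dtype(moe_kernel_type: str, routing_method_type: int) -> str:
    if moe_kernel_type == "fp8_block_scale":
        # DeepSeekV3 (type 2) keeps float32; other methods take bfloat16.
        return "float32" if routing_method_type == 2 else "bfloat16"
    if moe_kernel_type == "fp8_per_tensor":
        # Inverted relative to block-scale: the DeepSeek method (type 3)
        # takes bfloat16 here, and every other method uses float32.
        return "bfloat16" if routing_method_type == 3 else "float32"
    raise ValueError(f"unhandled moe_kernel_type: {moe_kernel_type!r}")
```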

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested reviewers

  • Anerudhan
  • cyx-6
  • jiahanc


🚥 Pre-merge checks: 2 passed, 1 inconclusive

❓ Description check — Inconclusive: The description explains the change but lacks details on testing and pre-commit checks. It provides the motivation (the launcher now supports bfloat16 for non-DS routing) but omits required checklist items. Resolution: add sections for test updates and test results, and complete the pre-commit checklist to fully meet the repository's PR template requirements.
✅ Title check — Passed: The title accurately reflects the main change: blockscale MoE now supports non-DeepSeek routing with dtype handling based on routing method type.
✅ Docstring coverage — Passed: Docstring coverage is 100.00%, above the required 80.00% threshold.




@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request correctly updates the data type for routing logits in the fp8_block_scale MoE routine to support non-DeepSeek routing methods with bfloat16. I've suggested a minor refactoring to improve code clarity and reduce duplication within the modified block. Additionally, this change makes the logic for fp8_block_scale identical to the existing logic for fp4_block_scale. You might consider refactoring these two blocks together in a follow-up to further improve maintainability.


@bkryu bkryu left a comment


Thanks @hypdeb for the quick fix. Left one comment echoing Gemini's comment. Otherwise, this should be good to go.


@bkryu bkryu left a comment


LGTM

@bkryu bkryu merged commit 9e069e7 into flashinfer-ai:main Feb 3, 2026
20 checks passed
raayandhar pushed a commit to raayandhar/flashinfer that referenced this pull request Feb 5, 2026
…nfer-ai#2476)

@hypdeb hypdeb deleted the patch-1 branch March 3, 2026 08:01