unittest: Add head dim 256 test cases and mark as xfail #1999

bkryu · 2025-10-28T18:53:33Z

📌 Description

Adds unit tests for head_dim=256 cases for trtllm-gen decode and marks them as xfail.

Renames test_trtllm_batch_decode to _test_trtllm_batch_decode as a base function. Test functions now call _test_trtllm_batch_decode with a group of parameter combinations:

- test_trtllm_batch_decode --> 1632 existing parameter combinations
- test_trtllm_batch_decode_bs1 --> 1 xfail case with batch size 1
- test_trtllm_batch_decode_head_dim_256 --> 40 xfail cases with head_dim=256
- test_trtllm_batch_decode_long_sequence_length --> 48 cases of long seqlen

🔍 Related Issues

#1993

🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

✅ Pre-commit Checks

I have installed pre-commit by running pip install pre-commit (or used your preferred method).
I have installed the hooks with pre-commit install.
I have run the hooks manually with pre-commit run --all-files and fixed any reported issues.

If you are unsure about how to set up pre-commit, see the pre-commit documentation.

🧪 Tests

Tests have been added or updated as needed.
All tests are passing (unittest, etc.).

Reviewer Notes

Summary by CodeRabbit

Tests
- Refactored batch-decode tests into a parameterized wrapper to improve coverage across head-dimension variants.
- Added new scenarios including head_dim=256 and long-sequence-length cases (head_dim=256 marked as expected failures).
- Updated single-batch tests to include head-dimension parameterization.
- Removed direct execution blocks to standardize on pytest-based test runs.

coderabbitai · 2025-10-28T18:53:59Z

Walkthrough

Tests updated to parameterize batch-decode tests with a new head_dim parameter, add head_dim=256 variants (marked xfail), refactor common helper _test_trtllm_batch_decode, and remove the if __name__ == "__main__" invocation block for pytest execution.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Test Parameterization & New Head-Dim Variants `tests/attention/test_trtllm_gen_attention.py` | Added `head_dim` parameter to `test_trtllm_batch_decode` and `test_trtllm_batch_decode_bs1`; introduced `_test_trtllm_batch_decode` helper; added `test_trtllm_batch_decode_head_dim_256` and long-sequence/head_dim=256 variants (marked xfail); expanded parameterized test combinations; removed `__main__` block. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Review points:
- Ensure head_dim is passed through all helper and test call paths.
- Confirm xfail markers are correctly applied to head_dim=256 variants.
- Verify removal of __main__ doesn't affect local debug patterns.

Poem

🐰 I hopped through tests with nimble pace,
I passed a head_dim into the race,
Two-five-six waves a cheeky sigh,
Parametric hops reach for the sky,
A rabbit cheers for cleaner test-space.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 14.29% which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title Check | ✅ Passed | The PR title "unittest: Add head dim 256 test cases and mark as xfail" directly corresponds to the main changes in the changeset. The raw summary confirms that new test variants for head_dim=256 were introduced and marked as xfail, which is exactly what the title describes. The title is concise, clear, and specific enough that a teammate reviewing the git history would understand the primary change without needing to read the full diff. While the title doesn't mention secondary changes like the test refactoring or removal of main blocks, this is appropriate and expected per the guidelines. |
| Description Check | ✅ Passed | The pull request description follows the required template structure and includes all mandatory sections. The 📌 Description section clearly explains the changes (adding unit tests for `head_dim=256` cases marked as xfail, refactoring test functions with parametrization), the 🔍 Related Issues section links to issue #1993, and the 🚀 Pull Request Checklist is complete with all three pre-commit checks and both test items marked as completed. The Reviewer Notes section is left blank, which is acceptable since the template explicitly marks it as optional. The description provides specific, actionable information about the test reorganization and new test variants created. |


yzh119 · 2025-10-28T19:01:35Z

tests/attention/test_trtllm_gen_attention.py

+    head_dim,
+):
+    pytest.xfail("trtllm-gen decode gets incorrect output with head_dim = 256")
+    test_trtllm_batch_decode(


It's unusual practice (though it's totally okay to do so, and it's not introduced in this PR) to call one test_* function inside another one; all top-level functions with the prefix test_ are treated as standalone unit tests.

Can we create a function _test_trtllm_batch_decode as the common body of these unittests, instead of calling another top-level test_trtllm_batch_decode function?
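The naming point in this comment can be illustrated with a simplified model of the collection rule. This is a sketch only: `collected_test_names` and the three functions are hypothetical, and real pytest discovery also handles files and classes and can be reconfigured (for example through the `python_functions` ini option). The model covers just the prefix convention mentioned above.

```python
def collected_test_names(namespace):
    """Simplified model: pick callables whose name starts with ``test_``."""
    return sorted(
        name
        for name, obj in namespace.items()
        if callable(obj) and name.startswith("test_")
    )


def _test_decode_base(head_dim):
    # Shared body. The leading underscore keeps it out of collection.
    assert head_dim > 0


def test_decode():
    _test_decode_base(128)


def test_decode_head_dim_256():
    _test_decode_base(256)


module_namespace = {
    "_test_decode_base": _test_decode_base,
    "test_decode": test_decode,
    "test_decode_head_dim_256": test_decode_head_dim_256,
}
print(collected_test_names(module_namespace))
# prints ['test_decode', 'test_decode_head_dim_256']
```

With an underscore-prefixed helper, the shared body runs only through its wrappers, whereas a shared body named test_* would also be collected and run as a test in its own right.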

bkryu

Hi @yzh119, I think this PR is a good opportunity to make this change.

I have renamed test_trtllm_batch_decode to _test_trtllm_batch_decode as a base function.

The following test functions call _test_trtllm_batch_decode with a group of parameter combinations:

- test_trtllm_batch_decode --> 1632 existing parameter combinations
- test_trtllm_batch_decode_bs1 --> 1 xfail case with batch size 1
- test_trtllm_batch_decode_head_dim_256 --> 40 xfail cases with head_dim=256
- test_trtllm_batch_decode_long_sequence_length --> 48 cases of long seqlen

The long seqlen cases were added because I saw #1968 and tested what happens with long seqlens. We start to see failures starting from 4k.
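One way to express a length-dependent expectation in pytest is to attach marks per parameter with `pytest.param`. The sketch below is an assumption-laden illustration, not what this PR does: the helper `seq_len_params`, the threshold of 4096 (reading "4k" as 4096 tokens), the listed lengths, and the choice to mark long cases as xfail are all hypothetical.

```python
import pytest

# Assumption: "4k" is read as 4096 tokens.
FAILURE_THRESHOLD = 4096


def seq_len_params(seq_lens, threshold=FAILURE_THRESHOLD):
    """Wrap each length in pytest.param, marking lengths >= threshold as xfail."""
    params = []
    for n in seq_lens:
        marks = []
        if n >= threshold:
            marks.append(
                pytest.mark.xfail(reason=f"known failure at seq_len >= {threshold}")
            )
        params.append(pytest.param(n, marks=marks, id=f"seqlen{n}"))
    return params


@pytest.mark.parametrize("max_seq_len", seq_len_params([1024, 4096, 8192]))
def test_long_sequence_length(max_seq_len):
    # Placeholder for the real decode-vs-reference comparison.
    assert max_seq_len > 0
```

Unlike an imperative `pytest.xfail(...)` call, a marker lets the test body run, so a case that starts passing shows up as XPASS (or as a failure if `strict=True` is set on the marker).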

yzh119

LGTM

yzh119 · 2025-10-29T00:18:44Z

/bot run

flashinfer-bot · 2025-10-29T00:19:15Z

GitLab MR !96 has been created, and the CI pipeline #37485772 is currently running. I'll report back once the pipeline job completes.

flashinfer-bot · 2025-10-29T06:05:37Z

[SUCCESS] Pipeline #37485772: 13/17 passed

bkryu changed the title ~~Add head dim 256 test cases and mark as xfial~~ unittest: Add head dim 256 test cases and mark as xfial Oct 28, 2025

bkryu changed the title ~~unittest: Add head dim 256 test cases and mark as xfial~~ unittest: Add head dim 256 test cases and mark as xfail Oct 28, 2025

bkryu marked this pull request as ready for review October 28, 2025 18:54

yzh119 reviewed Oct 28, 2025


bkryu marked this pull request as draft October 28, 2025 20:32

bkryu added 3 commits October 28, 2025 21:26

Add head dim 256 test cases and mark as xfial

06b331c

Refactor tests to base test

9b6e17c

Adding long seqlen tests

097308f

bkryu force-pushed the trtllm_gen_decode_head_256 branch from 1993033 to 097308f Compare October 28, 2025 21:46

bkryu marked this pull request as ready for review October 28, 2025 21:47

yzh119 approved these changes Oct 29, 2025


yzh119 merged commit ebb610c into flashinfer-ai:main Oct 29, 2025
4 checks passed

bkryu deleted the trtllm_gen_decode_head_256 branch October 30, 2025 17:18

bkryu mentioned this pull request Oct 30, 2025

[Bug]: TRTLLM attention + full cudagraph produces incorrect output at long context length (>128k) on Blackwell #1968

Open

coderabbitai bot mentioned this pull request Oct 31, 2025

test: Enable xfailed trtllm decode long seqlen tests and update microbenchmark #2018

Merged


This was referenced Nov 18, 2025

test: Enable testing for trtllm-gen decode bs1 #2103

Merged

fix: some bugs of headDim 256 trtllm-gen fmha kernels. #2137

Merged

feat: add trtllm-gen per-tensor sparseMla kernels. #2138

Merged
