If I change head_dim from 128 to 256 here
and run
pytest flashinfer/tests/attention/test_trtllm_gen_attention.py::test_trtllm_batch_decode
I see 756 failed tests.
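To compare failure counts between the head_dim=128 and head_dim=256 runs without reading the full log, a small helper along these lines may help. This is only a sketch: `parse_pytest_summary` is a hypothetical name, not part of pytest or FlashInfer, and the sample summary line is made up for illustration rather than copied from the failing run.

```python
import re

# Hypothetical helper (not part of pytest or FlashInfer): pull the outcome
# counts out of the final summary line that pytest prints after a run.
_OUTCOME = re.compile(
    r"(\d+) (failed|passed|skipped|deselected|xfailed|xpassed|errors?|warnings?)"
)

def parse_pytest_summary(line: str) -> dict[str, int]:
    """Return a mapping such as {"failed": 3, "passed": 2} from a summary line."""
    return {outcome: int(number) for number, outcome in _OUTCOME.findall(line)}

# Illustrative line only; paste the real last line of the pytest output here.
sample = "===== 3 failed, 2 passed in 0.10s ====="
counts = parse_pytest_summary(sample)
print(counts.get("failed", 0))
```

Running the test once with each head_dim value and passing the last line of each output to the helper gives two counts that can be compared directly.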
FlashInfer version
uv pip show flashinfer-python
Name: flashinfer-python
Version: 0.4.1
Context: this head_dim value (256) comes from the Qwen3-Next model.