[ROCm][AITER] bugfix accuracy regression in ROCM_AITER_TRITON_MLA backend #31816

vllmellm · 2026-01-06T16:03:12Z

Purpose

Running DeepSeek with the ROCM_AITER_TRITON_MLA backend results in low accuracy, as follows:
command:
export VLLM_USE_V1=1
export SAFETENSORS_FAST_GPU=1
export VLLM_ROCM_USE_AITER=1
export VLLM_ATTENTION_BACKEND=ROCM_AITER_TRITON_MLA
vllm serve deepseek-ai/DeepSeek-V3 -tp 8
lm_eval score:

|Tasks|Version|Filter|n-shot|Metric| |Value| |Stderr|
|-----|------:|------|-----:|------|---|-----:|---|-----:|
|gsm8k|3|flexible-extract|5|exact_match|↑|0.0067|±|0.0047|
| | |strict-match|5|exact_match|↑|0.0000|±|0.0000|

This PR fixes the issue by making AiterTritonMLABackend inherit from AiterMLABackend instead of MLACommonBackend, since AiterMLABackend is shared by the AITER Triton backend as well.

Test Plan

Verify the lm_eval score:

export VLLM_USE_V1=1
export SAFETENSORS_FAST_GPU=1
export VLLM_ROCM_USE_AITER=1
export VLLM_ATTENTION_BACKEND=ROCM_AITER_TRITON_MLA
vllm serve deepseek-ai/DeepSeek-V3 -tp 8

lm_eval --model local-completions \
  --tasks gsm8k \
  --model_args model=deepseek-ai/DeepSeek-V3,base_url=http://localhost:8000/v1/completions \
  --trust_remote_code \
  --num_fewshot 5 \
  --limit 300 \
  --batch_size 128

Test Result

|Tasks|Version|Filter|n-shot|Metric| |Value| |Stderr|
|-----|------:|------|-----:|------|---|-----:|---|-----:|
|gsm8k|3|flexible-extract|5|exact_match|↑|0.96|±|0.0113|
| | |strict-match|5|exact_match|↑|0.96|±|0.0113|
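
As a sanity check on the reported numbers, the Stderr column can be reproduced from the accuracy and the sample count. The sketch below assumes (this is not stated in the PR) that each of the 300 samples from `--limit 300` is scored 0 or 1, that the standard error is the sample standard deviation divided by the square root of n, and that the low-accuracy run also used 300 samples. The helper name `binary_stderr` is made up for illustration.

```python
import math


def binary_stderr(num_correct: int, n: int) -> float:
    """Standard error of the mean for n binary (0/1) scores.

    Assumption: sample variance with an (n - 1) denominator, then
    divided by n to get the variance of the mean.
    """
    p = num_correct / n
    sample_variance = n * p * (1.0 - p) / (n - 1)
    return math.sqrt(sample_variance / n)


# After the fix: 0.96 accuracy on 300 samples corresponds to 288 correct.
stderr_after = binary_stderr(288, 300)

# Before the fix: 0.0067 on 300 samples would correspond to 2 correct.
stderr_before = binary_stderr(2, 300)

print(f"after:  {stderr_after:.4f}")
print(f"before: {stderr_before:.4f}")
```

Under these assumptions the values round to the 0.0113 and 0.0047 shown in the tables, which is consistent with both runs being scored on 300 samples.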

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
(Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: vllmellm <[email protected]>

gemini-code-assist

Code Review

This pull request addresses a critical accuracy regression in the ROCM_AITER_TRITON_MLA attention backend. The fix involves changing the base class of AiterTritonMLABackend from MLACommonBackend to AiterMLABackend. This change correctly allows AiterTritonMLABackend to inherit necessary configurations, like get_supported_kernel_block_sizes, from the shared AiterMLABackend, which is the intended base for AITER backends on ROCm. The changes also include cleaning up imports and removing a redundant method override, improving code clarity. The fix is logical, well-contained, and its effectiveness is demonstrated by the significant improvement in the provided lm_eval scores.
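
The effect of the base-class change described in this review can be illustrated with a small stand-alone sketch. It is not the vLLM source: the three class names and the method name get_supported_kernel_block_sizes come from this PR, but the class bodies, the `Before`/`After` suffixes, and the returned block sizes are hypothetical placeholders chosen only to make the difference visible.

```python
class MLACommonBackend:
    """Stand-in for the common MLA base class (body is hypothetical)."""

    @staticmethod
    def get_supported_kernel_block_sizes() -> list[int]:
        # Placeholder value, not the real vLLM default.
        return [16, 32, 64]


class AiterMLABackend(MLACommonBackend):
    """Stand-in for the shared AITER backend (body is hypothetical)."""

    @staticmethod
    def get_supported_kernel_block_sizes() -> list[int]:
        # Placeholder value standing in for the AITER-specific setting.
        return [1]


class AiterTritonMLABackendBefore(MLACommonBackend):
    """Before the fix: silently picks up the common defaults."""


class AiterTritonMLABackendAfter(AiterMLABackend):
    """After the fix: picks up the AITER-specific configuration."""


print(AiterTritonMLABackendBefore.get_supported_kernel_block_sizes())
print(AiterTritonMLABackendAfter.get_supported_kernel_block_sizes())
```

In this toy model the subclass that derives from the common base reports the generic block sizes, while the one that derives from the AITER base reports the AITER-specific ones with no override of its own, which mirrors the mechanism the review attributes the fix to.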

tjtanaa

LGTM. Thank you for fixing it.

bugfix inherit from AiterMLABackend for AiterTritonMLABackend

368c321

Signed-off-by: vllmellm <[email protected]>

vllmellm requested a review from tjtanaa as a code owner January 6, 2026 16:03

mergify bot added the rocm and v1 labels Jan 6, 2026

gemini-code-assist bot reviewed Jan 6, 2026

tjtanaa approved these changes Jan 7, 2026

tjtanaa added the ready label Jan 7, 2026

tjtanaa enabled auto-merge (squash) January 7, 2026 03:08

tjtanaa merged commit 6409004 into vllm-project:main Jan 7, 2026
54 of 55 checks passed

yugong333 pushed a commit to yugong333/vllm that referenced this pull request Jan 9, 2026

[ROCm][AITER] bugfix accuracy regression in ROCM_AITER_TRITON_MLA bac…

b1e3ce6

…kend (vllm-project#31816) Signed-off-by: vllmellm <[email protected]>
