[Ascend]: Fixed the issue where OOT Platform vllm-ascend could not enable SP in Eager mode #28935

leo-pony · 2025-11-18T11:30:45Z

Before taking the length of splitting_ops, first check that it is not None; otherwise, in piecewise mode with splitting_ops unset, the following error occurs:

        if self.compilation_config.pass_config.enable_sequence_parallelism:
            # With pipeline parallelism or dynamo partitioning,
            # native rms norm tracing errors due to incorrect residual shape.
            # Use custom rms norm to unblock. In the future,
            # the pass will operate on higher-level IR to avoid the issue.
            # TODO: https://github.com/vllm-project/vllm/issues/27894
            is_fullgraph = (
                self.compilation_config.use_inductor_graph_partition
>               or len(self.compilation_config.splitting_ops) == 0
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
            )
E           TypeError: object of type 'NoneType' has no len()
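
The failure can be reproduced in isolation. The sketch below is a minimal stand-in, not vLLM's code: `FakeCompilationConfig`, `is_fullgraph_unsafe`, and `is_fullgraph_safe` are hypothetical names that mirror only the two fields used in the expression above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeCompilationConfig:
    """Hypothetical stand-in holding only the two fields used in the check."""

    use_inductor_graph_partition: bool = False
    splitting_ops: Optional[list] = None


def is_fullgraph_unsafe(cfg: FakeCompilationConfig) -> bool:
    # Mirrors the failing expression: len() is called even when the value is None.
    return cfg.use_inductor_graph_partition or len(cfg.splitting_ops) == 0


def is_fullgraph_safe(cfg: FakeCompilationConfig) -> bool:
    # Comparing against [] never calls len(), so None simply yields False.
    return cfg.use_inductor_graph_partition or cfg.splitting_ops == []
```

With the default `splitting_ops=None`, the first function raises the `TypeError` shown in the traceback, while the second returns `False`.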

The pull request for this issue is here: #27126

For the value of splitting_ops: if None, it defaults to attention ops for piecewise cudagraphs; if it is an empty list [], no ops are excluded (suitable for full cudagraphs). For details, see the comments in vllm/config/compilation.py:

splitting_ops: list[str] | None = None
"""A list of ops to exclude from cudagraphs, used in piecewise compilation.

The behavior depends on use_inductor_graph_partition:

- When use_inductor_graph_partition=False (default):
    These ops are used for Dynamo FX-level graph splitting. The graph is
    split at these ops before Inductor compilation, creating separate
    subgraphs for cudagraph capture.

- When use_inductor_graph_partition=True:
    These ops are used to register Inductor partition rules. The graph
    partitioning happens at Inductor codegen time after all passes and
    fusions are finished, allowing compilation and custom passes to operate
    on the full graph while still excluding these ops from cudagraphs.

If None, defaults to attention ops for piecewise cudagraphs.
If empty list [], no ops are excluded (suitable for full cudagraphs).
"""

gemini-code-assist

Code Review

This pull request correctly fixes a potential TypeError when splitting_ops is None. The added check prevents a crash. I've suggested a small improvement to make the code more concise and idiomatic, which enhances readability and maintainability.

vllm/config/vllm.py

leo-pony · 2025-11-18T12:15:40Z

@angelayi Could you please review this? Is this change appropriate since splitting_ops might be None?

leo-pony · 2025-11-18T12:23:20Z

Note: self.compilation_config.splitting_ops == [] has the same effect as (self.compilation_config.splitting_ops is not None and len(self.compilation_config.splitting_ops) == 0).
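
A quick way to check this equivalence for the values the field is annotated to hold (`list[str] | None`) is sketched below. The helper names and the `"some_op"` entry are placeholders for illustration only.

```python
def check_eq(ops):
    # The concise form: None == [] is simply False, so no len() call is needed.
    return ops == []


def check_explicit(ops):
    # The explicit form: guard against None before taking the length.
    return ops is not None and len(ops) == 0


# Both forms agree on None, an empty list, and a non-empty list.
for ops in (None, [], ["some_op"]):
    assert check_eq(ops) == check_explicit(ops)
```

The two forms only diverge for empty non-list sequences such as `()`, where `() == []` is `False` but the length check passes; that case falls outside the field's annotated type.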

angelayi

thanks!

yewentao256

Could you take a look at the root cause, namely why there is a None in this context, and fix that?

ProExpertProg · 2025-11-18T22:56:18Z

vllm/config/vllm.py

            is_fullgraph = (
                self.compilation_config.use_inductor_graph_partition
-                or len(self.compilation_config.splitting_ops) == 0
+                or self.compilation_config.splitting_ops == []


Right above this line, we call self.compilation_config.set_splitting_ops_for_v1() which sets the splitting ops to a non-None value. What case are you running that has the SP pass enabled but compilation mode != VLLM_COMPILE? That's not a supported case.

I do agree we should handle invalid configuration gracefully. If you want to fix this, can you move the call to set_splitting_ops_for_v1 outside of the if, and just set splitting_ops to an empty list if mode != VLLM_COMPILE? Then, if enable_sequence_parallelism is True but the mode is wrong, add a warning.
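
One way to read this suggestion is sketched below. It is an illustration under stated assumptions, not the code that was merged: `Mode`, `normalize_splitting_ops`, and the `default_ops` argument are hypothetical stand-ins for `CompilationMode`, the logic around `set_splitting_ops_for_v1`, and the default attention ops.

```python
import logging
from enum import Enum
from typing import Optional

logger = logging.getLogger(__name__)


class Mode(Enum):
    """Hypothetical stand-in for the compilation mode enum."""

    NONE = 0
    VLLM_COMPILE = 1


def normalize_splitting_ops(
    mode: Mode,
    splitting_ops: Optional[list],
    enable_sp: bool,
    default_ops: list,
) -> list:
    """Always return a list so later len() or == [] checks cannot see None."""
    if mode is Mode.VLLM_COMPILE:
        # Stand-in for the existing defaulting step: fill in defaults when unset.
        return list(default_ops) if splitting_ops is None else splitting_ops
    if enable_sp:
        # Assumption: the SP pass only takes effect under VLLM_COMPILE, so warn.
        logger.warning(
            "Sequence parallelism pass is enabled but compilation mode is %s",
            mode.name,
        )
    # Outside VLLM_COMPILE nothing is split, so use an empty list.
    return []
```

Normalizing once up front keeps every later check free of `None` handling, at the cost of no longer distinguishing "unset" from "explicitly empty" after this point.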

wangxiyuan · 2025-11-19

If SP is enabled with eager mode, the compilation mode is NONE:

vllm/vllm/config/vllm.py

Lines 440 to 444 in da94c7c

        if self.compilation_config.mode is None:
            if self.model_config is not None and not self.model_config.enforce_eager:
                self.compilation_config.mode = CompilationMode.VLLM_COMPILE
            else:
                self.compilation_config.mode = CompilationMode.NONE

I know that SP is meant to work with graph mode in vLLM. But on some platforms, such as vllm-ascend, it does not currently work with graph passes, so SP is implemented another way there so that it works in eager mode. The e2e test is:
https://github.com/vllm-project/vllm-ascend/blob/main/tests/e2e/multicard/test_offline_inference_distributed.py#L169

Future plan:
We're working on the custom graph pass feature (#28623). Once it's done, we'll implement SP the same way as vLLM.

ProExpertProg · 2025-11-19

Okay, thanks for the clarification. Can you update set_splitting_ops_for_v1 so that splitting ops aren't empty during enforce eager?

22quinn · 2025-11-20

Facing the same issue. @wangxiyuan Gentle ping if there's any update.

wangxiyuan · 2025-11-21

@leo-pony please update the PR ASAP, thanks.

… Eager mode Signed-off-by: leo-pony <[email protected]>

…able SP in Eager mode (vllm-project#28935) Signed-off-by: leo-pony <[email protected]> Signed-off-by: Hashem Hashemi <[email protected]>

…able SP in Eager mode (vllm-project#28935) Signed-off-by: leo-pony <[email protected]> Signed-off-by: Bofeng BF1 Xue <[email protected]>

…able SP in Eager mode (vllm-project#28935) Signed-off-by: leo-pony <[email protected]> Signed-off-by: Xingyu Liu <[email protected]>

leo-pony requested review from ProExpertProg, WoosukKwon, hmellor, houseroad, mgoin, robertgshaw2-redhat, tlrmchlsmth, yewentao256 and youkaichao as code owners November 18, 2025 11:30

gemini-code-assist bot reviewed Nov 18, 2025

leo-pony changed the title ~~[BugFix][NPU]: before get lens of splitting_ops should first check it is not None~~ [BugFix][NPU]: TypeError: object of type 'NoneType' has no len() for len(self.compilation_config.splitting_ops) Nov 18, 2025

leo-pony changed the title ~~[BugFix][NPU]: TypeError: object of type 'NoneType' has no len() for len(self.compilation_config.splitting_ops)~~ [BugFix]: TypeError: object of type 'NoneType' has no len() for len(self.compilation_config.splitting_ops) Nov 18, 2025

angelayi approved these changes Nov 18, 2025

yewentao256 reviewed Nov 18, 2025

ProExpertProg requested changes Nov 18, 2025

Fixed the issue where OOT Platform vllm-ascend could not enable SP in…

1a5a3e7

… Eager mode Signed-off-by: leo-pony <[email protected]>

leo-pony force-pushed the origin branch from c554d80 to 1a5a3e7 Compare November 27, 2025 14:54

leo-pony changed the title ~~[BugFix]: TypeError: object of type 'NoneType' has no len() for len(self.compilation_config.splitting_ops)~~ [Ascend]: Fixed the issue where OOT Platform vllm-ascend could not enable SP in Eager mode Nov 27, 2025

ProExpertProg approved these changes Nov 27, 2025

ProExpertProg added the ready label Nov 27, 2025

ProExpertProg merged commit eaf8148 into vllm-project:main Dec 1, 2025
48 checks passed
