[Platform] Add supports_torch_inductor interface #1108

Merged
ZJY0516 merged 3 commits into vllm-project:main from gcanlin:eager-default
Jan 30, 2026

Conversation

gcanlin (Contributor) commented Jan 30, 2026

Purpose

More and more NPU users are hitting torch.compile failures. Until we support a multi-hardware torch.compile backend, we need to disable it by default on such platforms for a better user experience.

In the current code, we set enforce_eager = False by default:

class OmniDiffusionConfig:
    ...
    enforce_eager: bool = False
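
For reference, a minimal sketch of the platform hook this PR adds: per the review comment below, the base method in vllm_omni/platforms/interface.py raises NotImplementedError, while the concrete class names used here are assumptions for illustration.

class OmniPlatform:
    def supports_torch_inductor(self) -> bool:
        # Base interface: concrete platforms must override this
        # (the default raises, per vllm_omni/platforms/interface.py).
        raise NotImplementedError

class CudaOmniPlatform(OmniPlatform):  # hypothetical class name
    def supports_torch_inductor(self) -> bool:
        return True  # inductor is the default torch.compile backend on CUDA

class NpuOmniPlatform(OmniPlatform):  # hypothetical class name
    def supports_torch_inductor(self) -> bool:
        return False  # torch.compile/inductor is not supported here yet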

Test Plan

With the enforce_eager parameter disabled (the default behaviour), users no longer hit the error on NPU; instead, the log emits a WARNING and compilation is skipped.

python text_to_image.py \
  --model Tongyi-MAI/Z-Image-Turbo \
  --prompt "a cup of coffee on the table" \
  --seed 42 \
  --cfg_scale 4.0 \
  --num_images_per_prompt 1 \
  --num_inference_steps 50 \
  --height 1024 \
  --width 1024 \
  --output outputs/coffee.png

Test Result

[Stage-0] INFO 01-30 08:38:47 [diffusers_loader.py:227] Loading weights took 6.47 seconds
[Stage-0] INFO 01-30 08:38:47 [diffusion_model_runner.py:100] Model loading took 53.7445 GiB and 13.569574 seconds
[Stage-0] INFO 01-30 08:38:47 [diffusion_model_runner.py:105] Model runner: Model loaded successfully.
[Stage-0] WARNING 01-30 08:38:47 [diffusion_model_runner.py:134] Model runner: Platform npu does not support torch inductor, skipping torch.compile.
[Stage-0] INFO 01-30 08:38:47 [cache_dit_backend.py:379] Enabling cache-dit on transformer: Fn=1, Bn=0, W=4,
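
The skipped-compile warning above corresponds to a guard along these lines (a sketch only: maybe_compile and the logger wiring are assumptions, while supports_torch_inductor() and dynamic=True come from this PR's diff):

import logging

import torch

from vllm_omni.platforms import current_omni_platform

logger = logging.getLogger(__name__)


def maybe_compile(model: torch.nn.Module, enforce_eager: bool) -> torch.nn.Module:
    # Sketch of the guard in the diffusion model runner; exact code may differ.
    if enforce_eager:
        return model  # user explicitly asked for eager execution
    if not current_omni_platform.supports_torch_inductor():
        logger.warning("Platform does not support torch inductor, skipping torch.compile.")
        return model
    return torch.compile(model, dynamic=True)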

gcanlin (Contributor, Author) commented Jan 30, 2026

cc @ZJY0516 @faaany @tjtanaa

chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e2b032dbd4

Code referenced by this review (hunks from the new platform check around the torch.compile call):

    dynamic=True,

from vllm_omni.platforms import current_omni_platform

if current_omni_platform.supports_torch_inductor():

P1: Guard against UnspecifiedOmniPlatform missing method

When no platform plugin is detected, current_omni_platform is an UnspecifiedOmniPlatform (see resolve_current_omni_platform_cls_qualname in vllm_omni/platforms/__init__.py), and that class does not override supports_torch_inductor (the base method in vllm_omni/platforms/interface.py raises NotImplementedError). With the new call here, any CPU/unknown environment running with enforce_eager=False will now crash during initialization instead of just skipping compilation. Consider providing a default implementation on UnspecifiedOmniPlatform (e.g., False) or guarding the call before invoking the method.
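
A minimal sketch of the suggested fix, assuming the fallback class subclasses the base interface from the sketch earlier in this page (the override and the call-site guard below are hypothetical):

from vllm_omni.platforms import current_omni_platform

class UnspecifiedOmniPlatform(OmniPlatform):  # base class name as in the sketch above
    def supports_torch_inductor(self) -> bool:
        # No platform plugin detected: skip torch.compile rather than crash.
        return False

# Or guard at the call site instead:
try:
    use_inductor = current_omni_platform.supports_torch_inductor()
except NotImplementedError:
    use_inductor = False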

david6666666 added this to the v0.14.0 milestone Jan 30, 2026
ZJY0516 added the ready label (label to trigger buildkite CI) Jan 30, 2026
ZJY0516 (Collaborator) commented Jan 30, 2026

Please also verify that this change does not affect CUDA platform behavior.

Signed-off-by: gcanlin <[email protected]>
gcanlin (Contributor, Author) commented Jan 30, 2026

> Please also verify that this change does not affect CUDA platform behavior.

CUDA: compiled by default.

python text_to_image.py \
  --model Tongyi-MAI/Z-Image-Turbo \
  --prompt "a cup of coffee on the table" \
  --seed 42 \
  --cfg_scale 4.0 \
  --num_images_per_prompt 1 \
  --num_inference_steps 50 \
  --height 1024 \
  --width 1024 \
  --output outputs/coffee.png
[Stage-0] INFO 01-30 13:40:23 [diffusers_loader.py:227] Loading weights took 67.10 seconds
[Stage-0] INFO 01-30 13:40:24 [diffusion_model_runner.py:101] Model loading took 19.1516 GiB and 91.199319 seconds
[Stage-0] INFO 01-30 13:40:24 [diffusion_model_runner.py:106] Model runner: Model loaded successfully.
[Stage-0] INFO 01-30 13:40:24 [diffusion_model_runner.py:129] Model runner: Model compiled with torch.compile.
[Stage-0] INFO 01-30 13:40:24 [diffusion_model_runner.py:144] Model runner: Initialization complete.
[Stage-0] INFO 01-30 13:40:24 [manager.py:90] Initializing DiffusionLoRAManager: device=cuda:0, dtype=torch.bfloat16, max_cached_adapters=1, static_lora_path=None

ZJY0516 enabled auto-merge (squash) January 30, 2026 13:48
faaany (Contributor) commented Jan 30, 2026

hsliuustc0106 added the Hardware Plugin label (support different hardware beyond cuda) Jan 30, 2026
ZJY0516 merged commit 4eeea68 into vllm-project:main Jan 30, 2026
7 checks passed
dongbo910220 pushed a commit to dongbo910220/vllm-omni that referenced this pull request Feb 1, 2026

Labels

Hardware Plugin (support different hardware beyond cuda), ready (label to trigger buildkite CI)

5 participants