[Platform] Add supports_torch_inductor interface #1108
ZJY0516 merged 3 commits into vllm-project:main from
Conversation
Signed-off-by: gcanlin <[email protected]>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e2b032dbd4
```python
        dynamic=True,

from vllm_omni.platforms import current_omni_platform

if current_omni_platform.supports_torch_inductor():
```
Guard against UnspecifiedOmniPlatform missing method
When no platform plugin is detected, current_omni_platform is an UnspecifiedOmniPlatform (see resolve_current_omni_platform_cls_qualname in vllm_omni/platforms/__init__.py), and that class does not override supports_torch_inductor (the base method in vllm_omni/platforms/interface.py raises NotImplementedError). With the new call here, any CPU/unknown environment running with enforce_eager=False will now crash during initialization instead of just skipping compilation. Consider providing a default implementation on UnspecifiedOmniPlatform (e.g., False) or guarding the call before invoking the method.
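A minimal sketch of the suggested default is below; `OmniPlatform` as the base-class name and the exact method signature are assumptions, since the actual vllm_omni/platforms/interface.py is not shown here.

```python
# Sketch only: "OmniPlatform" is an assumed name for the base class in
# vllm_omni/platforms/interface.py; the real signature may differ (e.g. a classmethod).
from vllm_omni.platforms.interface import OmniPlatform


class UnspecifiedOmniPlatform(OmniPlatform):
    def supports_torch_inductor(self) -> bool:
        # No platform plugin detected: report Inductor as unsupported so the
        # caller skips compilation instead of hitting NotImplementedError.
        return False
```

Guarding the call site instead (e.g. wrapping the call in a try/except NotImplementedError that falls back to False) would achieve the same effect without touching the platform classes.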
Please also verify that this change does not affect CUDA platform behavior.
Signed-off-by: gcanlin <[email protected]>
CUDA: compiled by default.
Signed-off-by: gcanlin <[email protected]>
Purpose
More and more users on NPU are hitting torch.compile failures. Until we support a multi-hardware torch.compile backend, we need to disable it by default for a better user experience.
In the current code, we set `enforce_eager = False` by default:
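A rough sketch of how this gating could fit together is shown below; apart from `current_omni_platform`, `supports_torch_inductor`, and `enforce_eager`, the names and defaults here are illustrative assumptions rather than the actual vllm-omni code:

```python
# Illustrative sketch only; the real vllm-omni defaults and helper names may differ.
# Assumed: enforce_eager defaults to False, and each platform overrides
# supports_torch_inductor() (e.g. CUDA -> True, NPU -> False for now).
import logging

from vllm_omni.platforms import current_omni_platform

logger = logging.getLogger(__name__)


def should_compile(enforce_eager: bool = False) -> bool:
    """Return True only when eager mode is off and the platform supports Inductor."""
    if enforce_eager:
        return False
    if not current_omni_platform.supports_torch_inductor():
        logger.warning(
            "torch.compile (Inductor) is not supported on this platform; "
            "falling back to eager execution."
        )
        return False
    return True
```

Under this scheme, platforms that can compile (e.g. CUDA) override `supports_torch_inductor()` to return True and keep the current behaviour, while NPU returns False until a multi-hardware backend is supported, so users see only a WARNING instead of a hard failure.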
Test Plan

When users disable the `enforce_eager` parameter (this is the default behaviour), they no longer get the error on NPU; the log emits a WARNING instead.

Test Result
Essential Elements of an Effective PR Description Checklist
- supported_models.md and examples for a new model.