[Bugfix] fix custom op GmmSwigluQuantWeightNzTensorList #4593

ChenxiQ · 2025-12-01T06:19:28Z

What this PR does / why we need it?

1. Fixes the environment path used to locate custom op shared libraries.
2. Uses empty tensor initialization for op outputs instead of zero-initialization for better efficiency.

Does this PR introduce any user-facing change?

How was this patch tested?

vLLM version: v0.11.2
vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2

Signed-off-by: QianChenxi <[email protected]>

gemini-code-assist

Code Review

This pull request includes a bug fix and a performance optimization. In vllm_ascend/platform.py, a duplicated path segment in the construction of CUSTOM_OPP_PATH is corrected, which is a necessary fix. In csrc/torch_binding.cpp, at::zeros is replaced with at::empty for output tensor allocation, which is a good performance improvement. However, I've identified a related critical issue that should be addressed.

gemini-code-assist · 2025-12-01T06:21:40Z

csrc/torch_binding.cpp

+    at::Tensor output = at::empty({m, n/2}, x.options().dtype(at::kChar));
+    at::Tensor output_scale = at::empty({m}, x.options().dtype(at::kFloat));
+    at::Tensor output_offset = at::empty({m}, x.options().dtype(at::kFloat));


The size of these output tensors depends on n, which is calculated on line 567 from the weight TensorList (weight[0].sizes()[1]). This is unsafe because if the weight TensorList is empty, accessing weight[0] will cause a crash. It's crucial to add a check to ensure weight is not empty before its elements are accessed.

For example, you could add the following check before line 567:

TORCH_CHECK(!weight.empty(), "weight tensor list cannot be empty");
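
To show where the guard sits relative to the shape computation, here is a minimal Python sketch of the same logic. It is an illustration only, not the code in csrc/torch_binding.cpp: the function name `output_shapes` is hypothetical, and taking `m` from the first dimension of `x` is an assumption, since the diff above does not show how `m` is derived.

```python
from typing import List, Sequence, Tuple


def output_shapes(
    x_shape: Sequence[int],
    weight_shapes: List[Sequence[int]],
) -> Tuple[Tuple[int, int], Tuple[int], Tuple[int]]:
    """Return the shapes of (output, output_scale, output_offset)."""
    # Guard first: weight_shapes[0] must not be touched on an empty list.
    if not weight_shapes:
        raise ValueError("weight tensor list cannot be empty")

    m = x_shape[0]           # assumption: m is the first dimension of x
    n = weight_shapes[0][1]  # mirrors weight[0].sizes()[1]
    return (m, n // 2), (m,), (m,)
```

The check runs before any element of the list is read, which is the same ordering the suggested `TORCH_CHECK` would enforce in the C++ binding.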

github-actions · 2025-12-01T06:30:22Z

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

A PR should do only one thing; smaller PRs enable faster reviews.
Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
Write the commit message by filling in the PR description to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to the Contributing and Testing guides.

hukongyi · 2025-12-03T03:45:28Z

vllm_ascend/utils.py

    global _CUSTOM_OP_ENABLED
+
+    # set custom ops path
+    CUR_DIR = os.path.dirname(os.path.realpath(__file__))


@wangxiyuan This line causes the test tests/e2e/multicard/test_offline_inference_distributed.py::test_models_distributed_QwQ to fail.
os.path.realpath involves system calls that are not supported by torch.compile (Dynamo) during graph capture. This triggers a runtime crash when graph mode is enabled.
Maybe move the CUR_DIR calculation to the module level (global scope) to avoid this trace error and unnecessary re-calculation.
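
A minimal sketch of that suggestion, assuming the path only needs to be resolved once per process. The helper name `custom_ops_dir` and the `"_custom_ops"` directory name are placeholders for illustration, not names taken from vllm_ascend/utils.py.

```python
import os

# Resolved once at import time, so os.path.realpath is never called
# from inside a function that torch.compile (Dynamo) might trace.
CUR_DIR = os.path.dirname(os.path.realpath(__file__))

_CUSTOM_OP_ENABLED = None


def custom_ops_dir() -> str:
    """Hypothetical helper: build the custom ops path from the cached directory."""
    # Only string joining happens here; no filesystem call at call time.
    return os.path.join(CUR_DIR, "_custom_ops")
```

With this layout, a function captured in graph mode reads the module-level constant instead of calling `os.path.realpath` itself, and the path is not recomputed on every call.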

Yes, I found it. Thanks for the reminder.

@hukongyi #4675 I think this one should fix the CI

custom op output init fix

fde66f3

Signed-off-by: QianChenxi <[email protected]>

gemini-code-assist bot reviewed Dec 1, 2025

github-actions bot added the module:core label Dec 1, 2025

ChenxiQ force-pushed the br_custom_op_fix branch from a34ddcf to 71cf0c4 Compare December 1, 2025 06:30

ChenxiQ changed the title from "[Bugfix] custom op fix" to "[Bugfix] fix custom op GmmSwigluQuantWeightNzTensorList" Dec 1, 2025

ChenxiQ force-pushed the br_custom_op_fix branch 3 times, most recently from b1c3922 to 29191fb Compare December 2, 2025 13:04

custom op env fix

5486458

Signed-off-by: QianChenxi <[email protected]>

ChenxiQ force-pushed the br_custom_op_fix branch from 29191fb to 5486458 Compare December 2, 2025 13:21

wangxiyuan approved these changes Dec 2, 2025

wangxiyuan merged commit 4588cda into vllm-project:main Dec 2, 2025
21 checks passed

hukongyi reviewed Dec 3, 2025

wangxiyuan added a commit to wangxiyuan/vllm-ascend that referenced this pull request Dec 3, 2025

Revert "[Bugfix] fix custom op GmmSwigluQuantWeightNzTensorList (vllm-project#4593)"

823b983

This reverts commit 4588cda.
