
Conversation

@ywang96 (Member) commented Nov 23, 2025

Purpose

Reopened from #28287

This PR completely removes the dependency on the xformers library and should only be merged after the v0.11.1 release. The rationale for removing xformers is that:

  1. xformers is used for multimodal attention (MHA), but we have alternative attention backends that can replace it (a minimal sketch follows this list).
  2. We have an xformers attention backend for the decoder LM, but it is no longer used for anything.
  3. Having another external dependency puts extra risk on our releases, a hard lesson we learned while upgrading to PyTorch 2.9.
  4. [Attention] FA2&FA3 support more head sizes, ViT support, make default backend #28763 added FA support for head sizes we previously did not support, which makes xformers no longer necessary.
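
As a rough illustration of point 1: multimodal MHA that previously routed through xformers' memory_efficient_attention can be served by PyTorch's built-in SDPA. The helper below is a minimal sketch with assumed shapes, not vLLM's actual implementation:

```python
import torch
import torch.nn.functional as F

def sdpa_mha(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    # xformers uses (batch, seq, heads, head_dim);
    # scaled_dot_product_attention expects (batch, heads, seq, head_dim).
    q, k, v = (t.transpose(1, 2) for t in (q, k, v))
    out = F.scaled_dot_product_attention(q, k, v)
    return out.transpose(1, 2)  # back to (batch, seq, heads, head_dim)
```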

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: Roger Wang <[email protected]>
@ywang96 ywang96 added the ready ONLY add when PR is ready to merge/full CI is needed label Nov 23, 2025
@mergify (bot) commented Nov 23, 2025

Documentation preview: https://vllm--29262.org.readthedocs.build/en/29262/

@mergify mergify bot added documentation Improvements or additions to documentation ci/build qwen Related to Qwen models nvidia labels Nov 23, 2025
@mergify mergify bot added the v1 label Nov 23, 2025
@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request effectively deprecates the xformers dependency. The changes are comprehensive, removing xformers from requirements, Dockerfiles, documentation, and tests. The core logic is updated to remove the xformers attention backend, with TORCH_SDPA being used as a fallback in some cases, such as in the keye model. The pixtral model, which still relies on xformers, has been updated with a comment to clarify that xformers is now an optional dependency for that specific model. The changes are clean and well-aligned with the goal of deprecating xformers.
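
The optional-dependency handling described for pixtral typically looks like a guarded import; a hypothetical sketch (the flag name is an assumption, not vLLM's actual code):

```python
try:
    # xformers stays optional: only pixtral needs it, per the review above.
    from xformers import ops as xops
    HAS_XFORMERS = True
except ImportError:
    HAS_XFORMERS = False
```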

@ywang96 (Member, Author) commented Nov 23, 2025

@codex review

@ywang96 (Member, Author) commented Nov 23, 2025

Turning on CI to make sure there's no regression.

@chatgpt-codex-connector (bot) commented
Codex Review: Didn't find any major issues. Keep it up!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Signed-off-by: Roger Wang <[email protected]>
Signed-off-by: Roger Wang <[email protected]>
@DarkLight1337 (Member) left a comment

LGTM if tests pass

@github-project-automation github-project-automation bot moved this to In review in NVIDIA Nov 23, 2025
@DarkLight1337 DarkLight1337 enabled auto-merge (squash) November 23, 2025 09:20
@jeejeelee (Collaborator) left a comment

Thank you for this great work

@yewentao256 (Member) left a comment

LGTM, thanks for the work!
The CI error seems related:

[2025-11-23T09:50:01Z] __________ ERROR collecting tests/kernels/attention/test_attention.py __________
[2025-11-23T09:50:01Z] In test_num_heads_not_divisble_by_num_kv_heads: function uses no argument 'device'
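
That pytest error typically means a parametrized argument no longer appears in the test's signature; a minimal, hypothetical reproduction:

```python
import pytest

# Parametrizing "device" while the test omits the argument makes pytest
# fail collection with: "function uses no argument 'device'".
@pytest.mark.parametrize("device", ["cuda:0"])
def test_num_heads_not_divisble_by_num_kv_heads():  # missing `device` param
    pass
```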

@mgoin (Member) commented Nov 23, 2025

Can you link the PR that expanded FA2 to support the vision encoder, to help explain why we can remove this now?

@DarkLight1337 DarkLight1337 merged commit 0ff7082 into vllm-project:main Nov 24, 2025
90 checks passed
@github-project-automation github-project-automation bot moved this from In review to Done in NVIDIA Nov 24, 2025
lpapavassiliou pushed a commit to lpapavassiliou/vllm that referenced this pull request Nov 24, 2025
RunkaiTao pushed a commit to RunkaiTao/vllm that referenced this pull request Nov 24, 2025
Signed-off-by: Roger Wang <[email protected]>
Signed-off-by: Runkai Tao <[email protected]>
bringlein pushed a commit to bringlein/vllm that referenced this pull request Nov 26, 2025
devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request Nov 29, 2025
kitaekatt pushed a commit to kitaekatt/vllm that referenced this pull request Dec 1, 2025
oscardev256 added a commit to oscardev256/vllm that referenced this pull request Dec 2, 2025
2. Remove deprecated xformers (vllm-project#29262)
3. Updated _get_prompt_updates()

Signed-off-by: Oscar Gonzalez <[email protected]>
wangxiyuan added a commit to vllm-project/vllm-ascend that referenced this pull request Dec 2, 2025
1. fix vllm-project/vllm#28542
   The model structure modifications involved are:
     - Qwen2.5-VL (some patches still exist)
     - Qwen2-VL
     - Qwen2
     - DeepSeek series
     - Qwen-moe series
2. fix vllm-project/vllm#29121
   The output token type changed from NumPy arrays to `list[list[int]]` (see the sketch after this commit message).
3. fix vllm-project/vllm#29262
   The `xformers` backend for multimodal attention has now been deprecated.
4. fix vllm-project/vllm#29342
5. fix vllm-project/vllm#28579
6. fix vllm-project/vllm#28718
7. fix vllm-project/vllm#28665
8. fix vllm-project/vllm#26847
   vLLM introduced the `optimization-level` option; some default configs have changed, and the `--enforce-eager` parameter has been deprecated.
9. fix vllm-project/vllm#29223; it returns a tuple for the sampler.
10. fix vllm-project/vllm#29471; we'll remove the related patch to avoid this kind of error.

Co-authored-by: hfadzxy <[email protected]>
Co-authored-by: wangli <[email protected]>


- vLLM version: v0.11.2

---------

Signed-off-by: wangxiyuan <[email protected]>
Signed-off-by: wangli <[email protected]>
Signed-off-by: hfadzxy <[email protected]>
Co-authored-by: wangli <[email protected]>
Co-authored-by: hfadzxy <[email protected]>
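
For the output-token type change in item 2, the downstream adaptation is small; a hypothetical sketch assuming tokens arrive as a 2-D NumPy array:

```python
import numpy as np

tokens = np.array([[1, 2, 3], [4, 5, 6]])  # old: NumPy array output

# New contract: list[list[int]]. tolist() converts and yields plain ints.
token_lists: list[list[int]] = tokens.tolist()
assert token_lists == [[1, 2, 3], [4, 5, 6]]
```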
ChenCangtao pushed a commit to ChenCangtao/vllm-ascend that referenced this pull request Dec 3, 2025