
Conversation


@hmellor hmellor commented Nov 12, 2025

In Transformers v5:

  • rope_scaling is now called rope_parameters
  • rope_theta now lives inside rope_parameters
  • rope_parameters may be nested for models which have different RoPE parameters for each layer type (e.g. Gemma & ModernBERT), as illustrated below
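Roughly what the shapes look like (values are made up, and the exact keys beyond rope_type and rope_theta vary by scaling method; the layer-type keys are those used by models like Gemma 3):

```python
from types import SimpleNamespace

config = SimpleNamespace()

# Transformers v4 style: theta and scaling are separate attributes.
config.rope_theta = 10000.0
config.rope_scaling = {"rope_type": "linear", "factor": 2.0}

# Transformers v5 style: everything lives in rope_parameters.
config.rope_parameters = {
    "rope_type": "linear",
    "factor": 2.0,
    "rope_theta": 10000.0,
}

# Nested per layer type, e.g. separate full- and sliding-attention settings.
config.rope_parameters = {
    "full_attention": {"rope_type": "default", "rope_theta": 1_000_000.0},
    "sliding_attention": {"rope_type": "default", "rope_theta": 10_000.0},
}
```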

This PR adds forward compatibility for the Transformers v5 RoPE config by:

  • Moving any found config.rope_scaling to config.rope_parameters
  • Moving any found config.rope_theta to config.rope_parameters.rope_theta
  • Running patch_rope_parameters on all nested configs, if present
  • Running patch_rope_parameters_dict on all nested RoPE parameters, if present
  • Globally renaming rope_scaling to rope_parameters
  • get_rope:
    • Removing base as an argument, because it no longer needs to be passed separately
    • If rope_parameters is None, defaulting to a RoPE base of 10000, which seems to be a universal default
  • Setting the theta explicitly via the set_default_rope_theta helper for any models that do not use this 10000 default (a sketch of the patching logic follows this list)
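A minimal sketch of what the patching amounts to. The helper name mirrors the one above, but this is an illustration under stated assumptions, not the actual vLLM implementation:

```python
from types import SimpleNamespace

DEFAULT_ROPE_THETA = 10000.0  # assumed universal default when nothing is set


def patch_rope_parameters(config) -> None:
    """Normalize a pre-v5 RoPE config in place (illustrative sketch only)."""
    # rope_scaling -> rope_parameters
    rope_scaling = getattr(config, "rope_scaling", None)
    if rope_scaling is not None:
        config.rope_parameters = rope_scaling
    # Fold a top-level rope_theta into rope_parameters.
    rope_theta = getattr(config, "rope_theta", None)
    if rope_theta is not None:
        if getattr(config, "rope_parameters", None) is None:
            config.rope_parameters = {"rope_type": "default"}
        config.rope_parameters.setdefault("rope_theta", rope_theta)


def get_rope_theta(config) -> float:
    """How get_rope can now recover the base without a separate argument."""
    rope_parameters = getattr(config, "rope_parameters", None)
    if rope_parameters is None:
        return DEFAULT_ROPE_THETA
    return rope_parameters.get("rope_theta", DEFAULT_ROPE_THETA)


# Usage on a v4-style config:
config = SimpleNamespace(rope_theta=500000.0,
                         rope_scaling={"rope_type": "linear", "factor": 2.0})
patch_rope_parameters(config)
assert get_rope_theta(config) == 500000.0
```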

Note: the errors triggered by disable_sliding_window when used with RoPE-scaled models have been removed. Removing disable_sliding_window entirely has been left as a follow-up task, since it is no longer relevant.


mergify bot commented Nov 12, 2025

Documentation preview: https://vllm--28542.org.readthedocs.build/en/28542/

@mergify mergify bot added documentation Improvements or additions to documentation llama Related to Llama models performance Performance-related issues qwen Related to Qwen models gpt-oss Related to GPT-OSS models speculative-decoding labels Nov 12, 2025
@mergify mergify bot added the ci/build label Nov 13, 2025
Signed-off-by: Harry Mellor <[email protected]>
@vllm-bot vllm-bot merged commit a8b7030 into vllm-project:main Nov 19, 2025
55 of 57 checks passed
@github-project-automation github-project-automation bot moved this from To Triage to Done in gpt-oss Issues & Enhancements Nov 19, 2025
@hmellor hmellor deleted the update-rope-config branch November 19, 2025 18:32
Victor49152 pushed a commit to Victor49152/vllm that referenced this pull request Nov 20, 2025
LuminolT pushed a commit to LuminolT/vllm that referenced this pull request Nov 21, 2025
@juliendenize juliendenize mentioned this pull request Nov 21, 2025
5 tasks
bigPYJ1151 pushed a commit that referenced this pull request Nov 25, 2025
bringlein pushed a commit to bringlein/vllm that referenced this pull request Nov 26, 2025
devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request Nov 29, 2025
kitaekatt pushed a commit to kitaekatt/vllm that referenced this pull request Dec 1, 2025
wangxiyuan added a commit to vllm-project/vllm-ascend that referenced this pull request Dec 2, 2025
1. fix vllm-project/vllm#28542
   The model structures we modified are:
     - Qwen2.5-VL (some patches still remain)
     - Qwen2-VL
     - Qwen2
     - DeepSeek series
     - Qwen-moe series
2. fix vllm-project/vllm#29121
   The output token type changed from a NumPy array to `list[list[int]]` (see the sketch after this list).
3. fix vllm-project/vllm#29262
   The `xformers` backend for multimodal has been deprecated.
4. fix vllm-project/vllm#29342
5. fix vllm-project/vllm#28579
6. fix vllm-project/vllm#28718
7. fix vllm-project/vllm#28665
8. fix vllm-project/vllm#26847
   vLLM introduced the `optimization-level`; some default config values changed, and the `--enforce-eager` param has been deprecated.
9. fix vllm-project/vllm#29223
   It now returns a tuple for the sampler.
10. fix vllm-project/vllm#29471
    We'll remove the related patch to avoid this kind of error.
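For item 2, a rough sketch of the kind of compatibility shim such a type change implies (names are illustrative, not the actual vllm-ascend code):

```python
import numpy as np


def normalize_output_tokens(output_tokens) -> list[list[int]]:
    """Accept the old NumPy-array output or the new list[list[int]]
    output and always return the new form (illustrative only)."""
    if isinstance(output_tokens, np.ndarray):
        return output_tokens.tolist()
    return output_tokens
```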

- vLLM version: v0.11.2

Signed-off-by: wangxiyuan <[email protected]>
Signed-off-by: wangli <[email protected]>
Signed-off-by: hfadzxy <[email protected]>
Co-authored-by: wangli <[email protected]>
Co-authored-by: hfadzxy <[email protected]>
ChenCangtao pushed a commit to ChenCangtao/vllm-ascend that referenced this pull request Dec 3, 2025
Mercykid-bash pushed a commit to Mercykid-bash/vllm-ascend that referenced this pull request Dec 4, 2025
charlotte12l pushed a commit to charlotte12l/vllm that referenced this pull request Dec 5, 2025
Meihan-chen pushed a commit to Meihan-chen/vllm-ascend that referenced this pull request Dec 5, 2025
Zhathw pushed a commit to Zhathw/vllm that referenced this pull request Dec 6, 2025

Labels

  • ci/build
  • deepseek (Related to DeepSeek models)
  • documentation (Improvements or additions to documentation)
  • gpt-oss (Related to GPT-OSS models)
  • llama (Related to Llama models)
  • performance (Performance-related issues)
  • qwen (Related to Qwen models)
  • ready (ONLY add when PR is ready to merge/full CI is needed)
  • speculative-decoding

Projects

Status: Done
