Commit 7db80a0
Sync midstream to upstream v0.8.5.post1 (opendatahub-io#218)
Sync the midstream NM fork to the upstream tag
[v0.8.5.post1](https://github.com/vllm-project/vllm/tree/v0.8.5.post1), plus the following cherry-picks:
- vllm-project@be633fb, needed for benchmarks
- [CP](neuralmagic/nm-vllm-ent@1fe447d) for the compressed-tensors bump
- [CP](vllm-project#17677) for LoRA on AMD
- [CP](vllm-project#17315) for Llama 4 with pure dense layers
```
commit 31c73ba (HEAD -> upstream-v0.8.5, nm-fork/upstream-v0.8.5)
Author: Chauncey <[email protected]>
Date: Wed Apr 30 15:11:04 2025 +0800
[Bugfix] Fix AttributeError: 'State' object has no attribute 'engine_client' (vllm-project#17434)
Signed-off-by: chaunceyjiang <[email protected]>
commit f8db0bd
Author: Lucas Wilkinson <[email protected]>
Date: Fri May 2 14:01:38 2025 -0400
[BugFix][Attention] Fix sliding window attention in V1 giving incorrect results (vllm-project#17574)
Signed-off-by: Lucas Wilkinson <[email protected]>
commit e335c34
Author: Robert Shaw <[email protected]>
Date: Fri May 2 04:07:03 2025 -0400
[BugFix] Fix Memory Leak (vllm-project#17567)
Signed-off-by: [email protected] <[email protected]>
commit cc463fe
Merge: 1e358ff ba41cc9
Author: Selbi Nuryyeva <[email protected]>
Date: Tue Apr 29 12:34:57 2025 -0400
Merge branch 'tag-upstream-v0.8.5' into upstream-v0.8.5
commit ba41cc9 (tag: v0.8.5, tag-upstream-v0.8.5)
Author: Michael Goin <[email protected]>
Date: Mon Apr 28 16:20:24 2025 -0600
[Model] Add tuned triton fused_moe configs for Qwen3Moe (vllm-project#17328)
Signed-off-by: mgoin <[email protected]>
commit dcbac4c
Author: Simon Mo <[email protected]>
Date: Mon Apr 28 14:12:01 2025 -0700
[Model] Qwen3 Dense FP8 Compat Fixes (vllm-project#17318)
Signed-off-by: simon-mo <[email protected]>
[...]
```
Commands
```
git fetch upstream
git checkout -b upstream-v0.8.5
git merge upstream/v0.8.5
git cherry-pick be633fb
```
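The command sequence above can be sketched as a small dry-run helper for reuse on a later sync. This is a minimal sketch, not part of the original workflow: the function name `print_sync_plan` is hypothetical, and it only prints the commands rather than running them. The remote name `upstream` and the `upstream-<tag>` branch naming are taken from the commands listed above. Only `be633fb` is passed here, because the other cherry-picks are identified in the message by PR or cross-repo references rather than by a commit hash in this repository.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not execute) the sync commands for a tag.
# Usage: print_sync_plan TAG [COMMIT...]
print_sync_plan() {
    tag="$1"
    shift
    printf 'git fetch upstream\n'
    printf 'git checkout -b upstream-%s\n' "$tag"
    printf 'git merge upstream/%s\n' "$tag"
    # One cherry-pick per extra commit, in the order given.
    for commit in "$@"; do
        printf 'git cherry-pick %s\n' "$commit"
    done
}

# Reproduces the four commands listed above.
print_sync_plan v0.8.5 be633fb
```

Printing the plan first makes it easy to review the branch name and cherry-pick order before running anything against the fork.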
TEST PLAN
accept sync:
https://github.com/neuralmagic/nm-cicd/actions/runs/14841223552
related PR in cicd: neuralmagic/nm-cicd#99
release workflow:
https://github.com/neuralmagic/nm-cicd/actions/runs/14845693864
694 files changed: +39423 -13292 lines
- .buildkite
- lm-eval-harness
- configs
- scripts/hardware_ci
- .github
- ISSUE_TEMPLATE
- benchmarks
- kernels
- cmake/external_projects
- csrc
- attention
- mla
- moe
- marlin_moe_wna16
- quantization
- cutlass_w8a8
- moe
- fp4
- gptq_marlin
- rocm
- docker
- docs/source
- assets/deployment
- contributing/model
- deployment
- frameworks
- integrations
- design
- v1
- features
- quantization
- getting_started
- installation
- ai_accelerator
- cpu
- gpu
- models
- extensions
- serving
- examples
- lmcache
- disagg_prefill_lmcache_v1
- configs
- offline_inference
- basic
- disaggregated-prefill-v1
- qwen2_5_omni
- online_serving
- requirements
- tests
- benchmarks
- compile
- core/block/e2e
- distributed
- engine
- entrypoints
- llm
- openai
- correctness
- kernels
- attention
- core
- mamba
- moe
- quantization
- lora
- model_executor
- models
- decoder_only
- audio_language
- language
- vision_language
- vlm_utils
- embedding
- language
- encoder_decoder/vision_language
- multimodal/processing
- quantization
- samplers
- spec_decode
- tokenization
- tool_use
- v1
- core
- e2e
- engine
- entrypoints
- llm
- shutdown
- spec_decode
- structured_output
- tpu
- worker
- worker
- vllm
- assets
- attention
- backends
- mla
- ops
- utils
- benchmarks
- compilation
- core
- distributed
- device_communicators
- kv_transfer
- kv_connector
- v1
- kv_lookup_buffer
- kv_pipe
- engine
- multiprocessing
- output_processor
- entrypoints
- cli
- benchmark
- openai
- tool_parsers
- executor
- inputs
- lora
- ops/triton_ops
- model_executor
- guided_decoding
- layers
- fused_moe
- configs
- mamba/ops
- quantization
- compressed_tensors
- schemes
- kernels/mixed_precision
- quark
- utils
- model_loader
- models
- multimodal
- platforms
- spec_decode
- transformers_utils
- configs
- tokenizer_group
- tokenizers
- triton_utils
- usage
- v1
- attention/backends
- mla
- core
- sched
- engine
- executor
- metrics
- sample
- ops
- tpu
- spec_decode
- structured_output
- worker
- worker