
Commit c9db39b

Authored by kzawora-intel, wallashss, jikunshang, gshtras, and youkaichao
Rebase 2025.01.21 (#714)
- **[Bugfix] Fix score api for missing max_model_len validation (vllm-project#12119)**
- **[Bugfix] Mistral tokenizer encode accept list of str (vllm-project#12149)**
- **[AMD][FP8] Using MI300 FP8 format on ROCm for block_quant (vllm-project#12134)**
- **[torch.compile] disable logging when cache is disabled (vllm-project#12043)**
- **[misc] fix cross-node TP (vllm-project#12166)**
- **[AMD][CI/Build][Bugfix] use pytorch stale wheel (vllm-project#12172)**
- **[core] further polish memory profiling (vllm-project#12126)**
- **[Docs] Fix broken link in SECURITY.md (vllm-project#12175)**
- **[Model] Port deepseek-vl2 processor, remove dependency (vllm-project#12169)**
- **[core] clean up executor class hierarchy between v1 and v0 (vllm-project#12171)**
- **[Misc] Support register quantization method out-of-tree (vllm-project#11969)**
- **[V1] Collect env var for usage stats (vllm-project#12115)**
- **[BUGFIX] Move scores to float32 in case of running xgrammar on cpu (vllm-project#12152)**
- **[Bugfix] Fix multi-modal processors for transformers 4.48 (vllm-project#12187)**
- **[torch.compile] store inductor compiled Python file (vllm-project#12182)**
- **benchmark_serving support --served-model-name param (vllm-project#12109)**
- **[Misc] Add BNB support to GLM4-V model (vllm-project#12184)**
- **[V1] Add V1 support of Qwen2-VL (vllm-project#12128)**
- **[Model] Support for fairseq2 Llama (vllm-project#11442)**
- **[Bugfix] Fix num_heads value for simple connector when tp enabled (vllm-project#12074)**
- **[torch.compile] fix sym_tensor_indices (vllm-project#12191)**
- **Move linting to `pre-commit` (vllm-project#11975)**
- **[DOC] Fix typo in docstring and assert message (vllm-project#12194)**
- **[DOC] Add missing docstring in LLMEngine.add_request() (vllm-project#12195)**
- **[Bugfix] Fix incorrect types in LayerwiseProfileResults (vllm-project#12196)**
- **[Model] Add Qwen2 PRM model support (vllm-project#12202)**
- **[Core] Interface for accessing model from `VllmRunner` (vllm-project#10353)**
- **[misc] add placeholder format.sh (vllm-project#12206)**
- **[CI/Build] Remove dummy CI steps (vllm-project#12208)**
- **[CI/Build] Make pre-commit faster (vllm-project#12212)**
- **[Model] Upgrade Aria to transformers 4.48 (vllm-project#12203)**
- **[misc] print a message to suggest how to bypass commit hooks (vllm-project#12217)**
- **[core][bugfix] configure env var during import vllm (vllm-project#12209)**
- **[V1] Remove `_get_cache_block_size` (vllm-project#12214)**
- **[Misc] Pass `attention` to impl backend (vllm-project#12218)**
- **[Bugfix] Fix `HfExampleModels.find_hf_info` (vllm-project#12223)**
- **[CI] Pass local python version explicitly to pre-commit mypy.sh (vllm-project#12224)**
- **[Misc] Update CODEOWNERS (vllm-project#12229)**
- **fix: update platform detection for M-series arm based MacBook processors (vllm-project#12227)**
- **[misc] add cuda runtime version to usage data (vllm-project#12190)**
- **[bugfix] catch xgrammar unsupported array constraints (vllm-project#12210)**
- **[Kernel] optimize moe_align_block_size for cuda graph and large num_experts (e.g. DeepSeek-V3) (vllm-project#12222)**
- **Add quantization and guided decoding CODEOWNERS (vllm-project#12228)**
- **[AMD][Build] Porting dockerfiles from the ROCm/vllm fork (vllm-project#11777)**
- **[BugFix] Fix GGUF tp>1 when vocab_size is not divisible by 64 (vllm-project#12230)**
- **[ci/build] disable failed and flaky tests (vllm-project#12240)**
- **[Misc] Rename `MultiModalInputsV2 -> MultiModalInputs` (vllm-project#12244)**
- **[Misc] Add BNB quantization for PaliGemmaForConditionalGeneration (vllm-project#12237)**
- **[Misc] Remove redundant TypeVar from base model (vllm-project#12248)**
- **[Bugfix] Fix mm_limits access for merged multi-modal processor (vllm-project#12252)**

---------

Signed-off-by: Wallas Santos <[email protected]>
Signed-off-by: Kunshang Ji <[email protected]>
Signed-off-by: Gregory Shtrasberg <[email protected]>
Signed-off-by: youkaichao <[email protected]>
Signed-off-by: hongxyan <[email protected]>
Signed-off-by: Russell Bryant <[email protected]>
Signed-off-by: Isotr0py <[email protected]>
Signed-off-by: Michal Adamczyk <[email protected]>
Signed-off-by: zibai <[email protected]>
Signed-off-by: Roger Wang <[email protected]>
Signed-off-by: DarkLight1337 <[email protected]>
Signed-off-by: Martin Gleize <[email protected]>
Signed-off-by: Shangming Cai <[email protected]>
Signed-off-by: Harry Mellor <[email protected]>
Signed-off-by: Yuan Tang <[email protected]>
Signed-off-by: Chen Zhang <[email protected]>
Signed-off-by: wangxiyuan <[email protected]>
Signed-off-by: isikhi <[email protected]>
Signed-off-by: Jason Cheng <[email protected]>
Signed-off-by: Jinzhen Lin <[email protected]>
Signed-off-by: mgoin <[email protected]>
Signed-off-by: NickLucche <[email protected]>
Signed-off-by: Jee Jee Li <[email protected]>
Signed-off-by: Konrad Zawora <[email protected]>
Co-authored-by: Wallas Henrique <[email protected]>
Co-authored-by: Kunshang Ji <[email protected]>
Co-authored-by: Gregory Shtrasberg <[email protected]>
Co-authored-by: youkaichao <[email protected]>
Co-authored-by: Hongxia Yang <[email protected]>
Co-authored-by: Russell Bryant <[email protected]>
Co-authored-by: Isotr0py <[email protected]>
Co-authored-by: yancong <[email protected]>
Co-authored-by: Simon Mo <[email protected]>
Co-authored-by: Michal Adamczyk <[email protected]>
Co-authored-by: Cyrus Leung <[email protected]>
Co-authored-by: gujing <[email protected]>
Co-authored-by: Roger Wang <[email protected]>
Co-authored-by: imkero <[email protected]>
Co-authored-by: Martin Gleize <[email protected]>
Co-authored-by: mgleize user <[email protected]>
Co-authored-by: shangmingc <[email protected]>
Co-authored-by: Harry Mellor <[email protected]>
Co-authored-by: Yuan Tang <[email protected]>
Co-authored-by: Chen Zhang <[email protected]>
Co-authored-by: wangxiyuan <[email protected]>
Co-authored-by: Işık <[email protected]>
Co-authored-by: Roger Wang <[email protected]>
Co-authored-by: Cheng Kuan Yong Jason <[email protected]>
Co-authored-by: Jinzhen Lin <[email protected]>
Co-authored-by: Michael Goin <[email protected]>
Co-authored-by: Tyler Michael Smith <[email protected]>
Co-authored-by: Michael Goin <[email protected]>
Co-authored-by: Nicolò Lucchesi <[email protected]>
Co-authored-by: Jee Jee Li <[email protected]>
1 parent 5424a93 commit c9db39b

File tree: 160 files changed (+3693 / −3559 lines)


.buildkite/nightly-benchmarks/scripts/nightly-annotate.sh

Lines changed: 1 addition & 1 deletion
```diff
@@ -43,7 +43,7 @@ main() {
 
 
 
-  # The figures should be genereated by a separate process outside the CI/CD pipeline
+  # The figures should be generated by a separate process outside the CI/CD pipeline
 
 # # generate figures
 # python3 -m pip install tabulate pandas matplotlib
```

.buildkite/test-pipeline.yaml

Lines changed: 6 additions & 3 deletions
```diff
@@ -52,7 +52,6 @@ steps:
   - tests/worker
   - tests/standalone_tests/lazy_torch_compile.py
   commands:
-  - pip install git+https://github.com/Isotr0py/DeepSeek-VL2.git # Used by multimoda processing test
   - python3 standalone_tests/lazy_torch_compile.py
   - pytest -v -s mq_llm_engine # MQLLMEngine
   - pytest -v -s async_engine # AsyncLLMEngine
@@ -478,7 +477,9 @@ steps:
   - pytest models/encoder_decoder/language/test_bart.py -v -s -m 'distributed(num_gpus=2)'
   - pytest models/encoder_decoder/vision_language/test_broadcast.py -v -s -m 'distributed(num_gpus=2)'
   - pytest models/decoder_only/vision_language/test_models.py -v -s -m 'distributed(num_gpus=2)'
-  - pytest -v -s spec_decode/e2e/test_integration_dist_tp2.py
+  # this test fails consistently.
+  # TODO: investigate and fix
+  # - pytest -v -s spec_decode/e2e/test_integration_dist_tp2.py
   - CUDA_VISIBLE_DEVICES=0,1 pytest -v -s test_sharded_state_loader.py
   - CUDA_VISIBLE_DEVICES=0,1 pytest -v -s kv_transfer/disagg_test.py
@@ -516,7 +517,9 @@ steps:
   - vllm/engine
   - tests/multi_step
   commands:
-  - pytest -v -s multi_step/test_correctness_async_llm.py
+  # this test is quite flaky
+  # TODO: investigate and fix.
+  # - pytest -v -s multi_step/test_correctness_async_llm.py
   - pytest -v -s multi_step/test_correctness_llm.py
 
 - label: Pipeline Parallelism Test # 45min
```

.github/workflows/actionlint.yml

Lines changed: 0 additions & 40 deletions
This file was deleted.

.github/workflows/clang-format.yml

Lines changed: 0 additions & 53 deletions
This file was deleted.

.github/workflows/codespell.yml

Lines changed: 0 additions & 45 deletions
This file was deleted.

.github/workflows/doc-lint.yml

Lines changed: 0 additions & 32 deletions
This file was deleted.

.github/workflows/matchers/ruff.json

Lines changed: 0 additions & 17 deletions
This file was deleted.

.github/workflows/mypy.yaml

Lines changed: 0 additions & 51 deletions
This file was deleted.

.github/workflows/png-lint.yml

Lines changed: 0 additions & 37 deletions
This file was deleted.

.github/workflows/pre-commit.yml

Lines changed: 19 additions & 0 deletions
This file was added:

```yaml
name: pre-commit

on:
  pull_request:
  push:
    branches: [main]

jobs:
  pre-commit:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
    - uses: actions/setup-python@0b93645e9fea7318ecaed2b359559ac225c90a2b # v5.3.0
      with:
        python-version: "3.12"
    - run: echo "::add-matcher::.github/workflows/matchers/actionlint.json"
    - uses: pre-commit/action@2c7b3805fd2a0fd8c1884dcaebf91fc102a13ecd # v3.0.1
      with:
        extra_args: --hook-stage manual
```
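This single workflow replaces the deleted per-linter workflows (actionlint, clang-format, codespell, doc-lint, mypy, png-lint); the actual hook definitions live in a `.pre-commit-config.yaml` at the repository root, which is not shown in this diff. A hypothetical minimal sketch of what such a config could look like (the hook repos, revisions, and IDs below are illustrative, not the repository's real list):

```yaml
# Hypothetical .pre-commit-config.yaml sketch; the real hook set is defined
# in the repository and is not part of this commit's visible diff.
repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
  rev: v0.6.5            # illustrative revision
  hooks:
  - id: ruff
- repo: https://github.com/codespell-project/codespell
  rev: v2.3.0            # illustrative revision
  hooks:
  - id: codespell
- repo: https://github.com/pre-commit/mirrors-mypy
  rev: v1.11.2           # illustrative revision
  hooks:
  - id: mypy
    stages: [manual]     # only runs when CI passes --hook-stage manual, as above
```

Hooks marked `stages: [manual]` are skipped on a plain `git commit` and only run when invoked with `--hook-stage manual`, which matches the `extra_args` the workflow passes to `pre-commit/action`.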

0 commit comments
