
Commit f32e54d

[docker] feat: Upgrade sglang 0.4.9 + transformers 4.53.2 (#2794)
### What does this PR do?

feat: Upgrade sglang 0.4.9 + transformers 4.53.2

### Checklist Before Starting

- [x] Search for similar PRs. Paste at least one query link here: ...
- [x] Format the PR title as `[{modules}] {type}: {description}` (This will be checked by the CI)
  - `{modules}` include `fsdp`, `megatron`, `sglang`, `vllm`, `rollout`, `trainer`, `ci`, `training_utils`, `recipe`, `hardware`, `deployment`, `ray`, `worker`, `single_controller`, `misc`, `perf`, `model`, `algo`, `env`, `tool`, `ckpt`, `doc`, `data`
  - If this PR involves multiple modules, separate them with `,` like `[megatron, fsdp, doc]`
  - `{type}` is in `feat`, `fix`, `refactor`, `chore`, `test`
  - If this PR breaks any API (CLI arguments, config, function signature, etc.), add `[BREAKING]` to the beginning of the title.
  - Example: `[BREAKING][fsdp, megatron] feat: dynamic batching`

### Test

> For changes that can not be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results like training curve plots, evaluation results, etc.

### API and Usage Example

> Demonstrate how the API changes if any, and provide usage example(s) if possible.

```python
# Add code snippet or script demonstrating how to use this
```

### Design & Code Changes

> Demonstrate the high-level design if this PR is complex, and list the specific changes.

### Checklist Before Submitting

> [!IMPORTANT]
> Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

- [ ] Read the [Contribute Guide](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md).
- [ ] Apply [pre-commit checks](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md#code-linting-and-formatting): `pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always`
- [ ] Add / Update [the documentation](https://github.com/volcengine/verl/tree/main/docs).
- [ ] Add unit or end-to-end test(s) to [the CI workflow](https://github.com/volcengine/verl/tree/main/.github/workflows) to cover all the code. If not feasible, explain why: ...
- [ ] Once your PR is ready for CI, send a message in [the `ci-request` channel](https://verl-project.slack.com/archives/C091TCESWB1) in [the `verl` Slack workspace](https://join.slack.com/t/verl-project/shared_invite/zt-3855yhg8g-CTkqXu~hKojPCmo7k_yXTQ). (If not accessible, please try [the Feishu group (飞书群)](https://applink.larkoffice.com/client/chat/chatter/add_by_link?link_token=772jd4f1-cd91-441e-a820-498c6614126a).)
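The title convention from the checklist above can be sanity-checked locally before pushing. The grep pattern below is an illustrative approximation of the rule, not the actual `check-pr-title` CI implementation:

```shell
# Hypothetical local check of the `[{modules}] {type}: {description}` rule;
# the real validation lives in the check-pr-title CI workflow.
title='[docker] feat: Upgrade sglang 0.4.9 + transformers 4.53.2'
pattern='^(\[BREAKING\])?\[[a-z_0-9]+(, ?[a-z_0-9]+)*\] (feat|fix|refactor|chore|test): .+'
if echo "$title" | grep -Eq "$pattern"; then
  echo "title ok"
else
  echo "title violates convention"
fi
```

The same pattern also accepts the multi-module and `[BREAKING]` forms shown in the checklist, e.g. `[BREAKING][fsdp, megatron] feat: dynamic batching`.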
1 parent a479fc8 commit f32e54d

28 files changed: +379 −185 lines
Lines changed: 133 additions & 0 deletions
@@ -0,0 +1,133 @@
+name: e2e_ppo_trainer_deprecate
+
+on:
+  # Trigger the workflow on push or pull request,
+  # but only for the main branch
+  # For push, for now only anti-patterns are specified so it is more conservative
+  # and achieves higher coverage.
+  push:
+    branches:
+      - disabled_ci
+  pull_request:
+    branches:
+      - disabled_ci
+    paths:
+      - "**/*.py"
+      # Other entrypoints
+      - "!**/*.md"
+      - "!docker/**"
+      - "!examples/**"
+      - "!tests/**"
+      - "!verl/trainer/main_*.py"
+      - "!verl/trainer/fsdp_sft_trainer.py"
+      # Docs
+      - "!docs/**"
+      # Recipes
+      - "!recipe/**"
+      # Megatron
+      - "!verl/workers/**/megatron_*.py"
+      # Entrypoints
+      - ".github/workflows/e2e_ppo_trainer.yml"
+      - "examples/data_preprocess/gsm8k.py"
+      - "examples/data_preprocess/geo3k.py"
+      - "tests/special_e2e/ppo_trainer"
+      - "verl/trainer/main_ppo.py"
+      - "verl/trainer/config/ppo_trainer.yaml"
+
+# Cancel jobs on the same ref if a new one is triggered
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
+
+# Declare permissions just read content.
+permissions:
+  contents: read
+
+jobs:
+  pre_commit_for_ppo:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version: ["3.12"]
+    steps:
+      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@0b93645e9fea7318ecaed2b359559ac225c90a2b # v5.3.0
+        with:
+          python-version: ${{ matrix.python-version }}
+      - name: Install the current repository
+        run: |
+          pip install -e .
+      - name: Set ruff --output-format=github
+        run: |
+          sed -i 's/--output-format=full/--output-format=github/' .pre-commit-config.yaml
+          git add .pre-commit-config.yaml
+      - uses: pre-commit/[email protected]
+        with:
+          extra_args: "" # Overriding default "--all-files"
+
+  e2e_ppo_trainer_sglang_multiturn_with_tool:
+    runs-on: [L20x8]
+    needs: pre_commit_for_ppo
+    timeout-minutes: 40 # Increase this timeout value as needed
+    env:
+      HTTP_PROXY: ${{ secrets.PROXY_HTTP }}
+      HTTPS_PROXY: ${{ secrets.PROXY_HTTPS }}
+      NO_PROXY: "localhost,127.0.0.1,hf-mirror.com"
+      HF_ENDPOINT: "https://hf-mirror.com"
+      HF_HUB_ENABLE_HF_TRANSFER: "0" # This is more stable
+    container:
+      image: verlai/verl:app-verl0.5-sglang0.4.9.post4-mcore0.12.2-te2.2
+      options: --gpus all --shm-size=10g
+    steps:
+      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
+        with:
+          fetch-depth: 0
+      - name: Install the current repository
+        run: |
+          pip3 install -e .[test,gpu,sglang] --no-deps
+      - name: Prepare gsm8k dataset with tool
+        run: |
+          ray stop --force
+          python3 examples/data_preprocess/gsm8k_multiturn_w_tool.py --local_dir $HOME/data/gsm8k_verl_sgl_multi_turn_preprocessed
+      - name: Running GSM8K with tool E2E training tests on 8 L20 GPUs with rmpad using function rm and save ckpt with sglang
+        run: |
+          ray stop --force
+          bash tests/special_e2e/run_gsm8k_fsdp_sgl_multiturn_w_tool.sh
+      - name: Running GSM8K with tool E2E training tests with FSDP2
+        run: |
+          ray stop --force
+          FSDP_STRATEGY=fsdp2 bash tests/special_e2e/run_gsm8k_fsdp_sgl_multiturn_w_tool.sh
+
+  e2e_ppo_trainer_sglang_vlm_multiturn_with_tool:
+    runs-on: [L20x8]
+    needs: pre_commit_for_ppo
+    timeout-minutes: 40 # Increase this timeout value as needed
+    env:
+      HTTP_PROXY: ${{ secrets.PROXY_HTTP }}
+      HTTPS_PROXY: ${{ secrets.PROXY_HTTPS }}
+      NO_PROXY: "localhost,127.0.0.1,hf-mirror.com"
+      HF_ENDPOINT: "https://hf-mirror.com"
+      HF_HUB_ENABLE_HF_TRANSFER: "0" # This is more stable
+    container:
+      image: verlai/verl:app-verl0.5-sglang0.4.9.post4-mcore0.12.2-te2.2
+      options: --gpus all --shm-size=10g
+    steps:
+      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
+        with:
+          fetch-depth: 0
+      - name: Install the current repository
+        run: |
+          pip3 install -e .[test,geo,gpu,sglang]
+      - name: Prepare geo3k dataset with tool
+        run: |
+          ray stop --force
+          python3 examples/data_preprocess/geo3k_multiturn_w_tool.py --local_dir $HOME/data/geo3k_verl_sgl_multi_turn_preprocessed
+      - name: Running GEO3K with tool E2E training tests on 8 L20 GPUs with rmpad using function rm and save ckpt with sglang
+        run: |
+          ray stop --force
+          bash tests/special_e2e/run_geo3k_fsdp_sgl_multiturn_w_tool.sh
+      - name: Running GEO3K with tool E2E training tests with FSDP2
+        run: |
+          ray stop --force
+          FSDP_STRATEGY=fsdp2 bash tests/special_e2e/run_geo3k_fsdp_sgl_multiturn_w_tool.sh
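The `paths` block above mixes positive globs with `!`-prefixed anti-patterns: patterns are evaluated in order per changed file and the last match wins, so `!docs/**` can carve documentation changes out of the broad `**/*.py` trigger while later positive entries re-include specific entrypoints. A rough sketch of that documented behavior (illustrative only; GitHub's actual glob rules differ in edge cases such as `**` matching zero path segments):

```python
from fnmatch import fnmatchcase

def runs_workflow(changed_paths, patterns):
    """Approximate GitHub Actions `paths` filtering: patterns are
    evaluated in order for each file, the last matching pattern wins,
    and the workflow triggers if any changed file ends up included."""
    triggered = False
    for path in changed_paths:
        included = False
        for pat in patterns:
            negate = pat.startswith("!")
            if fnmatchcase(path, pat.lstrip("!")):
                included = not negate  # last match wins
        triggered = triggered or included
    return triggered

# A subset of the filter list from the workflow above
patterns = ["**/*.py", "!docs/**", "!recipe/**", "verl/trainer/config/ppo_trainer.yaml"]
print(runs_workflow(["docs/conf.py"], patterns))              # False: re-excluded by !docs/**
print(runs_workflow(["verl/trainer/main_ppo.py"], patterns))  # True: matches **/*.py
```

This is why the workflow comment calls the push filters "conservative": any `.py` change triggers by default, and only the explicitly listed directories opt out.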
Lines changed: 155 additions & 0 deletions
@@ -0,0 +1,155 @@
+# # Tests layout
+
+# Each folder under tests/ corresponds to a test category for a sub-namespace in verl. For instance:
+# - `tests/trainer` for testing functionality related to `verl/trainer`
+# - `tests/models` for testing functionality related to `verl/models`
+# - ...
+
+# There are a few folders with `special_` prefix, created for special purposes:
+# - `special_distributed`: unit tests that must run with multiple GPUs
+# - `special_e2e`: end-to-end tests with training/generation scripts
+# - `special_npu`: tests for NPUs
+# - `special_sanity`: a suite of quick sanity tests
+# - `special_standalone`: a set of tests designed to run in dedicated environments
+
+# Accelerators for tests
+# - By default, tests run with a GPU available, except for those under `special_npu` and any test script whose name ends with `on_cpu.py`.
+# - Test scripts with the `on_cpu.py` name suffix are run on CPU resources in a Linux environment.
+
+# # Workflow layout
+
+# All CI tests are configured by yaml files in `.github/workflows/`. Here's an overview of all test configs:
+# 1. A list of always-triggered CPU sanity tests: `check-pr-title.yml`, `secrets_scan.yml`, `pre-commit.yml`, `doc.yml`
+# 2. Some heavy multi-GPU unit tests, such as `model.yml`, `vllm.yml`, `sgl.yml`
+# 3. End-to-end tests: `e2e_*.yml`
+# 4. Unit tests
+#   - `cpu_unit_tests.yml`, run pytest on all scripts with the file name pattern `tests/**/test_*_on_cpu.py`
+#   - `gpu_unit_tests.yml`, run pytest on all test scripts without the `on_cpu.py` suffix.
+#   - Since cpu/gpu unit tests by default run all tests under `tests`, please make sure tests are manually excluded in them when
+#     - a new workflow yaml is added to `.github/workflows`
+#     - new tests are added to workflows mentioned in 2.
+
+name: e2e_ppo_trainer_megatron_sglang_deprecate
+
+on:
+  # Trigger the workflow on push or pull request,
+  # but only for the main branch.
+  # For push, for now only anti-patterns are specified so it is more conservative
+  # and achieves higher coverage.
+  push:
+    branches:
+      - disabled_ci
+  pull_request:
+    branches:
+      - disabled_ci
+    paths:
+      - "**/*.py"
+      # Other entrypoints
+      - "!docker/**"
+      # Docs
+      - "!**/*.md"
+      - "!docs/**"
+      - "!examples/**"
+      - "!tests/**"
+      - "!verl/trainer/main_*.py"
+      - "!verl/trainer/fsdp_sft_trainer.py"
+      # Recipes
+      - "!recipe/**"
+      # FSDP
+      - "!verl/workers/**/*dp_*.py"
+      # Entrypoints
+      - ".github/workflows/e2e_ppo_trainer_megatron_sglang.yml"
+      - "examples/data_preprocess/gsm8k.py"
+      - "examples/data_preprocess/geo3k.py"
+      - "tests/special_e2e/run_ppo_trainer_megatron.sh"
+      - "verl/trainer/main_ppo.py"
+      - "verl/trainer/config/ppo_megatron_trainer.yaml"
+
+# Cancel jobs on the same ref if a new one is triggered
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
+
+# Declare permissions just read content.
+permissions:
+  contents: read
+
+env:
+  IMAGE: "verl-ci-cn-beijing.cr.volces.com/verlai/verl:app-verl0.5-sglang0.4.9.post4-mcore0.12.2-te2.2"
+  DYNAMIC_RUNNER_ENDPOINT: "https://sd10g3clalm04ug7alq90.apigateway-cn-beijing.volceapi.com/runner"
+
+jobs:
+  setup:
+    if: github.repository_owner == 'volcengine'
+    runs-on: ubuntu-latest
+    outputs:
+      runner-label: ${{ steps.create-runner.outputs.runner-label }}
+      mlp-task-id: ${{ steps.create-runner.outputs.mlp-task-id }}
+    steps:
+      - uses: actions/checkout@v4
+      - id: create-runner
+        uses: volcengine/vemlp-github-runner@v1
+        with:
+          mode: "create"
+          faas-url: "${{ env.DYNAMIC_RUNNER_ENDPOINT }}"
+          mlp-image: "${{ env.IMAGE }}"
+
+  e2e_ppo_trainer_megatron-qwen3:
+    needs: setup
+    runs-on: ["${{ needs.setup.outputs.runner-label || 'L20x8' }}"]
+    timeout-minutes: 60 # Increase this timeout value as needed
+    env:
+      HTTP_PROXY: ${{ secrets.PROXY_HTTP }}
+      HTTPS_PROXY: ${{ secrets.PROXY_HTTPS }}
+      NO_PROXY: "localhost,127.0.0.1,hf-mirror.com"
+      HF_ENDPOINT: "https://hf-mirror.com"
+      HF_HUB_ENABLE_HF_TRANSFER: "0" # This is more stable
+    steps:
+      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
+        with:
+          fetch-depth: 0
+      - name: Install the current repository
+        run: |
+          pip3 install --no-deps -e .[test]
+      - name: Prepare GSM8K dataset
+        run: |
+          python3 examples/data_preprocess/gsm8k.py
+      - name: Running GSM8K E2E training tests with 3D parallelism on 8 L20 GPUs with Megatron (Qwen3) with validation and saving
+        run: |
+          ray stop --force
+          ENGINE=sglang ALL_OFFLOAD=True VAL_BEFORE_TRAIN=True TEST_FREQ=1 SAVE_FREQ=1 MODEL_ID=Qwen/Qwen3-0.6B bash tests/special_e2e/run_ppo_trainer_megatron.sh
+      - name: Running GSM8K E2E training tests with 3D parallelism on 8 L20 GPUs with Megatron (Qwen3) testing learning rate scheduler
+        run: |
+          ray stop --force
+          ENGINE=sglang LR_WARMUP_STEPS=1 TOTAL_TRAIN_STEPS=2 MODEL_ID=Qwen/Qwen3-0.6B bash tests/special_e2e/run_ppo_trainer_megatron.sh
+
+      - name: Test Megatron checkpoints merging function (Qwen3 Actor and Critic)
+        run: |
+          exp_name="qwen3-0.6b-megatron-gsm8k-minimal"
+          python -m verl.model_merger test --backend megatron --tie-word-embedding --local_dir checkpoints/verl-test/${exp_name}/global_step_1/actor --test_hf_dir checkpoints/verl-test/${exp_name}/global_step_1/actor/huggingface
+          python -m verl.model_merger test --backend megatron --is-value-model --local_dir checkpoints/verl-test/${exp_name}/global_step_1/critic --test_hf_dir checkpoints/verl-test/${exp_name}/global_step_1/critic/huggingface
+      - name: clean up
+        run: |
+          rm -rf checkpoints
+
+  cleanup:
+    runs-on: ubuntu-latest
+    needs:
+      [
+        setup,
+        e2e_ppo_trainer_megatron-deepseek,
+        e2e_ppo_trainer_megatron-qwen3,
+        e2e_ppo_trainer_megatron-different-train-infer-tp-qwen-tie-embedding,
+        e2e_ppo_trainer_megatron-qwen-override-transformer-config,
+        e2e_ppo_trainer_megatron-deepseek-override-transformer-config,
+        e2e_ppo_trainer_megatron-moe-expert-parallel,
+        e2e_ppo_trainer_megatron-qwen2_5vl-3b,
+      ]
+    if: always()
+    steps:
+      - id: destroy-runner
+        uses: volcengine/vemlp-github-runner@v1
+        with:
+          mode: "destroy"
+          faas-url: "${{ env.DYNAMIC_RUNNER_ENDPOINT }}"
+          mlp-task-id: "${{ needs.setup.outputs.mlp-task-id }}"
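The `setup`/`cleanup` job pair in the workflow above provisions an ephemeral CI runner and guarantees its teardown: `cleanup` is gated on `if: always()`, so the destroy step runs even when a test job in between fails. In code, this is the classic try/finally pattern; a minimal sketch, where all names are illustrative rather than the vemlp-github-runner API:

```python
# Sketch of the create/destroy runner lifecycle (hypothetical names).
events = []

def create_runner():
    events.append("create")      # jobs.setup -> mode: "create"
    return "mlp-task-1"

def destroy_runner(task_id):
    events.append("destroy")     # jobs.cleanup -> mode: "destroy"

def failing_tests(task_id):
    raise RuntimeError("test job failed")

def run_pipeline(tests):
    task_id = create_runner()
    try:
        tests(task_id)
    finally:                     # runs regardless, like `if: always()`
        destroy_runner(task_id)

try:
    run_pipeline(failing_tests)
except RuntimeError:
    pass
print(events)  # ['create', 'destroy'] -- teardown ran despite the failure
```

Without the `if: always()` guard, a failed test job would skip `cleanup` (skipped `needs` dependencies) and leak the paid runner.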

.github/workflows/disabled/e2e_prime.yml renamed to .github/workflows/.deprecate/e2e_prime.yml

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-name: e2e_prime
+name: e2e_prime_deprecate
 
 on:
   # Trigger the workflow on push or pull request,

.github/workflows/checkpoint_converter.yml

Lines changed: 2 additions & 2 deletions
@@ -81,7 +81,7 @@ jobs:
       NO_PROXY: "localhost,127.0.0.1"
       HF_HUB_ENABLE_HF_TRANSFER: "0" # This is more stable
     container:
-      image: verlai/verl:app-verl0.5-sglang0.4.8-mcore0.12.2-te2.2
+      image: verlai/verl:app-verl0.5-sglang0.4.9.post4-mcore0.12.2-te2.2
       options: --gpus all --shm-size=10g
     steps:
       - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
@@ -116,7 +116,7 @@
       HF_HUB_ENABLE_HF_TRANSFER: "0" # This is more stable
       HF_ENDPOINT: "https://hf-mirror.com"
     container:
-      image: verlai/verl:app-verl0.5-sglang0.4.8-mcore0.12.2-te2.2
+      image: verlai/verl:app-verl0.5-sglang0.4.9.post4-mcore0.12.2-te2.2
       options: --gpus all --shm-size=10g
     steps:
       - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

.github/workflows/cpu_unit_tests.yml

Lines changed: 4 additions & 2 deletions
@@ -77,10 +77,12 @@ jobs:
         run: |
           pip install -e .[test,prime,geo]
           pip install --upgrade "ray>=2.40.0" pillow
-      - name: Running CPU unit tests
+      - name: Download datasets
         run: |
-          [ ! -d "$HOME/verl-data" ] && huggingface-cli download verl-team/gsm8k-v0.4.1 --repo-type dataset --local-dir ~/verl-data/gsm8k
+          huggingface-cli download verl-team/gsm8k-v0.4.1 --repo-type dataset --local-dir ~/verl-data/gsm8k
           python3 examples/data_preprocess/geo3k.py
+      - name: Running CPU unit tests
+        run: |
           echo '[pytest]' > pytest.ini
           echo 'python_files = *_on_cpu.py' >> pytest.ini
           pytest -s -x --asyncio-mode=auto tests/
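The `pytest.ini` generated in the step above narrows collection to CPU-only test files via the `python_files` ini option. A rough sketch of the filename filter that `python_files = *_on_cpu.py` implies (pytest's real collection has additional rules; the filenames here are illustrative):

```python
from fnmatch import fnmatchcase

# Hypothetical file names under tests/
files = ["test_dataset_on_cpu.py", "test_vllm_rollout.py", "test_rl_dataset_on_cpu.py"]

# Mirror of pytest's `python_files = *_on_cpu.py` filename match
collected = [f for f in files if fnmatchcase(f, "*_on_cpu.py")]
print(collected)  # ['test_dataset_on_cpu.py', 'test_rl_dataset_on_cpu.py']
```

Scripts without the suffix are left to `gpu_unit_tests.yml`, which runs the complementary set.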

.github/workflows/e2e_genrm_remote.yml

Lines changed: 1 addition & 1 deletion
@@ -87,7 +87,7 @@ jobs:
       HF_ENDPOINT: "https://hf-mirror.com"
       HF_HUB_ENABLE_HF_TRANSFER: "0" # This is more stable
     container:
-      image: whatcanyousee/verl:ngc-cu124-vllm0.8.5-sglang0.4.6.post5-mcore0.12.0-te2.3
+      image: verlai/verl:app-verl0.5-vllm0.9.1-mcore0.12.2-te2.2
       options: --gpus all --shm-size=10g
     steps:
       - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
