[ci] refactor: add ci test for refactored reward worker and add some args to GenRM config #3385
Merged
15 commits
- `71765ee` update (yyDing1)
- `a40b51b` add ci test (yyDing1)
- `52b8850` update (yyDing1)
- `08e7db7` Update tests/workers/reward_model/test_reward_model.py (yyDing1)
- `2fed7e4` fix (yyDing1)
- `2238f2a` fix (yyDing1)
- `5301eeb` fix (yyDing1)
- `a67cea5` fix (yyDing1)
- `59cf231` Merge branch 'volcengine:main' into rm-ci (yyDing1)
- `5165ddf` update (yyDing1)
- `b4d103e` restore other files and debug ci only (yyDing1)
- `01ec018` restore other files and debug ci only (yyDing1)
- `9d62ca9` restore other files and debug ci only (yyDing1)
- `93a3b7d` fix (yyDing1)
- `6a0d181` Update reward_model.yml (yyDing1)
`@@ -0,0 +1,90 @@`

```yaml
# # Tests layout

# Each folder under tests/ corresponds to a test category for a sub-namespace in verl. For instance:
# - `tests/trainer` for testing functionality related to `verl/trainer`
# - `tests/models` for testing functionality related to `verl/models`
# - ...

# There are a few folders with `special_` prefix, created for special purposes:
# - `special_distributed`: unit tests that must run with multiple GPUs
# - `special_e2e`: end-to-end tests with training/generation scripts
# - `special_npu`: tests for NPUs
# - `special_sanity`: a suite of quick sanity tests
# - `special_standalone`: a set of test that are designed to run in dedicated environments

# Accelerators for tests
# - By default tests are run with GPU available, except for the ones under `special_npu`, and any test script whose name ends with `on_cpu.py`.
# - For test scripts with `on_cpu.py` name suffix would be tested on CPU resources in linux environment.

# # Workflow layout

# All CI tests are configured by yaml files in `.github/workflows/`. Here's an overview of all test configs:
# 1. A list of always triggered CPU sanity tests: `check-pr-title.yml`, `secrets_scan.yml`, `check-pr-title,yml`, `pre-commit.yml`, `doc.yml`
# 2. Some heavy multi-GPU unit tests, such as `model.yml`, `vllm.yml`, `sgl.yml`
# 3. End-to-end tests: `e2e_*.yml`
# 4. Unit tests
#   - `cpu_unit_tests.yml`, run pytest on all scripts with file name pattern `tests/**/test_*_on_cpu.py`
#   - `gpu_unit_tests.yml`, run pytest on all scripts with file without the `on_cpu.py` suffix.
#   - Since cpu/gpu unit tests by default runs all tests under `tests`, please make sure tests are manually excluded in them when
#     - new workflow yaml is added to `.github/workflows`
#     - new tests are added to workflow mentioned in 2.
# name: Check PR Title

name: reward_model

on:
  # Trigger the workflow on push or pull request,
  # but only for the main branch
  push:
    branches:
      - main
      - v0.*
  pull_request:
    branches:
      - main
      - v0.*
    paths:
      - "verl/**/*.py"
      # Entrypoints
      - ".github/workflows/reward_model.yml"
      - "tests/workers/reward_model/**"

# Declare permissions just read content.
permissions:
  contents: read

# Cancel jobs on the same ref if a new one is triggered
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}

jobs:
  discriminative_reward_model:
    runs-on: [L20x8]
    timeout-minutes: 20 # Increase this timeout value as needed
    env:
      HTTP_PROXY: ${{ secrets.PROXY_HTTP }}
      HTTPS_PROXY: ${{ secrets.PROXY_HTTPS }}
      NO_PROXY: "localhost,127.0.0.1,hf-mirror.com"
      HF_ENDPOINT: "https://hf-mirror.com"
      HF_HUB_ENABLE_HF_TRANSFER: "0" # This is more stable
      SGL_DISABLE_TP_MEMORY_INBALANCE_CHECK: "True"
      NCCL_SHM_DISABLE: "1"
      NCCL_P2P_DISABLE: "1"
    container:
      image: verlai/verl:app-verl0.5-transformers4.55.4-sglang0.4.10.post2-mcore0.13.0-te2.2
      options: --gpus all --shm-size=10g
    steps:
      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
        with:
          fetch-depth: 0
      - name: Install the current repository
        run: |
          pip3 install -e .[test]
      - name: Download model config files
        run: |
          hf download Skywork/Skywork-Reward-V2-Llama-3.2-1B --local-dir $HOME/models/Skywork/Skywork-Reward-V2-Llama-3.2-1B
      - name: Running discriminative reward model tests on 8 L20 GPUs
        run: |
          unset http_proxy https_proxy HTTP_PROXY HTTPS_PROXY
          pytest -s -x tests/workers/reward_model/test_reward_model.py
```
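The two expressions in the workflow's `concurrency` block can be hard to read at a glance. The following is a minimal Python sketch of what they evaluate to, not GitHub's implementation. The function name `concurrency_settings` is hypothetical, and the pull request ref is only an example in the `refs/pull/<number>/merge` form that GitHub uses for `pull_request` events.

```python
from __future__ import annotations


def concurrency_settings(workflow: str, ref: str) -> tuple[str, bool]:
    """Mirror the two expressions in the workflow's `concurrency` block.

    Hypothetical helper for illustration; GitHub evaluates these expressions itself.
    """
    # group: ${{ github.workflow }}-${{ github.ref }}
    group = f"{workflow}-{ref}"
    # cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
    cancel_in_progress = ref != "refs/heads/main"
    return group, cancel_in_progress


# Example pull request ref: a newer run cancels an older in-progress run in the same group.
pr_group, pr_cancel = concurrency_settings("reward_model", "refs/pull/3385/merge")

# Push to main: in-progress runs on main are left to finish.
main_group, main_cancel = concurrency_settings("reward_model", "refs/heads/main")

print(pr_group, pr_cancel)
print(main_group, main_cancel)
```

Because the group key includes the ref, runs for different pull requests never cancel each other; only repeated pushes to the same ref do.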
Review comment: Hardcoding 8 GPUs for a CI test might make it flaky if the CI environment doesn't have that many GPUs available. Since `tensor_model_parallel_size` is 2, using 2 GPUs should be sufficient and more robust for a CI environment.
Reply: Data parallelism across multiple server instances should also be tested, and that requires more GPUs.
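The arithmetic behind this exchange can be sketched as follows. This is an illustration only: it assumes each server instance occupies exactly `tensor_model_parallel_size` GPUs and that all GPUs on the runner are used, which the test file itself (not shown here) may or may not do. The helper name `data_parallel_replicas` is hypothetical.

```python
def data_parallel_replicas(num_gpus: int, tensor_parallel_size: int) -> int:
    """Number of server instances that fit when each one spans `tensor_parallel_size` GPUs.

    Assumption for illustration: GPUs are split evenly and every GPU is used.
    """
    if tensor_parallel_size <= 0:
        raise ValueError("tensor_parallel_size must be positive")
    if num_gpus % tensor_parallel_size != 0:
        raise ValueError("num_gpus must be a multiple of tensor_parallel_size")
    return num_gpus // tensor_parallel_size


# With tensor_model_parallel_size = 2 (the value quoted in the review comment):
for num_gpus in (2, 8):
    replicas = data_parallel_replicas(num_gpus, tensor_parallel_size=2)
    print(f"{num_gpus} GPUs -> {replicas} server instance(s)")
```

Under these assumptions, 2 GPUs yield a single server instance, so the data-parallel path is never exercised, while 8 GPUs yield 4 instances.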