@eric-haibin-lin
Collaborator

What does this PR do?

  • Added __init__() method to initialize self.role_worker_mapping = {}
  • Extracted worker setup logic into dedicated methods:
    • add_actor_rollout_worker() - handles strategy-specific worker imports and setup (lines 130-153)
    • add_critic_worker() - sets up critic worker role mapping (lines 170-176)
    • init_resource_pool_mgr() - creates resource pool specifications (lines 178-187)
    • add_reward_model_worker() - conditionally adds reward model workers (lines 195-203)
    • add_ref_policy_worker() - conditionally adds reference policy workers (lines 205-208)
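The shape of this refactor can be sketched roughly as follows. This is a minimal illustration, not the actual `main_ppo.py` code: the `Role` enum, the config keys (`actor_strategy`, `critic_strategy`, `reward_model_enabled`, `use_reference_policy`), the string labels stored in the mapping, and the `run()` ordering are all assumptions made for the example. Only the method names and `self.role_worker_mapping` come from the description above, and `init_resource_pool_mgr()` is left out for brevity.

```python
from enum import Enum, auto


class Role(Enum):
    """Hypothetical stand-in for the trainer's role enum."""

    ActorRollout = auto()
    Critic = auto()
    RewardModel = auto()
    RefPolicy = auto()


class TaskRunnerSketch:
    """Illustrative only: mirrors the method names listed above."""

    def __init__(self):
        # Shared state that every add_* method writes into.
        self.role_worker_mapping = {}

    def add_actor_rollout_worker(self, config):
        # The real method imports a worker class per strategy;
        # this sketch just records a label (assumed config key).
        strategy = config["actor_strategy"]
        if strategy not in ("fsdp", "megatron"):
            raise NotImplementedError(f"unsupported strategy: {strategy}")
        self.role_worker_mapping[Role.ActorRollout] = f"{strategy}:ActorRolloutWorker"

    def add_critic_worker(self, config):
        strategy = config["critic_strategy"]
        self.role_worker_mapping[Role.Critic] = f"{strategy}:CriticWorker"

    def add_reward_model_worker(self, config):
        # Conditional: only registered when enabled (assumed config key).
        if config.get("reward_model_enabled", False):
            self.role_worker_mapping[Role.RewardModel] = "RewardModelWorker"

    def add_ref_policy_worker(self, config):
        # Conditional: only registered when requested (assumed config key).
        if config.get("use_reference_policy", False):
            self.role_worker_mapping[Role.RefPolicy] = "RefPolicyWorker"

    def run(self, config):
        # Each setup step is now a separate, individually overridable method.
        self.add_actor_rollout_worker(config)
        self.add_critic_worker(config)
        self.add_reward_model_worker(config)
        self.add_ref_policy_worker(config)
        return self.role_worker_mapping
```

One practical effect of this layout is that a subclass can override a single `add_*` method to swap in a different worker without copying the whole `run()` body.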

Checklist Before Starting

  • Search for similar PRs. Paste at least one query link here: ...
  • Format the PR title as [{modules}] {type}: {description} (This will be checked by the CI)
    • {modules} include fsdp, megatron, sglang, vllm, rollout, trainer, ci, training_utils, recipe, hardware, deployment, ray, worker, single_controller, misc, perf, model, algo, env, tool, ckpt, doc, data
    • If this PR involves multiple modules, separate them with , like [megatron, fsdp, doc]
    • {type} is in feat, fix, refactor, chore, test
    • If this PR breaks any API (CLI arguments, config, function signature, etc.), add [BREAKING] to the beginning of the title.
    • Example: [BREAKING][fsdp, megatron] feat: dynamic batching

Test

Relies on existing unit tests.

Checklist Before Submitting

Important

Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

devin-ai-integration bot and others added 5 commits August 3, 2025 18:03
- Add __init__ method to initialize self.role_worker_mapping
- Extract add_actor_rollout_worker and add_critic_worker methods
- Extract add_ref_policy_worker and add_reward_model_worker methods
- Extract init_resource_pool_mgr method
- Maintain same logic flow and variable dependencies

Co-Authored-By: H <[email protected]>
- add_actor_rollout_worker no longer returns CriticWorker
- add_critic_worker now imports CriticWorker based on config strategies
- Proper separation of concerns between actor and critic worker setup

Co-Authored-By: H <[email protected]>
…meters

- Add self.mapping initialization in __init__ with documentation
- Remove mapping parameter from add_reward_model_worker and add_ref_policy_worker
- Update method calls to use self.mapping instead of passing mapping argument
- Improve encapsulation by making mapping an instance variable

Co-Authored-By: H <[email protected]>
- Fix F821 undefined name 'ref_policy' error in add_ref_policy_worker method
- Change ref_policy to ref_policy_cls to match the parameter name
- Resolves pre-commit (3.12) CI failure

Co-Authored-By: H <[email protected]>
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request refactors the TaskRunner class in verl/trainer/main_ppo.py to improve modularity by extracting worker setup logic into separate methods. The changes are well-structured and enhance code organization. However, I've identified a critical bug in the init_resource_pool_mgr method where self.mapping is incorrectly re-assigned, which would discard previously set mappings for reward and reference policy workers, leading to a KeyError during runtime. A fix is provided to address this issue.
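The failure mode described in this review can be reproduced in a reduced form. The sketch below is hypothetical: the role and pool names, the call order, and the `pool_for` helper are assumptions, and the in-place `update` is one way to avoid the problem, not necessarily the fix the reviewer proposed.

```python
class BuggyRunner:
    """Hypothetical reduction of the issue; not the actual verl code."""

    def __init__(self):
        # Role name -> resource pool name.
        self.mapping = {}

    def add_reward_model_worker(self):
        self.mapping["reward_model"] = "global_pool"

    def add_ref_policy_worker(self):
        self.mapping["ref_policy"] = "global_pool"

    def init_resource_pool_mgr(self):
        # Bug: rebinding self.mapping discards the entries added earlier.
        self.mapping = {"actor_rollout": "global_pool", "critic": "global_pool"}
        return self.mapping


class FixedRunner(BuggyRunner):
    def init_resource_pool_mgr(self):
        # Update the existing dict in place so earlier entries survive.
        self.mapping.update({"actor_rollout": "global_pool", "critic": "global_pool"})
        return self.mapping


def pool_for(runner_cls, role):
    """Run the setup steps in the assumed order, then look up one role."""
    runner = runner_cls()
    runner.add_reward_model_worker()
    runner.add_ref_policy_worker()
    runner.init_resource_pool_mgr()
    return runner.mapping[role]  # raises KeyError if the entry was discarded
```

With `BuggyRunner`, `pool_for(BuggyRunner, "reward_model")` raises `KeyError`, while the same call with `FixedRunner` finds the entry. Calling `init_resource_pool_mgr()` before the conditional `add_*` methods would also avoid the loss, at the cost of making the call order significant.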

@vermouth1992 vermouth1992 merged commit 01b4a29 into volcengine:main Aug 8, 2025
35 checks passed
@eric-haibin-lin eric-haibin-lin deleted the devin/1754244044-refactor-taskrunner-methods branch August 8, 2025 16:24
yellowbee686 pushed a commit to yellowbee686/verl that referenced this pull request Aug 11, 2025
…#2885)

### What does this PR do?

- Added `__init__()` method to initialize `self.role_worker_mapping = {}`
- Extracted worker setup logic into dedicated methods:
  - `add_actor_rollout_worker()` - handles strategy-specific worker imports and setup (lines 130-153)
  - `add_critic_worker()` - sets up critic worker role mapping (lines 170-176)
  - `init_resource_pool_mgr()` - creates resource pool specifications (lines 178-187)
  - `add_reward_model_worker()` - conditionally adds reward model workers (lines 195-203)
  - `add_ref_policy_worker()` - conditionally adds reference policy workers (lines 205-208)

### Checklist Before Starting

- [x] Search for similar PRs. Paste at least one query link here: ...
- [x] Format the PR title as `[{modules}] {type}: {description}` (This will be checked by the CI)
  - `{modules}` include `fsdp`, `megatron`, `sglang`, `vllm`, `rollout`, `trainer`, `ci`, `training_utils`, `recipe`, `hardware`, `deployment`, `ray`, `worker`, `single_controller`, `misc`, `perf`, `model`, `algo`, `env`, `tool`, `ckpt`, `doc`, `data`
  - If this PR involves multiple modules, separate them with `,` like `[megatron, fsdp, doc]`
  - `{type}` is in `feat`, `fix`, `refactor`, `chore`, `test`
  - If this PR breaks any API (CLI arguments, config, function signature, etc.), add `[BREAKING]` to the beginning of the title.
  - Example: `[BREAKING][fsdp, megatron] feat: dynamic batching`

### Test

relying on existing unit tests


### Checklist Before Submitting

> [!IMPORTANT]
> Please check all the following items before requesting a review,
otherwise the reviewer might deprioritize this PR for review.

- [x] Read the [Contribute
Guide](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md).
- [ ] Apply [pre-commit
checks](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md#code-linting-and-formatting):
`pre-commit install && pre-commit run --all-files --show-diff-on-failure
--color=always`
- [ ] Add / Update [the
documentation](https://github.com/volcengine/verl/tree/main/docs).
- [ ] Add unit or end-to-end test(s) to [the CI
workflow](https://github.com/volcengine/verl/tree/main/.github/workflows)
to cover all the code. If not feasible, explain why: ...
- [ ] Once your PR is ready for CI, send a message in [the `ci-request`
channel](https://verl-project.slack.com/archives/C091TCESWB1) in [the
`verl` Slack
workspace](https://join.slack.com/t/verl-project/shared_invite/zt-3855yhg8g-CTkqXu~hKojPCmo7k_yXTQ).
(If not accessible, please try [the Feishu group
(飞书群)](https://applink.larkoffice.com/client/chat/chatter/add_by_link?link_token=772jd4f1-cd91-441e-a820-498c6614126a).)

---------

Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
ChangyiYang pushed a commit to SwordFaith/verl that referenced this pull request Aug 16, 2025
…#2885)

whatadayG pushed a commit to whatadayG/verl that referenced this pull request Sep 5, 2025
…#2885)

WncFht pushed a commit to WncFht/verl that referenced this pull request Oct 10, 2025
…#2885)
