Commit fc489db

[rollout] fix: add batch_data_id default value check in AsyncRolloutRequest (#3657)
### What does this PR do?

This PR improves the robustness of the `initialize_request` method in `verl/workers/rollout/schemas.py`. When `input_ids` exceed `max_prompt_len`, if the `batch_data_id` field is missing from `values`, it will be automatically populated with the default value. This prevents errors during logging and enhances fault tolerance in data processing, making future extension and troubleshooting more convenient.

### Checklist Before Starting

- [ ] Search for similar PRs. Paste at least one query link here: ...
- [ ] Format the PR title as `[{modules}] {type}: {description}` (This will be checked by the CI)
  - `{modules}` include `fsdp`, `megatron`, `sglang`, `vllm`, `rollout`, `trainer`, `ci`, `training_utils`, `recipe`, `hardware`, `deployment`, `ray`, `worker`, `single_controller`, `misc`, `perf`, `model`, `algo`, `env`, `tool`, `ckpt`, `doc`, `data`
  - If this PR involves multiple modules, separate them with `,` like `[megatron, fsdp, doc]`
  - `{type}` is in `feat`, `fix`, `refactor`, `chore`, `test`
  - If this PR breaks any API (CLI arguments, config, function signature, etc.), add `[BREAKING]` to the beginning of the title.
  - Example: `[BREAKING][fsdp, megatron] feat: dynamic batching`

### Test

> For changes that can not be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results like training curve plots, evaluation results, etc.

### Design & Code Changes

> Demonstrate the high-level design if this PR is complex, and list the specific changes.

### Checklist Before Submitting

> [!IMPORTANT]
> Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

- [ ] Read the [Contribute Guide](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md).
- [ ] Apply [pre-commit checks](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md#code-linting-and-formatting): `pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always`
- [ ] Add / Update [the documentation](https://github.com/volcengine/verl/tree/main/docs).
- [ ] Add unit or end-to-end test(s) to [the CI workflow](https://github.com/volcengine/verl/tree/main/.github/workflows) to cover all the code. If not feasible, explain why: ...
- [ ] Once your PR is ready for CI, send a message in [the `ci-request` channel](https://verl-project.slack.com/archives/C091TCESWB1) in [the `verl` Slack workspace](https://join.slack.com/t/verl-project/shared_invite/zt-3855yhg8g-CTkqXu~hKojPCmo7k_yXTQ). (If not accessible, please try [the Feishu group (飞书群)](https://applink.larkoffice.com/client/chat/chatter/add_by_link?link_token=772jd4f1-cd91-441e-a820-498c6614126a).)

Co-authored-by: yaopandeng <[email protected]>
1 parent d45d049 commit fc489db

1 file changed: +3 -0 lines changed
verl/workers/rollout/schemas.py

Lines changed: 3 additions & 0 deletions
@@ -180,6 +180,9 @@ def initialize_request(cls, values):
         if values["input_ids"].shape[-1] > max_prompt_len:
             # Only log the warning to avoid truncating in the middle of generation prompt. Consider raising an
             # error for this case in the future.
+            # Ensure batch_data_id exists with default value if not provided
+            if 'batch_data_id' not in values:
+                values['batch_data_id'] = cls.model_fields['batch_data_id'].default
             logger.warning(
                 f"Prompt {values['batch_data_id']} has length {values['input_ids'].shape[-1]} "
                 f"which is greater than max_prompt_len {max_prompt_len} after applied chat template with tools."
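
The guard added in this hunk can be illustrated with a minimal, standard-library-only sketch. It is not the verl implementation: `RequestSketch`, `field_default`, and `warn_if_too_long` are hypothetical names, the default of `0` is assumed for illustration, and `dataclasses.fields` stands in for the `cls.model_fields[...]` lookup used in the commit. The point is only that the warning's f-string indexes `values['batch_data_id']`, so a missing key would raise `KeyError` unless the default is filled in first.

```python
import logging
from dataclasses import dataclass, fields

logger = logging.getLogger(__name__)


@dataclass
class RequestSketch:
    """Hypothetical stand-in for AsyncRolloutRequest; not the verl class."""

    input_ids: list
    batch_data_id: int = 0  # assumed default, chosen only for this example


def field_default(cls, name):
    """Return the declared default of a dataclass field (stand-in for model_fields)."""
    for f in fields(cls):
        if f.name == name:
            return f.default
    raise KeyError(name)


def warn_if_too_long(values: dict, max_prompt_len: int) -> dict:
    """Log a warning for over-long prompts, filling batch_data_id if it is absent."""
    if len(values["input_ids"]) > max_prompt_len:
        # Without this guard, the f-string below raises KeyError when the
        # caller did not supply batch_data_id.
        if "batch_data_id" not in values:
            values["batch_data_id"] = field_default(RequestSketch, "batch_data_id")
        logger.warning(
            f"Prompt {values['batch_data_id']} has length {len(values['input_ids'])} "
            f"which is greater than max_prompt_len {max_prompt_len}."
        )
    return values
```

In this sketch, as in the diff, the default is only filled in on the over-length path, so a caller-supplied `batch_data_id` is never overwritten and short prompts leave `values` untouched.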
