[rollout] fix: qwen2_vl position_ids shape mismatch #3653

m-Just · 2025-09-30T17:01:48Z

What does this PR do?

Fix the qwen2_vl position_ids shape mismatch: verl/models/transformers/qwen2_vl.py:process_position_ids expects position_ids to have shape (4, batch_size, seq_length), but verl/experimental/agent_loop/agent_loop.py:generate_sequences returns (batch_size, 3, seq_length), which is transposed to (3, batch_size, seq_length) and therefore lacks the text dimension. This PR follows the relevant code in verl/utils/dataset/rl_dataset.py to fix the issue.
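For readers unfamiliar with the layout, the shape change can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the helper name `add_text_position_ids`, the toy tensors, and the choice to derive text positions from the attention mask (with padding left at a placeholder value of 1) are all assumptions made for the example.

```python
import torch


def add_text_position_ids(vision_position_ids: torch.Tensor,
                          attention_mask: torch.Tensor) -> torch.Tensor:
    """Prepend a text row to (3, seq_length) position ids, giving (4, seq_length).

    Hypothetical helper for illustration only; it is not a verl function.
    """
    seq_length = attention_mask.shape[-1]
    valid = attention_mask.bool()
    # Assumption: padding keeps a placeholder value of 1, valid tokens count from 0.
    text_position_ids = torch.ones((1, seq_length), dtype=torch.long)
    text_position_ids[0, valid] = torch.arange(int(valid.sum()))
    return torch.cat((text_position_ids, vision_position_ids), dim=0)


# Toy batch: 2 samples of length 5; the first sample is left-padded by 2 tokens.
attention_mask = torch.tensor([[0, 0, 1, 1, 1], [1, 1, 1, 1, 1]])
vision = torch.zeros((2, 3, 5), dtype=torch.long)  # stand-in for real 3-row ids

per_sample = [add_text_position_ids(v, m) for v, m in zip(vision, attention_mask)]
position_ids = torch.stack(per_sample)      # (batch_size, 4, seq_length)
model_input = position_ids.transpose(0, 1)  # (4, batch_size, seq_length)
```

With the extra text row in place, the transposed tensor has the leading dimension of 4 that process_position_ids is described as expecting.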

Checklist Before Starting

- [x] Search for similar PRs. Paste at least one query link here: ...
- [x] Format the PR title as [{modules}] {type}: {description} (This will be checked by the CI)
  - {modules} include fsdp, megatron, sglang, vllm, rollout, trainer, ci, training_utils, recipe, hardware, deployment, ray, worker, single_controller, misc, perf, model, algo, env, tool, ckpt, doc, data
  - If this PR involves multiple modules, separate them with , like [megatron, fsdp, doc]
  - {type} is in feat, fix, refactor, chore, test
  - If this PR breaks any API (CLI arguments, config, function signature, etc.), add [BREAKING] to the beginning of the title.
  - Example: [BREAKING][fsdp, megatron] feat: dynamic batching

Test

For changes that can not be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results like training curve plots, evaluation results, etc.

API and Usage Example

Demonstrate how the API changes if any, and provide usage example(s) if possible.

# Add code snippet or script demonstrating how to use this

Design & Code Changes

Demonstrate the high-level design if this PR is complex, and list the specific changes.

Checklist Before Submitting

Important

Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

- [x] Read the Contribute Guide.
- [x] Apply pre-commit checks: pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always
- [x] Add / Update the documentation. Not applicable.
- [ ] Add unit or end-to-end test(s) to the CI workflow to cover all the code. If not feasible, explain why: ...
- [ ] Once your PR is ready for CI, send a message in the ci-request channel in the verl Slack workspace. (If not accessible, please try the Feishu group (飞书群).)

gemini-code-assist

Code Review

This pull request aims to fix a shape mismatch for position_ids in qwen2_vl. While it correctly identifies that an additional dimension for text position IDs is needed, the resulting tensor shape is still incorrect for the model's expectations, which would lead to a runtime error. I've provided a critical comment with a suggested fix that corrects the tensor dimensions. This fix also requires a small change in the _postprocess method, which I've detailed in the comment.

verl/experimental/agent_loop/agent_loop.py

hiyouga

LGTM

huaiyizhao · 2025-10-10T03:29:36Z

It seems the "position_ids shape mismatch" problem is still there in training. #3647 (comment)

zhanxuejie · 2025-10-16T00:54:57Z

When training with GRPO on the Geo3K dataset using qwen2.5-vl, I still encounter this problem: "position_ids should be a 3D tensor with shape (4, batch_size, seq_length)".

zhanxuejie · 2025-10-16T00:58:18Z

> It seems the "position_ids shape mismatch" problem is still there in training. #3647 (comment)

Yes, the same error occurred during my training, and it seems that this issue has not been fixed. Have you solved this error?


m-Just · 2025-10-31T13:29:21Z

Hi, for those who find that the error persists, please note that this PR only fixes the shape mismatch issue for agent_loop. If you are using other kinds of rollout, then you probably should look for another fix.
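For anyone trying to work out which code path still produces the wrong shape, a small check placed just before the model call can help. The following is only a sketch: `check_mrope_position_ids` is a hypothetical helper, not part of verl, and where to call it depends on your rollout setup.

```python
import torch


def check_mrope_position_ids(position_ids: torch.Tensor,
                             batch_size: int,
                             seq_length: int) -> None:
    """Raise ValueError naming the actual shape if it is not (4, batch_size, seq_length).

    Hypothetical debugging helper; not a verl API.
    """
    expected = (4, batch_size, seq_length)
    actual = tuple(position_ids.shape)
    if actual != expected:
        raise ValueError(f"position_ids has shape {actual}, expected {expected}")


# Example: a tensor missing the text row is rejected with a readable message.
try:
    check_mrope_position_ids(torch.zeros((3, 2, 5), dtype=torch.long), 2, 5)
except ValueError as err:
    message = str(err)
```

Calling the check at the boundary of each rollout path narrows down where a (3, batch_size, seq_length) tensor is still being produced.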


fix qwen2_vl position_ids bug

949ae4e

gemini-code-assist bot reviewed Sep 30, 2025

verl/experimental/agent_loop/agent_loop.py (review thread resolved)

0001Henry mentioned this pull request Oct 1, 2025 in "Cannot run geo3k multiturn example" #3647 (open, 4 tasks)

hiyouga approved these changes Oct 5, 2025

hiyouga merged commit 327e813 into volcengine:main Oct 5, 2025
36 of 65 checks passed
