Commit 1e41334
[recipe] feat: Add InfiGUI-G1 recipe for MLLM GUI grounding (#3242)
### What does this PR do?
This PR introduces a new recipe, `infigui-g1`, for training Multimodal
Large Language Models (MLLMs) in GUI grounding tasks. This recipe
implements a reinforcement learning approach that significantly improves
the model's ability to understand and interact with graphical user
interfaces.
### Checklist Before Starting
- [x] Search for similar PRs. Paste at least one query link here:
https://github.com/search?q=repo%3Avolcengine%2Fverl+gui&type=pullrequests
- [x] Format the PR title as `[{modules}] {type}: {description}` (This
will be checked by the CI)
- `{modules}` include `fsdp`, `megatron`, `sglang`, `vllm`, `rollout`,
`trainer`, `ci`, `training_utils`, `recipe`, `hardware`, `deployment`,
`ray`, `worker`, `single_controller`, `misc`, `perf`, `model`, `algo`,
`env`, `tool`, `ckpt`, `doc`, `data`
- If this PR involves multiple modules, separate them with `,` like
`[megatron, fsdp, doc]`
- `{type}` is in `feat`, `fix`, `refactor`, `chore`, `test`
- If this PR breaks any API (CLI arguments, config, function signature,
etc.), add `[BREAKING]` to the beginning of the title.
- Example: `[BREAKING][fsdp, megatron] feat: dynamic batching`
### Test
The effectiveness of this recipe has been validated through experiments.
Key results are as follows:
- The training curves for reward, validation accuracy, and exploration
success rate all show an upward trend.
- After 156 steps of training on sample data, the 3B model achieves a
score of **41.2** on the `screenspot-pro` benchmark, a 23.0-point
improvement over the base model's score of **18.2**.
<img width="345" height="291" alt="Screenshot 2025-08-27 172010"
src="https://github.com/user-attachments/assets/9ecd93d5-4f9b-4c40-831c-79a50fd197c4"
/>
<img width="347" height="292" alt="Screenshot 2025-08-27 171902"
src="https://github.com/user-attachments/assets/2e437c1f-9eb0-4106-a6c3-b22125026a79"
/>
<img width="346" height="293" alt="Screenshot 2025-08-27 171928"
src="https://github.com/user-attachments/assets/9c94515d-1501-40f4-979c-95e2f819dc62"
/>
### API and Usage Example
The recipe is self-contained and can be run using the provided scripts.
For example, to run training with the 3B parameter model:
```bash
# In verl path
bash recipe/infigui-g1/run_3b.sh
```
### Design & Code Changes
This PR adds a new, independent recipe located in `recipe/infigui-g1/`.
The changes are fully encapsulated within this directory and do not
affect any other part of the codebase.
The new files include:
- `recipe/infigui-g1/README.md`: An introduction to the recipe.
- `recipe/infigui-g1/run_3b.sh`, `run_7b.sh`: Scripts to launch
training.
- `recipe/infigui-g1/reward_fn.py`: Custom reward function
implementation.
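
The contents of `reward_fn.py` are not shown in this description. As a rough illustration only, a GUI grounding reward is often a point-in-box check: the response earns a reward when its predicted click point falls inside the target element's bounding box. The sketch below assumes a hypothetical `(x, y)` output format and a hypothetical `(x1, y1, x2, y2)` box layout; the names `parse_point` and `point_in_box_reward` are illustrative and are not taken from the recipe.

```python
import re
from typing import Optional, Sequence, Tuple

# Assumption: the model emits its click point as "(x, y)" somewhere in the response.
_POINT_RE = re.compile(r"\(\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*\)")


def parse_point(response: str) -> Optional[Tuple[float, float]]:
    """Return the last "(x, y)" pair found in the response, or None if absent."""
    matches = _POINT_RE.findall(response)
    if not matches:
        return None
    x, y = matches[-1]
    return float(x), float(y)


def point_in_box_reward(response: str, bbox: Sequence[float]) -> float:
    """Return 1.0 if the predicted point lies inside bbox, else 0.0.

    Assumption: bbox is (x1, y1, x2, y2) in the same coordinate space
    as the predicted point, with inclusive edges.
    """
    point = parse_point(response)
    if point is None:
        # An unparseable response earns no reward.
        return 0.0
    x, y = point
    x1, y1, x2, y2 = bbox
    return 1.0 if x1 <= x <= x2 and y1 <= y <= y2 else 0.0


if __name__ == "__main__":
    box = (100, 200, 180, 240)
    print(point_in_box_reward("Click at (120, 215)", box))  # 1.0
    print(point_in_box_reward("Click at (10, 10)", box))    # 0.0
    print(point_in_box_reward("no coordinates", box))       # 0.0
```

A binary reward like this is simple to verify, but it gives no signal for near misses. The recipe's actual reward may differ in both output format and scoring.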
### Checklist Before Submitting
> [!IMPORTANT]
> Please check all the following items before requesting a review,
otherwise the reviewer might deprioritize this PR for review.
- [x] Read the [Contribute
Guide](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md).
- [x] Apply [pre-commit
checks](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md#code-linting-and-formatting):
`pre-commit install && pre-commit run --all-files --show-diff-on-failure
--color=always`
- [ ] Add / Update [the
documentation](https://github.com/volcengine/verl/tree/main/docs).
- [ ] Add unit or end-to-end test(s) to [the CI
workflow](https://github.com/volcengine/verl/tree/main/.github/workflows)
to cover all the code. If not feasible, explain why: ...
- [ ] Once your PR is ready for CI, send a message in [the `ci-request`
channel](https://verl-project.slack.com/archives/C091TCESWB1) in [the
`verl` Slack
workspace](https://join.slack.com/t/verl-project/shared_invite/zt-3855yhg8g-CTkqXu~hKojPCmo7k_yXTQ).
(If not accessible, please try [the Feishu group
(飞书群)](https://applink.larkoffice.com/client/chat/chatter/add_by_link?link_token=772jd4f1-cd91-441e-a820-498c6614126a).)

1 parent 53b68c6 commit 1e41334
4 files changed: +554 -0 lines changed