Skip to content

Conversation

@ISEEKYAN
Copy link
Collaborator

@ISEEKYAN ISEEKYAN commented Aug 6, 2025

What does this PR do?

update example 671B script, no offline dist-ckpt needed any more

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates an example script for training a 671B parameter model, removing the need for an offline distributed checkpoint. The changes simplify the setup process by leveraging mbridge to initialize the model directly from Hugging Face checkpoints. My review focuses on improving the script's portability and clarity. I've identified a hardcoded path that breaks portability and a redundant calculation that adds confusion. Addressing these points will make the example script more robust and easier to use for the community.

max_prompt_length=2048
max_response_length=4096
use_dynamic_bsz=True
actor_ppo_max_token_len=$(((max_prompt_length + max_response_length) * 10 / 10))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The calculation * 10 / 10 is redundant as it's equivalent to multiplying by 1. This is confusing and should be removed for clarity. If a different ratio was intended, please correct it.

Suggested change
actor_ppo_max_token_len=$(((max_prompt_length + max_response_length) * 10 / 10))
actor_ppo_max_token_len=$((max_prompt_length + max_response_length))

@ETOgaosion ETOgaosion merged commit aebc51a into volcengine:main Aug 6, 2025
4 checks passed
HaeChan0305 pushed a commit to HaeChan0305/MLILAB-GRPO that referenced this pull request Aug 8, 2025
…eded any more (volcengine#2945)

### What does this PR do?

update example 671B script, no offline dist-ckpt needed any more
yellowbee686 pushed a commit to yellowbee686/verl that referenced this pull request Aug 11, 2025
…eded any more (volcengine#2945)

### What does this PR do?

update example 671B script, no offline dist-ckpt needed any more
whatadayG pushed a commit to whatadayG/verl that referenced this pull request Sep 5, 2025
…eded any more (volcengine#2945)

### What does this PR do?

update example 671B script, no offline dist-ckpt needed any more
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants