Skip to content

Conversation

@frrad
Copy link
Contributor

@frrad frrad commented Jul 18, 2025

What does this PR do?

Adds mypy typechecking to pre-commit. Only one module is whitelisted and it is configured to be very permissive.

Test

this is a CI / testing change

API and Usage Example

Demonstrate how the API changes if any, and provide usage example(s) if possible.

# Add code snippet or script demonstrating how to use this

Design & Code Changes

Demonstrate the high-level design if this PR is complex, and list the specific changes.

Checklist Before Submitting

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces mypy for static type checking, initially enabling it for the verl.trainer.ppo.core_algos module. The changes include adding the mypy pre-commit hook, configuring it in pyproject.toml, and fixing type errors in the target module. My review focuses on a critical issue with the pre-commit configuration that will break the CI, and a high-severity issue regarding the use of typing.Any which undermines the goal of adding static type analysis. Addressing these will ensure the CI works as expected and improve the long-term maintainability of the typed code.

@zhaochenyang20
Copy link
Collaborator

@frrad CI is good. I can help to merge if haibin approve.

@frrad
Copy link
Contributor Author

frrad commented Jul 23, 2025

copying comment here from Lark

@eric-haibin-lin

i wonder why the type is corrected to "Any". in fact we pass DictConfig or AlgoConfig.
Does the mypy make the correction itself, or the change has to be done manually?

@frrad

I made that change manually. the AlgoConfig type is incorrect - none of the fields referenced on L939

kl_cov_ratio = config.policy_loss.kl_cov_ratio if config.policy_loss.kl_cov_ratio is not None else 0.0002
actually exist on the datatype
gamma: float = 1.0
lam: float = 1.0
adv_estimator: str = "gae"
norm_adv_by_std_in_grpo: bool = True
use_kl_in_reward: bool = False
kl_penalty: str = "kl"
kl_ctrl: KLControlConfig = field(default_factory=KLControlConfig)
use_pf_ppo: bool = False
pf_ppo: Optional[PFPPOConfig] = None
filter_groups: Optional[FilterGroupsConfig] = None

I was not sure what the real type was so I changed it to Any. I can change it to DictConfig instead if you prefer.

@eric-haibin-lin
Copy link
Collaborator

Let's change it to Optional AlgoConfig|DictConfig]

@frrad
Copy link
Contributor Author

frrad commented Jul 23, 2025

please take another look

@frrad
Copy link
Contributor Author

frrad commented Jul 23, 2025

In cc2972c I have added an additional path to the mypy whitelist which make it necessary to change AlgoConfig to add the fields referenced from core_algos. Happy to revert that commit to keep the scope of this first PR as small as possible if you want.

edit: originally I had linked the wrong commit in this message

@zhaochenyang20
Copy link
Collaborator

I think this PR is almost done. Let me help on the merge! @frrad

@eric-haibin-lin eric-haibin-lin self-assigned this Jul 24, 2025
policy_loss: Optional[PolicyLossConfig] = field(default_factory=PolicyLossConfig)
clip_ratio_high: float = 0.2
clip_ratio_low: float = 0.2
clip_ratio: float = 0.2
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two issues. One is that verl already has actor_rollout_ref.actor.policy_loss and actor_rollout_ref.actor.clip_ratio_high/low, duplicating them without changing existing example script would be confusing.
Also restructuring algo config & actor config is not a high priority for now and moving them around and breaking others' recipes may lead to bad experience. If we do want to make breaking changes, we usually make announcement for deprecations, keep it for one version before breaking it..

I do not think it's necessary to couple mypy setup with breaking changes ..

@frrad
Copy link
Contributor Author

frrad commented Jul 24, 2025

I have reverted that commit.

@frrad
Copy link
Contributor Author

frrad commented Jul 24, 2025

this is probably a better fix. please take another look.

@zhaochenyang20
Copy link
Collaborator

@frrad Let me merge it after the CI

@zhaochenyang20 zhaochenyang20 merged commit f407887 into volcengine:main Jul 25, 2025
52 of 53 checks passed
oseyosey pushed a commit to oseyosey/verl that referenced this pull request Jul 28, 2025
CurryRice233 added a commit to CurryRice233/verl that referenced this pull request Jul 28, 2025
* origin/mindspeed: (39 commits)
  [perf] feat: add optional role selection in discrete mode for NPU Profiler (volcengine#2750)
  [rollout] feat: remove chat scheduler (volcengine#2725)
  [trainer] refactor: Make sure to keep the type checking (volcengine#2634)
  [doc] style: change resize handle from gradient to plain color (volcengine#2746)
  [CI] feat: add `mypy` to pre-commit (volcengine#2614)
  [megatron] feat: a bunch of optimzation on vram, sequence packing (volcengine#2678)
  [docker] feat: upgrade to torch 2.7, sglang 0.4.8 (volcengine#2617)
  [doc] feat: add resizable sidebar and improve layout (volcengine#2577)
  [ci] fix: release ascend test time, fix one step off-policy CI (volcengine#2731)
  [recipe] chore: add retool training script (volcengine#2732)
  [ci] fix: checkpoint_convertor ci miss a hf model download (volcengine#2730)
  [doc] feat: Add agent-lightning in the list of "awesome works using verl (volcengine#2726)
  [tool] fix: geo3k create return str instead of tuple (volcengine#2714)
  [megatron] fix: resolve backward propagation error in megatron_actor due to shared logits tensor in-place modification (volcengine#2484)
  [misc] chore: bump main branch version to v0.5.0.dev (volcengine#2718)
  [sglang] fix: Adding strict naming sanity for sglang (volcengine#2719)
  [ray] feat: RayWorkerGroup support set worker env (volcengine#2685)
  [ci] test: add CriticWorker unit test, make some util CPU friendly (volcengine#2717)
  [cfg] refactor: add ActorConfig, EngineConfig, and ActorWorker unit test, refactor validation code (volcengine#2621)
  [misc] chore: bump version to v0.5.0 (volcengine#2716)
  ...
SumanthRH pushed a commit to SumanthRH/verl that referenced this pull request Jul 29, 2025
yellowbee686 pushed a commit to yellowbee686/verl that referenced this pull request Jul 31, 2025
Juniper1021 pushed a commit to Juniper1021/verl that referenced this pull request Aug 7, 2025
whatadayG pushed a commit to whatadayG/verl that referenced this pull request Sep 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants