(sync): sync to GitHub 0924. #172
Merged
+14,294
−5,746
Conversation
force-pushed from 1ee16b3 to 58d97ca
(feat): refine req.
(fix): fix sglang logprobs.
(docs): update readme AIGB-Pearl.
(feat): support wan2_2 reward fl pipeline.
(docs): add docs.
fix: fix vllm version compare
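The vLLM version-compare fix above points at a common pitfall: comparing version strings lexicographically, where `"0.10.2" < "0.9.0"` because `'1' < '9'` character-wise. A minimal illustration of the correct tuple-based comparison (a generic sketch, not the actual ROLL fix):

```python
def parse_version(v: str) -> tuple:
    """Parse a dotted numeric version string into an int tuple so that
    comparison is numeric per segment, not character-by-character.
    Handles plain numeric segments only (sufficient for this sketch)."""
    return tuple(int(part) for part in v.split(".") if part.isdigit())

# Lexicographic string comparison orders these wrongly ('1' < '9'):
assert ("0.10.2" > "0.9.0") is False
# Tuple comparison gets it right:
assert parse_version("0.10.2") > parse_version("0.9.0")
```

In real code, `packaging.version.Version` is the standard way to do this (it also handles suffixes like `.post2`); the tuple parse above only shows why the string compare breaks.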
(refactor): delete webshop async yaml.
(fix): fix webshop state bug.
(feat): log by traj.
(feat): add env_step_limiter for create env.
(feat): env_worker initialize.
(refactor): refine action pattern.
(refactor): refactor agentic modules.
(refactor): refine env_manager.
(refactor): adjust env.
(feat): add step reinforce.
(fix) Fixed the issue where the distill_on_prompt parameter did not work and incorrect logits shape under `megatron_strategy`.
(fix) pass both custom and vllm env vars to RayWorkerWrapper
fix: qwen3next save ckpt
(feat): group size redundancy.
(fix) fix vllm cache root interference
feat(models): add qwen3 next model implementation
(feat): tir qa + search and math + python.
(fix): fix stop_strings type.
(feat): add compute_conversation_end_token_id.
(fix): fix dataset load lock error.
(fix): aggregate_metrics value.
(fix): fix math_env exception.
fix issue that ROLL may hang in colocate mode when running on PPU.
Fix typo
(feat) Dockerfile torch280.
(feat) vllm 0.10.2 (qwen3-next).
(feat): support sglang 052.
(feat): update convert script.
(feat): refine entropy compute.
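The commit above refines entropy computation but does not show its code. As a generic illustration (an assumption, not ROLL's implementation), policy entropy over a token distribution is typically computed from raw logits in a numerically stable way via log-sum-exp, using the identity H = lse(x) − Σ softmax(x)·x:

```python
import math

def entropy_from_logits(logits):
    """Numerically stable entropy of a categorical distribution given raw
    logits. Subtracting the max before exponentiating avoids overflow."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))  # log-sum-exp
    probs = [math.exp(x - lse) for x in logits]               # softmax
    # H = -sum p*log p, with log p = x - lse  =>  H = lse - sum p*x
    return lse - sum(p * x for p, x in zip(probs, logits))

# Uniform logits over 4 classes give entropy ln(4) ~= 1.3863
print(entropy_from_logits([0.0, 0.0, 0.0, 0.0]))
```

The same identity applies per token position when the logits are tensors; frameworks batch it with `logsumexp` and `softmax` over the vocabulary dimension.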
(feat): roll debug flag for gpu memory metrics.
(fix): add transformers version check.
(feat): update mcore 0.13.
(deprecate): offline torch251/vllm073/sglang043.
(fix): fix include_stop_str_in_output.
(chore): update to pytorch260 and fix norm_mean_type in yaml.
(feat): support sglang 0.4.10.post2.
(feat): add stop string & set env_manager skip_special_tokens=False.
feat: support use_remove_padding for megatron strategy to trim tailin…
(fix): fix adjust_batch.
fix: incorrectly handled dim=None, breaking torch autograd backward pass and causing a hang during token mean.
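The `dim=None` fix above concerns a classic reduction pitfall: in PyTorch, `dim=None` means "reduce over all elements", not "reduce over dim 0", and conflating the two silently changes results on 2-D inputs. A pure-Python sketch of the distinction on nested lists (names and signature are illustrative, not ROLL's API):

```python
def masked_mean(values, mask, dim=None):
    """Mean of `values` where `mask` is 1, over 2-D nested lists.

    dim=None follows the PyTorch convention: reduce over ALL elements.
    dim=0 reduces over rows, returning one mean per column."""
    if dim is None:
        flat_v = [v for row in values for v in row]
        flat_m = [m for row in mask for m in row]
        return sum(v * m for v, m in zip(flat_v, flat_m)) / sum(flat_m)
    if dim == 0:
        n_rows, n_cols = len(values), len(values[0])
        return [
            sum(values[r][c] * mask[r][c] for r in range(n_rows))
            / sum(mask[r][c] for r in range(n_rows))
            for c in range(n_cols)
        ]
    raise ValueError("sketch only supports dim in (None, 0)")

batch, ones = [[1.0, 2.0], [3.0, 4.0]], [[1, 1], [1, 1]]
print(masked_mean(batch, ones))          # 2.5  (mean of all four values)
print(masked_mean(batch, ones, dim=0))   # [2.0, 3.0]  (per-column means)
```

Treating `None` as `0` here would return `[2.0, 3.0]` where a scalar `2.5` is expected, which is exactly the kind of shape mismatch that can break a downstream autograd graph.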
(feat): add env tool wrapper.
(feat): support vllm dynamic fp8.
(feat): lite_ppo add div_std_type.
(fix): set loss_agg_mode to seq-mean-token-mean.
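The `loss_agg_mode` change above distinguishes two standard ways of aggregating per-token losses in RL fine-tuning: pooling all tokens together (long sequences dominate) versus averaging within each sequence first so every sequence weighs equally. A minimal sketch of the two modes (illustrative only; the names match the commit's terminology, not necessarily ROLL's internals):

```python
def aggregate_loss(token_losses, mode="seq-mean-token-mean"):
    """Aggregate per-token losses for a batch of variable-length sequences.

    token_losses: list of lists, one inner list of token losses per sequence.
    "token-mean": mean over all tokens pooled together.
    "seq-mean-token-mean": per-sequence token mean, then mean across sequences."""
    if mode == "token-mean":
        flat = [t for seq in token_losses for t in seq]
        return sum(flat) / len(flat)
    if mode == "seq-mean-token-mean":
        per_seq = [sum(seq) / len(seq) for seq in token_losses]
        return sum(per_seq) / len(per_seq)
    raise ValueError(f"unknown mode: {mode}")

batch = [[1.0, 1.0, 1.0, 1.0], [3.0]]                # one long, one short sequence
print(aggregate_loss(batch, "token-mean"))           # (4*1.0 + 3.0) / 5 = 1.4
print(aggregate_loss(batch, "seq-mean-token-mean"))  # (1.0 + 3.0) / 2 = 2.0
```

The example shows why the mode matters: under `token-mean` the short high-loss sequence is diluted by the long one, while `seq-mean-token-mean` gives both sequences equal weight.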
(fix): clean env.
feat: add sft pipeline.
(chore): set gem version.
[perf]: llm judge reward worker Strategy HF -> vllm.
(refactor): refactor env manager to gEm.
(fix): fix is_use_additional_prompts name for val.
(feat): refine dataset for rlvr_vlm_pipeline.
refactor: add is_lora param to broadcast_parameter method.
fix: convert to hf.
(fix): add is_lora param to broadcast_parameter method.