@PanAndy (Collaborator) commented Sep 24, 2025

(feat): refine req.
(fix): fix sglang logprobs.
(docs): update readme AIGB-Pearl.
(feat): support wan2_2 reward fl pipeline.
(docs): add docs.
fix: fix vllm version compare
(refactor): delete webshop async yaml.
(fix): fix webshop state bug.
(feat): log by traj.
(feat): add env_step_limiter for create env.
(feat): env_worker initialize.
(refactor): refine action pattern.
(refactor): refactor agentic modules.
(refactor): refine env_manager.
(refactor): adjust env.
(feat): add step reinforce.
(fix)Fixed the issue where the distill_on_prompt parameter did not …
(fix) pass both custom and vllm env vars to RayWorkerWrapper
fix: qwen3next save ckpt
(feat): group size redundancy.
(fix) fix vllm cache root interference
feat(models): add qwen3 next model implementation
(feat): tir qa + search and math + python.
(fix): fix stop_strings type.
(feat): add compute_conversation_end_token_id.
(fix): fix dataset load lock error.
(fix): aggregate_metrics value.
(fix): fix math_env exception.
fix issue that ROLL may hang in colocate mode when running on PPU.
Fix typo
(feat) Dockerfile torch280.
(feat) vllm 0.10.2 (qwen3-next).
(feat): support sglang 052.
(feat): update convert script.
(feat): refine entropy compute.
(feat): roll debug flag for gpu memory metrics.
(fix): add transformers version check.
(feat): update mcore 0.13.
(deprecate): offline torch251/vllm073/sglang043.
(fix): fix include_stop_str_in_output.
(chore): update to pytorch260 and fix norm_mean_type in yaml.
(feat): support sglang 0.4.10.post2.
(feat): add stop string & set env_manager skip_special_tokens=False.
feat: support use_remove_padding for megatron strategy to trim tailin…
(fix): fix adjust_batch.
fix: incorrectly handled dim=None, breaking torch autograd backward p…
(feat): add env tool wrapper.
(feat): support vllm dynamic fp8.
(feat): lite_ppo add div_std_type.
(fix): set loss_agg_mode to seq-mean-token-mean.
(fix): clean env.
feat: add sft pipeline.
(chore): set gem version.
[perf]: llm judge reward worker Strategy HF -> vllm.
(refactor): refactor env manager to gEm.
(fix): fix is_use_additional_prompts name for val.
(feat): refine dataset for rlvr_vlm_pipeline.
refactor: add is_lora param to broadcast_parameter method.
fix: convert to hf.
(fix): add is_lora param to broadcast_parameter method.

chocoded and others added 30 commits September 24, 2025 16:49
@PanAndy force-pushed the sync/sync_to_github_0924 branch from 1ee16b3 to 58d97ca on September 25, 2025 03:56
@PanAndy merged commit d6ef293 into main on Sep 25, 2025
6 checks passed
@PanAndy deleted the sync/sync_to_github_0924 branch on September 25, 2025 04:04