Releases · hiyouga/EasyR1

18 Sep 09:25

hiyouga

v0.3.2: RL Baselines (Latest)

What's Changed

[misc] set dev version by @hiyouga in #372
fix typo in tensorboard logger by @ning-mz in #373
[utils] fix log probs by @hiyouga in #375
Supports video input by @robinjoe93 in #376
[data] fix video data by @hiyouga in #386
[data] add filter worker by @hiyouga in #387
[example] update qwen3_14b_dapo17k_dapo.sh by @Saigyouji-Yuyuko1000 in #389
[misc] fix ci by @hiyouga in #391
[ckpt] fix remove logic by @hiyouga in #393
[example] update config by @hiyouga in #394
[worker] fix typo in FSDPWorker by @jasper0314-huang in #396
[misc] fix flops counter by @hiyouga in #401
[readme] update wechat by @hiyouga in #404
[readme] add our work to the readme by @JingchengYang4 in #408
[example] change dapo verify by @Saigyouji-Yuyuko1000 in #407
[reward] fix dapo verifier by @hiyouga in #410
Update README.md to include Long-RL by @yukang2017 in #411
[worker] add dynamic batching by @hiyouga in #417
[readme] update wechat by @hiyouga in #418
[worker] fix dp tokens by @hiyouga in #419
[examples] fix config by @hiyouga in #420
[misc] fix ops by @hiyouga in #421
[worker] fix fsdp worker by @hiyouga in #422
[worker] fix grad norm by @hiyouga in #423
[data] better mm data collate by @hiyouga in #424
[trainer] support auto resume by @hiyouga in #425
[worker] add dynamic batching computational workload balance by @hiyouga in #426
[readme] update usage of apptainer by @yzoaim in #434
[readme] update wechat by @hiyouga in #442
[protocol] non blocking false by default by @hiyouga in #445
[readme] update wechat by @hiyouga in #447
[feat] support ray.timeline by @yzoaim in #449
[docker] upgrade vllm to 0.10 by @hiyouga in #453
[worker] fix multi modal data oom by @hiyouga in #454
[misc] fix data proto by @hiyouga in #458
[readme] update wechat by @hiyouga in #461
[trainer] fix checkpoint tracker by @hiyouga in #467
[patch] fix fa utils by @hiyouga in #472
[misc] fix fa patch by @hiyouga in #473
[misc] fix model merger by @hiyouga in #479
[misc] lint by @hiyouga in #480
Fix valset loading for videos by @zhuohaoyu in #482
[readme] update wechat by @hiyouga in #486
[bugfix] fix position ids for latest transformers by @hiyouga in #494
[readme] update wechat by @hiyouga in #495
[misc] pin transformers to 4.56.1 by @hiyouga in #496
[deps] upgrade transformers to 4.54 by @hiyouga in #501
[release] v0.3.2 by @hiyouga in #502

New Contributors

@ning-mz made their first contribution in #373
@robinjoe93 made their first contribution in #376
@jasper0314-huang made their first contribution in #396
@JingchengYang4 made their first contribution in #408
@yukang2017 made their first contribution in #411
@yzoaim made their first contribution in #434
@zhuohaoyu made their first contribution in #482

Full Changelog: v0.3.1...v0.3.2

Contributors

zhuohaoyu, hiyouga, and 7 other contributors


19 Jun 06:57

hiyouga

v0.3.1: Multi-modal DAPO

What's Changed

[example] fix runtime env by @hiyouga in #224
Update Awesome Work using EasyR1 by @RainBowLuoCS in #240
Update Awesome Work using EasyR1 by @xyliugo in #239
[trainer] support async reward by @hiyouga in #252
[readme] add baselines by @hiyouga in #253
[script] fix merge script by @hiyouga in #254
[misc] update baselines & docker image by @hiyouga in #256
[readme] update baseline by @hiyouga in #258
[data] support custom chat template by @hiyouga in #270
[reward] support batch reward by @hiyouga in #271
[example] change env vars by @hiyouga in #272
[Readme] Add awesome work using EasyR1 by @Wangbiao2 in #273
[model] add qwen3 support by @hiyouga in #276
[example] update script by @hiyouga in #277
[readme] update wechat by @hiyouga in #280
[misc] fix logger by @hiyouga in #288
[readme] update wechat by @hiyouga in #292
add get model from modelscope by @Saigyouji-Yuyuko1000 in #297
[readme] update wechat by @hiyouga in #301
Add a new work based on EasyR1 by @LiuRicky in #303
add new work based on EasyR1 by @waltonfuture in #313
[logger] fix tensorboard by @hiyouga in #316
Add a new work based on EasyR1 by @Gabesarch in #325
[misc] fix console hanging by @hiyouga in #293
[misc] several update by @hiyouga in #329
Update README.md by @CSfufu in #330
[perf] pass raw image data between workers by @tongxiao2002 in #318
[readme] add our work using EasyR1 by @kxfan2002 in #331
add our work using EasyR1 by @YutingLi0606 in #337
[data] fix position ids for qwen2vl mrope & add test by @hiyouga in #339
[worker] colocate actor and ref model by @hiyouga in #342
[trainer] save best checkpoint by @hiyouga in #343
[trainer] fix bug by @hiyouga in #344
[utils] update data protocol by @hiyouga in #345
[trainer] repeat rollout and prepare filter by @hiyouga in #346
[worker] expose rollout manager by @hiyouga in #347
[worker] fix vllm sharding manager by @hiyouga in #348
fix: bug by @gdw439 in #350
[trainer] fix progress bar by @hiyouga in #355
[readme] update docker image by @hiyouga in #357
[trainer] add online filtering by @Saigyouji-Yuyuko1000 in #358
[worker] update reward manager by @hiyouga in #360
Fix/vllm processor cache for text only model by @cyc00518 in #359
[breaking] support text-image mixed data by @hiyouga in #361
[model] fix qwen2vl bug by @hiyouga in #363
[tracking] add tensorboard exp name by @hiyouga in #365
[worker] do not load ref if kl is disabled by @hiyouga in #366
[worker] fix skip ref model by @hiyouga in #367
[examples] add qwen3_14b_dapo17k_dapo by @Saigyouji-Yuyuko1000 in #369
[release] 0.3.1 by @hiyouga in #370

New Contributors

@RainBowLuoCS made their first contribution in #240
@xyliugo made their first contribution in #239
@Saigyouji-Yuyuko1000 made their first contribution in #297
@waltonfuture made their first contribution in #313
@Gabesarch made their first contribution in #325
@CSfufu made their first contribution in #330
@tongxiao2002 made their first contribution in #318
@kxfan2002 made their first contribution in #331
@YutingLi0606 made their first contribution in #337
@gdw439 made their first contribution in #350
@cyc00518 made their first contribution in #359

Full Changelog: v0.3.0...v0.3.1

Contributors

Gabesarch, hiyouga, and 12 other contributors


15 Apr 11:40

hiyouga

v0.3.0: Initial release

What's Changed

update readme by @hiyouga in #4
[readme] update readme by @hiyouga in #5
[worker] fix small models by @hiyouga in #14
feat: swanlab examples by @Zeyi-Lin in #13
[example] add ReMax support by @Shenzhi-Wang in #20
fix:vllm length by @AL-377 in #18
fix: math reward fn by @yueyang130 in #26
[readme] update readme by @hiyouga in #29
Fix template issue by @wzq016 in #31
[example] fix length by @hiyouga in #32
[readme] update hardware requirement by @hiyouga in #33
[worker] fix model attn init by @hiyouga in #37
Witness the Aha Moment on Counting Task by @BUAADreamer in #38
[example] fix clevr example by @hiyouga in #47
Fix: save processor for VLMs by @wzq016 in #48
[perf] support padding-free training for VLMs by @hiyouga in #61
[readme] update readme by @hiyouga in #62
[readme] add fig explain by @hiyouga in #64
[readme] update fig by @hiyouga in #65
[trainer] support resume ckpt by @hiyouga in #66
[config] update default config by @hiyouga in #68
[readme] update wechat by @hiyouga in #71
[env] fix memory leak & enable vLLM v1 by @hiyouga in #73
[readme] update readme by @hiyouga in #75
[readme] update readme by @hiyouga in #80
Add new baseline GeoQA8k from R1V by @chenllliang in #86
[feat] support freeze vision tower by @hiyouga in #99
[config] increase prompt length by @hiyouga in #100
update readme - add ## Awesome Work using EasyR1 by @LengSicong in #101
Add the work Vision-R1 that uses EasyR1 by @Osilly in #102
fix:OOM by @dirtyDan0 in #111
[trainer] verify arg by @hiyouga in #112
[misc] sync feat from upstream by @hiyouga in #113
[misc] clean some code by @hiyouga in #114
[example] add examples by @hiyouga in #118
[checkpoint] fix load checkpoint by @hiyouga in #119
[trainer] gather metrics by @hiyouga in #120
[misc] add doc string by @hiyouga in #121
Add seg zero to README by @LiuRicky in #122
Update README.md by @PzySeere in #124
fix readme by @hiyouga in #127
[core] remove entropy loss by @hiyouga in #132
[trainer] support val sampling by @hiyouga in #133
misc: save at the last step by @dirtyDan0 in #138
feat: swanlab add easyr1 and verl config by @Zeyi-Lin in #140
[version] upgrade vllm to 0.8 by @hiyouga in #143
[readme] update docker file by @hiyouga in #146
[readme] update wechat by @hiyouga in #147
[readme] update dockerfile by @hiyouga in #148
Update requirements.txt for multinode by @chenllliang in #154
[trainer] support channel-wise reward by @hiyouga in #155
Update README.md by @PzySeere in #157
[trainer] support save limit & fix oom issue by @hiyouga in #158
[misc] update docker files by @hiyouga in #162
[trainer] support 32b by @hiyouga in #164
[data] use hf-native template by @hiyouga in #165
[misc] fix dataset by @hiyouga in #166
[readme] update tutorial by @hiyouga in #167
[tracking] add tensorboard by @hiyouga in #170
[misc] support adamw bf16 by @hiyouga in #171
[misc] fix config by @hiyouga in #172
[misc] fix metrics by @hiyouga in #173
[misc] refactor val gen log by @hiyouga in #174
update Awesome Work using EasyR1 by @appletea233 in #179
[misc] fix masked mean by @hiyouga in #181
[misc] algo improvement by @hiyouga in #184
[misc] minor update by @hiyouga in #188
[fix] arg check by @hiyouga in #189
[bugfix] fix vllm 0.8.3 rollout by @hiyouga in #197
[deps] upgrade to vllm 0.8.3 by @hiyouga in #202
[core] separate score fn & vllm logit bias by @hiyouga in #204
Supports loading format prompt from a file by @Wangbiao2 in #208
[data] update data configs by @hiyouga in #214
fix: enable user to filter overlong examples in RLHFDataset by @0x404 in #210
[data] fix rl dataset by @hiyouga in #215
[misc] lint by @hiyouga in #216
[data] add multi image dataset by @hiyouga in #217
[readme] add multi node script by @hiyouga in #218
[torch] fix saving bf16 optimizer by @hiyouga in #221
[version] release 0.3.0 by @hiyouga in #222

New Contributors

@Zeyi-Lin made their first contribution in #13
@AL-377 made their first contribution in #18
@yueyang130 made their first contribution in #26
@wzq016 made their first contribution in #31
@BUAADreamer made their first contribution in #38
@chenllliang made their first contribution in #86
@LengSicong made their first contribution in #101
@Osilly made their first contribution in #102
@dirtyDan0 made their first contribution in #111
@LiuRicky made their first contribution in #122
@PzySeere made their first contribution in #124
@appletea233 made their first contribution in #179
@Wangbiao2 made their first contribution in #208
@0x404 made their first contribution in #210

Full Changelog: https://github.com/hiyouga/EasyR1/commits/v0.3.0

Contributors

hiyouga, 0x404, and 14 other contributors
