The most important thing for verl Q3 is to make it a modular foundational library for the community to extend, as a starting point but not the destination.
composable model engines
Finish up #1560 such that the parallelism strategy is implemented at the engine level, without exposing details to the worker (role) level. The fsdp/megatron engines are expected to be created and run in a standalone fashion, and to be reused across different roles.
fsdp actor, critic, ref (focus on fsdp2)
megatron actor, critic, ref
torchtitan integration (call for contribution)
switch all recipe/examples from fsdp1 to fsdp2 by default (and remove ill-maintained ones)
performance tuning, and reference throughput benchmark across [model type, model size, seqlen, hardware, num accelerators, worker role] to achieve better disaggregated resource allocation
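To make the engine/role split above concrete, here is a minimal sketch of how one engine interface could be reused by several roles. Every name in it (`BaseEngine`, `ToyEngine`, `ActorRole`, `RefRole`) is hypothetical and is not verl's actual API or the interface proposed in #1560; the toy engine only scales numbers, standing in for a real fsdp or megatron backend.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


class BaseEngine(ABC):
    """Hypothetical engine interface (an assumption, not verl's API).

    A real engine would own the parallelism strategy (fsdp, megatron, ...)
    so that roles never see sharding details.
    """

    @abstractmethod
    def forward(self, batch: list[float]) -> list[float]:
        ...

    @abstractmethod
    def update(self, loss: float) -> None:
        ...


class ToyEngine(BaseEngine):
    """Stand-in backend: multiplies inputs by a single scalar 'weight'."""

    def __init__(self, scale: float = 1.0) -> None:
        self.scale = scale
        self.steps = 0

    def forward(self, batch: list[float]) -> list[float]:
        return [self.scale * x for x in batch]

    def update(self, loss: float) -> None:
        # Toy update rule, only here to show that state lives in the engine.
        self.scale -= 0.1 * loss
        self.steps += 1


@dataclass
class ActorRole:
    """Role-level logic; it only talks to the engine interface."""

    engine: BaseEngine

    def compute_log_probs(self, batch: list[float]) -> list[float]:
        return self.engine.forward(batch)

    def train_step(self, batch: list[float]) -> float:
        out = self.engine.forward(batch)
        loss = sum(out) / len(out)
        self.engine.update(loss)
        return loss


@dataclass
class RefRole:
    """A second role reusing the same engine class without modification."""

    engine: BaseEngine

    def compute_ref_log_probs(self, batch: list[float]) -> list[float]:
        return self.engine.forward(batch)
```

In this sketch, swapping `ToyEngine` for a different backend would not require touching either role class, which is the property the item above asks for.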
fully-async pipeline
multi-turn, data, config infra
better message infra for multi-turn messages, dense reward @SwordFaith
better abstraction and registration system for multi-modal models. Currently, different multi-modal models have inconsistent config attributes (e.g. rope), freeze/unfreeze setups, and input/output processing. Ideally this would be handled at the huggingface transformers level, but that is not sufficient right now (cc @NielsRogge). (RFC needed)
verl needs a documentation page about the latest status of model support and per model related features (lora, sequence parallelism, megatron, etc)
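As one possible shape for the registration system mentioned above, the sketch below keeps per-model quirks in a single registry so that training code looks them up instead of special-casing each model. All names (`MultimodalSpec`, `register_model`, `get_spec`) and the registered entries are hypothetical placeholders, not existing verl or transformers APIs, and the actual design is left to the RFC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MultimodalSpec:
    """Hypothetical per-model description of the quirks listed above."""

    rope_attr: str                         # name of the rope-related config attribute
    frozen_modules: tuple[str, ...] = ()   # module name prefixes to keep frozen


_REGISTRY: dict[str, MultimodalSpec] = {}


def register_model(model_type: str, spec: MultimodalSpec) -> None:
    """Register a spec once; duplicate registration is treated as an error."""
    if model_type in _REGISTRY:
        raise ValueError(f"model type already registered: {model_type}")
    _REGISTRY[model_type] = spec


def get_spec(model_type: str) -> MultimodalSpec:
    """Look up a spec, failing loudly for unsupported models."""
    if model_type not in _REGISTRY:
        raise KeyError(f"no multimodal spec registered for: {model_type}")
    return _REGISTRY[model_type]


# Placeholder entries; the model types and attribute names are made up.
register_model("toy_vlm_a", MultimodalSpec(rope_attr="rope_theta",
                                           frozen_modules=("vision_tower",)))
register_model("toy_vlm_b", MultimodalSpec(rope_attr="rope_scaling"))
```

A registry like this could also back the model support documentation page, since the supported models and their per-model settings would be enumerable from one place.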
high quality recipes and end2end optimizations
retool recipe (code is ready, going through reviews)
SOTA multimodal vlm RL recipe (call for contribution)
enhance DAPO recipe with larger models, and provide scripts with high training throughput (many perf knobs are not turned on in the current script)
Past roadmap discussions for reference: #710 #22
The most important thing for verl Q3 is to make it a modular foundational library for the community to extend, as a starting point but not the destination.
composable model engines
Finish up #1560 such that the parallelism strategy is implemented at the engine level, without exposing details to the worker (role) level. The fsdp/megatron engines are expected to be created and run in a standalone fashion, and to be reused across different roles.
Work in progress interface for comments #1977
rollout workers
Additional ongoing efforts:
async & disaggregated architecture
multi-turn, data, config infra
streamline new model workflow
high quality recipes and end2end optimizations
Additional existing ongoing features: