Hi! In our experience, vLLM rollouts take 70% of the GRPO iteration time (when run in BF16).
Has anyone tried more aggressive precision reduction for rollouts, such as FP8, to gain speed? I wonder whether any reasonable methods exist for using FP8 online to speed up the rollout phase.
Thanks!