Conversation

@LiuRicky (Contributor) commented Feb 18, 2025

#109

This also solves the problem with DeepSpeed ZeRO-3 training, as shown in huggingface/transformers@8ee5053.

After applying these changes, one can add `--deepspeed r1-v/local_scripts/zero3.json` to the training script when using DeepSpeed.
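For illustration, here is a hedged sketch of how the flag might be appended to a training launch command. The base command (launcher, script path, and other arguments) is hypothetical and not from this PR; only the `--deepspeed r1-v/local_scripts/zero3.json` flag comes from the change described above.

```shell
# Hypothetical base training command; script path and args are assumptions.
BASE_CMD="torchrun --nproc_per_node=8 src/open_r1/grpo.py --output_dir ./ckpt"

# Append the DeepSpeed ZeRO-3 config flag from this PR.
TRAIN_CMD="$BASE_CMD --deepspeed r1-v/local_scripts/zero3.json"

echo "$TRAIN_CMD"
```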

@LiuRicky LiuRicky changed the title Support qwen2.5-VL in sft.py Support qwen2.5-VL in sft.py and solve GRPO deepspeed training issue Feb 20, 2025
@tzjtatata
Hi, thank you for debugging. Can you specify the commit of transformers you used? For me, the current main is at 92c5ca9dd70de3ade2af2eb835c96215cc50e815. Is it the same as your version?

@tzjtatata

I also found that the newest version of transformers (92c5ca) has bugs when using Qwen2.5-VL.

@LiuRicky (Contributor, Author)

> And I found that the newest version of transformers ("92c5ca") has bugs when using Qwen2.5-VL.

I guess mine was the version from about 5 days ago, maybe 8ee50537fe7613b87881cd043a85971c85e99519 or e3d99ec2f58e0e2a4df6b2b41152fdfb3f92a52f.

@tzjtatata

tzjtatata commented Feb 23, 2025 via email
