@schoi-habana schoi-habana commented Jun 26, 2025

This enables the GRPO trainer.
The baseline is #1898.
This reproduces the example from HF https://huggingface.co/learn/cookbook/en/fine_tuning_llm_grpo_trl on Gaudi.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

regisss and others added 30 commits September 2, 2024 16:51
@schoi-habana schoi-habana force-pushed the schoi/grpo_from_pr1898_gradient_ckpt branch from 882b81b to a65a9d6 Compare June 27, 2025 22:58
@schoi-habana schoi-habana marked this pull request as ready for review June 27, 2025 23:05
@schoi-habana schoi-habana requested a review from regisss as a code owner June 27, 2025 23:05

@regisss regisss left a comment

I left a few comments. Can you also add some tests for this new trainer in https://github.com/huggingface/optimum-habana/blob/main/tests/test_trl.py please?

@schoi-habana schoi-habana force-pushed the schoi/grpo_from_pr1898_gradient_ckpt branch from 4c1597f to df5f215 Compare July 8, 2025 22:53

github-actions bot commented Jul 8, 2025

The code quality check failed, please run make style.

@schoi-habana schoi-habana force-pushed the schoi/grpo_from_pr1898_gradient_ckpt branch from df5f215 to 6b347df Compare July 8, 2025 22:56

github-actions bot commented Jul 8, 2025

The code quality check failed, please run make style.

@schoi-habana schoi-habana force-pushed the schoi/grpo_from_pr1898_gradient_ckpt branch from 6b347df to 86dcc6a Compare July 8, 2025 22:58
@schoi-habana

> I left a few comments. Can you also add some tests for this new trainer in https://github.com/huggingface/optimum-habana/blob/main/tests/test_trl.py please?

@regisss thanks for the review! I will add some tests soon.
The tests need to run with PT_HPU_LAZY_MODE=1. Where do you usually set it for tests?

regisss commented Jul 9, 2025

Right now it is added at the level of the makefile:

export PT_HPU_LAZY_MODE=1

Since CI tests are executed through Makefile commands, that's all we need. However, if you want to run the tests with pytest directly, you'll have to pass the environment variable explicitly.
Most tests require lazy mode, so I think you can just add yours and that should be fine.
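
For running the tests outside the Makefile, here is a minimal sketch of one way to pass the variable explicitly to a pytest child process. The helper names (`lazy_mode_env`, `pytest_command`, `run_tests`) are hypothetical and not part of optimum-habana; the test path is the file named earlier in this review.

```python
import os
import subprocess
import sys


def lazy_mode_env(base=None):
    """Return a copy of the environment with PT_HPU_LAZY_MODE=1 set.

    `base` defaults to the current process environment; the original
    mapping is never modified.
    """
    env = dict(os.environ if base is None else base)
    env["PT_HPU_LAZY_MODE"] = "1"
    return env


def pytest_command(test_path="tests/test_trl.py"):
    """Build the pytest invocation for the given test file."""
    return [sys.executable, "-m", "pytest", test_path, "-v"]


def run_tests(test_path="tests/test_trl.py"):
    """Run pytest in a child process that sees the lazy-mode variable.

    Hypothetical wrapper: it mirrors what `export PT_HPU_LAZY_MODE=1`
    does in the Makefile, but only for this one child process.
    """
    completed = subprocess.run(pytest_command(test_path), env=lazy_mode_env())
    return completed.returncode
```

Calling `run_tests()` from the repository root should be equivalent to prefixing the command in a shell, as in `PT_HPU_LAZY_MODE=1 python -m pytest tests/test_trl.py -v`.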

@schoi-habana schoi-habana force-pushed the schoi/grpo_from_pr1898_gradient_ckpt branch from 961d0ac to 3c0a6a6 Compare July 9, 2025 20:40

github-actions bot commented Jul 9, 2025

The code quality check failed, please run make style.

@schoi-habana schoi-habana force-pushed the schoi/grpo_from_pr1898_gradient_ckpt branch from d6c11d6 to 7962be0 Compare July 9, 2025 21:43
@yafshar yafshar mentioned this pull request Jul 10, 2025
3 tasks

@regisss regisss left a comment

LGTM

@regisss regisss merged commit 84bc77c into main Jul 28, 2025
6 of 9 checks passed
@regisss regisss deleted the schoi/grpo_from_pr1898_gradient_ckpt branch July 28, 2025 10:44
@yafshar yafshar mentioned this pull request Jul 28, 2025
3 tasks
astachowiczhabana pushed a commit that referenced this pull request Jul 31, 2025
Signed-off-by: Wang, Yi A <[email protected]>
Signed-off-by: Daniel Socek <[email protected]>
Signed-off-by: U. Artie Eoff <[email protected]>
Signed-off-by: Vivek Kumar <[email protected]>
Signed-off-by: Urszula <[email protected]>
Co-authored-by: regisss <[email protected]>
Co-authored-by: Jimin Ha <[email protected]>
Co-authored-by: Shiv Kaul <[email protected]>
Co-authored-by: Yeonsil Yoon <[email protected]>
Co-authored-by: Harish Subramony <[email protected]>
Co-authored-by: Vidya Galli <[email protected]>
Co-authored-by: Iman Gohari <[email protected]>
Co-authored-by: Bhargav <[email protected]>
Co-authored-by: Chetan Kumar Verma <[email protected]>
Co-authored-by: Sheng Yang <[email protected]>
Co-authored-by: Yaser Afshar <[email protected]>
Co-authored-by: Wang, Yi <[email protected]>
Co-authored-by: Akihiro Takahashi <[email protected]>
Co-authored-by: Edward Mascarenhas <[email protected]>
Co-authored-by: Libin Tang <[email protected]>
Co-authored-by: Sayantan Sarkar <[email protected]>
Co-authored-by: Daniel Socek <[email protected]>
Co-authored-by: Silvia Colabrese <[email protected]>
Co-authored-by: Mieszko Dziadowiec <[email protected]>
Co-authored-by: Dmitry <[email protected]>
Co-authored-by: Urszula Golowicz <[email protected]>
Co-authored-by: Nikolay Protasov <[email protected]>
Co-authored-by: U. Artie Eoff <[email protected]>
Co-authored-by: Luca Calabria <[email protected]>
Co-authored-by: Harshvardhan Chauhan <[email protected]>
Co-authored-by: Shifani Rajabose <[email protected]>
Co-authored-by: Dmitry <[email protected]>
Co-authored-by: Mounika Mandava <[email protected]>
Co-authored-by: ZhengHongming888 <[email protected]>
Co-authored-by: Alexey Fadeev <[email protected]>
Co-authored-by: Vivek Kumar <[email protected]>
Co-authored-by: Vivek Kumar <[email protected]>
Co-authored-by: Ilyas Moutawwakil <[email protected]>
Co-authored-by: Rafal Bogdanowicz <[email protected]>
Co-authored-by: Rafal <[email protected]>
Co-authored-by: Jan Kamiński <[email protected]>
Co-authored-by: Karol Brejna <[email protected]>
Co-authored-by: Piotr Bielak <[email protected]>
Co-authored-by: Piotr Bielak <[email protected]>
Co-authored-by: karol-brejna-i <[email protected]>
Co-authored-by: IlyasMoutawwakil <[email protected]>
astachowiczhabana pushed a commit that referenced this pull request Sep 10, 2025
gplutop7 pushed a commit to HabanaAI/optimum-habana-fork that referenced this pull request Oct 15, 2025
Co-authored-by: Sun Choi <[email protected]>