Upgrade to TRL 0.17.0 version #2130

alekseyfa · 2025-07-10T14:42:41Z

What does this PR do?

This PR upgrades TRL from version 0.9.6 to 0.17.0, which requires updating the algorithms already implemented in optimum-habana: SFT, DPO, and PPO.

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?

yafshar · 2025-07-10T14:51:00Z

@alekseyfa please check the other PR #2088 as well

imangohari1 · 2025-07-23T22:25:18Z

Hi @alekseyfa,
please rebase this PR onto OH main and test it with the latest diffuser update. Thanks.

yafshar · 2025-07-28T18:31:43Z

@alekseyfa what else is missing from this draft to finish it?

alekseyfa · 2025-07-29T13:40:11Z

> @alekseyfa what else is missing from this draft to finish it?

PPO still needs to be finished; it is not functioning properly at the moment. However, the review of SFT and DPO can start now to speed up the process.

pbielak · 2025-09-08T09:01:32Z

@alekseyfa Could you please provide an update, i.e., do you plan to continue the implementation of this PR / do you have an ETA when it will be finished?

alekseyfa added 7 commits July 10, 2025 12:53

Migration of sft to trl version 0.17

7537460

Updated requirements

182b8bc

Updated comments

8282912

Migration of DPO to trl version 0.17.0

f386c79

Moved DataCollator base to utils

ac9b392

Migration of PPO to TRL version 0.17

0273fd0

Updated README

47e72b2

karol-brejna-i assigned pbielak Jul 11, 2025

alekseyfa added 4 commits July 16, 2025 15:29

Getting arguments according to DPO config

c1d2386

Revert "Getting arguments according to DPO config"

e9b23c2

This reverts commit c1d2386.

Small fixes in DPO trainer

6aaf449

update DPO example to match TRL version

05d8bec
