Conversation

@AkshitaB
Contributor

@AkshitaB AkshitaB commented Apr 4, 2022

Changes proposed in this pull request:

  • Adds a transformers::finetune step, which mostly mimics TorchTrainStep, with additions for tokenizing the data (and updating the model embeddings). It also provides model-specific defaults for the data collator.
  • RunGeneration now accepts the trained model object as input.
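
To illustrate what "model-specific defaults for the data collator" could mean, here is a minimal sketch, not the code in this PR. The family labels, `DEFAULT_COLLATORS`, and `pick_collator` are hypothetical names, and the pairing of family to collator is an assumption. The collator names refer to real Hugging Face `transformers` classes but are kept as strings so the sketch runs without the library installed.

```python
from typing import Dict, Optional

# Assumed defaults: one collator per (hypothetical) model family label.
# The values are names of real `transformers` collator classes, kept as strings.
DEFAULT_COLLATORS: Dict[str, str] = {
    "seq2seq": "DataCollatorForSeq2Seq",
    "causal": "DataCollatorForLanguageModeling",
}


def pick_collator(model_family: str, override: Optional[str] = None) -> str:
    """Return the user-supplied collator if given, else the family default."""
    if override is not None:
        return override
    try:
        return DEFAULT_COLLATORS[model_family]
    except KeyError:
        # No silent fallback: an unknown family should be reported to the user.
        raise ValueError(
            f"No default collator for model family {model_family!r}"
        ) from None
```

An explicit override always wins, so users who pass their own collator are unaffected by the defaults.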

Before submitting

  • I've read and followed all steps in the Making a pull request
    section of the CONTRIBUTING docs.
  • I've updated or added any relevant docstrings following the syntax described in the
    Writing docstrings section of the CONTRIBUTING docs.
  • If this PR fixes a bug, I've added a test that will fail without my fix.
  • If this PR adds a new feature, I've added tests that sufficiently cover my new functionality.

After submitting

  • All GitHub Actions jobs for my pull request have passed.

@AkshitaB AkshitaB marked this pull request as draft April 4, 2022 06:14
@AkshitaB AkshitaB requested review from dirkgr and epwalsh April 4, 2022 21:58
@AkshitaB AkshitaB marked this pull request as ready for review April 12, 2022 18:17


@Step.register("subset-data")
class SubsetData(Step):
Member

We have the DatasetRemix step for Tango's DatasetDict. Can we have the same for HF's datasets?

Contributor Author

Will work on it separately in #268. This is technically unrelated to finetuning.
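
For context, the kind of split remixing requested above can be sketched on plain Python lists. This is only an illustration: `remix` is a hypothetical helper, and the `"train[:2]+validation"` expression syntax is an assumption loosely modelled on slice-and-concatenate remixing, not the API of Tango or of HF's datasets.

```python
import re
from typing import Dict, List

# Matches "name" or "name[start:stop]" where start/stop are optional integers.
_PART = re.compile(r"^(\w+)(?:\[(-?\d*):(-?\d*)\])?$")


def remix(splits: Dict[str, List], new_splits: Dict[str, str]) -> Dict[str, List]:
    """Build new splits by slicing and concatenating existing ones."""
    result: Dict[str, List] = {}
    for new_name, expression in new_splits.items():
        rows: List = []
        for part in expression.split("+"):
            match = _PART.match(part.strip())
            if match is None:
                raise ValueError(f"Cannot parse split expression {part!r}")
            source, start, stop = match.groups()
            # Missing or empty bounds mean "from the start" / "to the end".
            bounds = slice(
                int(start) if start else None,
                int(stop) if stop else None,
            )
            rows.extend(splits[source][bounds])
        result[new_name] = rows
    return result


data = {"train": [1, 2, 3, 4], "validation": [5, 6]}
small = remix(data, {"small": "train[:2]+validation[:1]"})
```

The same slicing logic would carry over to any dataset type that supports Python-style slicing and concatenation.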


def run(  # type: ignore[override]
    self,
    model: Lazy[Model],
Member

If I want to do some sort of curriculum learning, can I pass in the output of another training step here?

Contributor Author

This can be done once we fix this: #269

Contributor Author

And this: #270

@dirkgr dirkgr enabled auto-merge (squash) April 19, 2022 22:16
@dirkgr dirkgr merged commit 1083049 into main Apr 19, 2022
@dirkgr dirkgr deleted the finetuning branch April 19, 2022 22:23
3 participants