Conversation

@jdchang1

What does this PR do?

Transformers recently added a `mean_resizing` argument to `resize_token_embeddings`. Its new default breaks mixed initialization in downstream training tasks that require adding tokens to Composer HuggingFace models. For now, this PR sets the value to `False` rather than relying on the default of `True`.

@jdchang1 jdchang1 requested a review from a team as a code owner November 20, 2024 19:50
@jdchang1 jdchang1 requested a review from mvpatel2000 November 20, 2024 19:50

```diff
     f' Resizing the model embeddings to {len(self.tokenizer)} from {self.config.vocab_size}.',
 )
-self.model.resize_token_embeddings(len(self.tokenizer))
+self.model.resize_token_embeddings(len(self.tokenizer), mean_resizing=False)
```
Contributor

You'll need to gate on the transformers version, or inspect the function's args before passing this in, I think.
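
A minimal sketch of that suggestion, inspecting the function's signature rather than pinning a transformers version (the `resize_embeddings` helper and its arguments are hypothetical, not from this PR):

```python
import inspect

def resize_embeddings(model, tokenizer):
    """Resize token embeddings, disabling mean resizing when supported."""
    kwargs = {}
    # Only pass mean_resizing when the installed transformers release
    # accepts the argument; older releases would raise a TypeError.
    params = inspect.signature(model.resize_token_embeddings).parameters
    if 'mean_resizing' in params:
        kwargs['mean_resizing'] = False
    model.resize_token_embeddings(len(tokenizer), **kwargs)
```

Signature inspection sidesteps having to know the exact transformers release that introduced the argument.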

Contributor

@dakinggg dakinggg left a comment

For posterity, and in case we want to fix this properly: could you explain why mean_resizing doesn't work with meta initialization but the old behavior does?
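
An illustrative guess at the answer, not confirmed in this thread: mean resizing has to read and reduce the existing embedding values, and tensors initialized on the meta device have no storage, so any data-dependent step in the resize fails.

```python
import torch

# Meta tensors carry shape and dtype but no storage: shape-propagating
# ops succeed, while anything that must read concrete values fails.
weight = torch.empty(10, 4, device='meta')  # stand-in for an old embedding matrix
mean = weight.mean(dim=0)  # fine: the result is itself a meta tensor, no data read
print(mean.device)  # meta
try:
    bool((weight == 0).all())  # needs a concrete value, so it raises
except Exception as err:
    print(f'Data-dependent step fails on meta tensors: {err}')
```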

Contributor

@mvpatel2000 mvpatel2000 left a comment

LGTM besides Daniel's comment, which will fix the tests.
