Skip to content

Use new parametrization based weight norm if available#24030

Merged
sgugger merged 3 commits intohuggingface:mainfrom
ezyang:weight-norm
Jun 6, 2023
Merged

Use new parametrization based weight norm if available#24030
sgugger merged 3 commits intohuggingface:mainfrom
ezyang:weight-norm

Conversation

@ezyang
Copy link
Copy Markdown
Contributor

@ezyang ezyang commented Jun 5, 2023

What does this PR do?

In pytorch/pytorch#103001 I introduce a new
parametrization based version of weight_norm. One big benefit of
the new API is that the resulting model is deepcopy'able; today, you can't
deepcopy Wav2Vec2 models.

Since the new API isn't even in PyTorch main yet, I'd like to feature
gate it here, so that it gets used whenever PyTorch is recent enough to
support it. It would be a big help for me if you could take this change
earlier rather than later; otherwise I will have to patch transformers
in our own CI to get our benchmark harness working on Wav2Vec2.

Signed-off-by: Edward Z. Yang ezyang@meta.com

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

cc @sanchit-gandhi

See pytorch/pytorch#103001

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Copy link
Copy Markdown
Collaborator

@sgugger sgugger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM as it's properly gated. @sanchit-gandhi can you have a second look?
You'll just need to run make style and make fix-copies on your branch to make the CI happy.

@HuggingFaceDocBuilderDev
Copy link
Copy Markdown

HuggingFaceDocBuilderDev commented Jun 5, 2023

The documentation is not available anymore as the PR was closed or merged.

ezyang added 2 commits June 5, 2023 13:49
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
@ezyang
Copy link
Copy Markdown
Contributor Author

ezyang commented Jun 6, 2023

PyTorch side PR has landed!

Copy link
Copy Markdown
Contributor

@sanchit-gandhi sanchit-gandhi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM too - thanks for the PR @ezyang! Here's the PR in the PyTorch repo for reference: pytorch/pytorch#103001

@sgugger sgugger merged commit bc9ecef into huggingface:main Jun 6, 2023
novice03 pushed a commit to novice03/transformers that referenced this pull request Jun 23, 2023
…4030)

* Use new parametrization based weight norm if available

See pytorch/pytorch#103001

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

* handle copies

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

* black

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

---------

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
@ydshieh
Copy link
Copy Markdown
Collaborator

ydshieh commented Jun 28, 2023

@ezyang

With nightly pytorch, we get

AttributeError: encoder.pos_conv_embed.conv.weight_v not found in PyTorch model

when trying to load a pytorch model into a TF model.

The TF model is looking for encoder.pos_conv_embed.conv.weight_v but we now have encoder.pos_conv_embed.conv.parametrizations.weight.original0. (This is from our (TF)Wav2Vec2Model model class).

Question: In your PR pytorch/pytorch#103001, is this part

def _weight_norm_compat_hook()

that deals with the backward compatibility? If so, we will copy it :-)

ydshieh added a commit that referenced this pull request Jun 28, 2023
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
@ezyang
Copy link
Copy Markdown
Contributor Author

ezyang commented Jun 28, 2023

Yep. The change here is not FC so the ingester needs updating.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants