early exit #34244
Conversation
ArthurZucker left a comment
Only missing docs and test, super super nice otherwise!
        inputs_tensor: Optional[torch.Tensor] = None,
        logits_processor: "LogitsProcessorList" = None,
    ):
        # TODO(joao): somehow check whether the model supports early exit
Should we add a `_supports_early_exit` flag? Or check `hasattr(model, "active_layers")`?
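A minimal sketch of what either check floated above could look like; `_supports_early_exit` and `active_layers` are names proposed in this thread, not existing `transformers` attributes:

```python
def assistant_supports_early_exit(model) -> bool:
    """Illustrative only: both attributes below are names floated in the
    review comment above, not existing transformers attributes."""
    # Option A: an explicit class-level opt-in flag on the model.
    if getattr(model, "_supports_early_exit", False):
        return True
    # Option B: infer support from a structural attribute on the model.
    return hasattr(model, "active_layers")
```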
IMO it depends on how the model is structured/trained 👀
- If the model is expected to support early exit at ANY layer, because the lm head is compatible with all layers -> there is no way to detect that unless we manually add an argument to the config, which is... brittle. I would probably suggest not doing any check for now?
- If the model is expected to support early exit only at specific layers, store those layers in the config and check that attribute here (rough sketch below).
WDYT?
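A rough sketch of the second option, assuming a hypothetical `early_exit_layers` attribute on the model config; the attribute name is illustrative and not part of this PR:

```python
def check_early_exit_layer(model, exit_layer: int) -> None:
    # Hypothetical config attribute: the layers whose hidden states the
    # lm head is known to be compatible with.
    supported = getattr(model.config, "early_exit_layers", None)
    if supported is None:
        # No metadata in the config -> nothing reliable to check against.
        return
    if exit_layer not in supported:
        raise ValueError(
            f"Early exit at layer {exit_layer} is not supported by this model "
            f"(supported layers: {supported})."
        )
```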
Commits:
* 😅
* early exit (#34244)
* mvp
* docs and tests
* a few fixes
* no shared cache
* Apply suggestions from code review (Co-authored-by: Mostafa Elhoushi <[email protected]>)
* docs
* make fix-copies
* cohere fix
* [test all]
* [test all] consistent model code copies
* [test all] make fix-copies :D
* Apply suggestions from code review (Co-authored-by: Pedro Cuenca <[email protected]>, Mostafa Elhoushi <[email protected]>)
* Update src/transformers/generation/candidate_generator.py
* Update src/transformers/generation/configuration_utils.py (Co-authored-by: Pedro Cuenca <[email protected]>)
* [test all] don't use a stand-alone attribute; fix test

Co-authored-by: Joao Gante <[email protected]>, Joao Gante <[email protected]>, Mostafa Elhoushi <[email protected]>, Pedro Cuenca <[email protected]>
What does this PR do?