Conversation

@mapmeld (Contributor) commented May 12, 2022

What does this PR do?

When calling model.generate(content, log_decoder=True), this PR logs which decoder and warper(s) are actually used during generation.

I have a demo where I show text generated with different options (top_k, typical_p, repetition_penalty, num_beams, etc.). The final chosen decoding strategy is not obvious. It is tricky to test by comparing outputs because a generative model often returns different text on multiple runs.
By design the function tolerates mistakes: if an argument is missing its companion (typical_p=0.5 but no do_sample=True), has an unusable value (typical_p=3), or is misspelled (numBeams=2), the function silently chooses another decoding strategy. The code does not flag these because the remaining **kwargs are passed through to the model.
I believe the logger is the best place to check whether decoding actually happened as expected.
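To illustrate the failure mode with a minimal toy sketch (this is not the actual transformers code, just a stand-in showing why **kwargs swallow mistakes): a misspelled keyword never raises a TypeError, it simply lands in the catch-all dict, and the strategy selection quietly ignores the argument the user thought they set.

```python
def generate(input_text, num_beams=1, do_sample=False, typical_p=None, **model_kwargs):
    """Toy stand-in for model.generate(): unknown kwargs are swallowed silently."""
    if typical_p is not None and not do_sample:
        strategy = "greedy_search"  # typical_p is silently ignored without do_sample=True
    elif num_beams > 1:
        strategy = "beam_search"
    elif do_sample:
        strategy = "sample"
    else:
        strategy = "greedy_search"
    return strategy, model_kwargs  # any typo'd kwarg ends up here, unflagged

# A typo'd argument (numBeams instead of num_beams) raises no error:
strategy, leftovers = generate("hello", numBeams=4)
print(strategy)   # greedy_search, not beam search
print(leftovers)  # {'numBeams': 4}
```

The same silent fallback happens for typical_p=0.5 without do_sample=True: the call succeeds, but the output was produced by plain greedy search.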

Example usage: https://colab.research.google.com/drive/1DpMnZkSCtZIiaONoxfzYxYI4vgiTNYLN?usp=sharing

  • The first commit is unnecessary thanks to docs for typical decoding #17186; I rebased on that PR and added one additional section to the documentation about typical decoding
  • I'm open to renaming or removing log_decoder to always do logger.info in these places
  • If we always do logger.info, I could move logger calls into BeamSearchScorer. Trying to avoid adding too many args
  • Could use logger.warn if these issues warrant it

Discussion: https://discuss.huggingface.co/t/logging-which-decoder-selected-in-generation/18133

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

@HuggingFaceDocBuilderDev commented May 12, 2022

The documentation is not available anymore as the PR was closed or merged.

@patrickvonplaten (Contributor)

Hey @mapmeld,

Thanks for the PR.

To me, that is a bit too much of an edge case, and I'm not very happy with cluttering the generation code with if/else statements.

@gante @patil-suraj what do you think?

@gante (Contributor) commented Jun 3, 2022

Hey @mapmeld! Thank you for the PR 👍

I'm also not a fan of all the if statements on a function whose complexity is already over the top. Perhaps we could remove all the if branches, keep the logging statements, but lower their logging level to debug. That way, a user could get all those values by setting the appropriate logging level, and it would be invisible in the vast majority of cases.

WDYT?
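The suggestion can be sketched with the standard logging module (the logger name below is illustrative, not necessarily the exact one transformers uses): a debug-level message is invisible under the default WARNING level and appears only when the user opts in.

```python
import logging

# Illustrative logger name; transformers namespaces its loggers per module.
logger = logging.getLogger("transformers.generation_utils")

def pick_strategy(num_beams=1, do_sample=False):
    """Toy stand-in for the strategy-selection step inside generate()."""
    if num_beams > 1:
        strategy = "beam_search"
    elif do_sample:
        strategy = "sample"
    else:
        strategy = "greedy_search"
    # Hidden under the default WARNING level; visible only after opting in to DEBUG.
    logger.debug("Decoding strategy selected: %s", strategy)
    return strategy

pick_strategy(num_beams=4)               # default level: no log output
logging.basicConfig(level=logging.DEBUG)
pick_strategy(num_beams=4)               # now the debug line is emitted
```

In transformers itself, the same opt-in would go through the library's central verbosity controls (e.g. transformers.logging.set_verbosity_debug()) rather than per-logger configuration.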

@mapmeld (Contributor, Author) commented Jun 3, 2022

@patrickvonplaten @gante That makes sense to me: logger.debug level, no extra argument. I've made a commit for that.

@patil-suraj (Contributor)

Agree with @gante's comment: using logger.debug and getting rid of those if-else statements sounds good to me.
I'm okay with having this logging to make it more obvious which method is being used; it will be useful for debugging IMO.

@gante (Contributor) left a comment

LGTM 👍

(to get rid of CI errors, rebase with main -- some issues were fixed since you opened the PR)

@patrickvonplaten (Contributor)

Sorry, I think I wasn't super clear in my last message.

Personally, I would prefer to not merge this PR because:

  • the generation code is already very complex and hard to read (talking about the code-reading part here, not what's displayed to the user); I don't think adding 5-6 new logger statements helps here
  • How would users know that generate should be run in debug mode to display the logging statements? I don't think many users will realize this
  • If the decoding strategy is not obvious, we should improve the docs IMO
  • If the user doesn't know what top_p does, I don't think they would know what a TopPLogitsWarper is, so I don't see the added value of displaying the names in a logger here
  • It's also not in line with how we use the logger in other places across the library

@mapmeld (Contributor, Author) commented Jun 10, 2022

OK, will close then.

If I can suggest changes beyond logging to this section, here are some ideas:

  • throwing exceptions in the current code if a decoding argument (typical_p) is ignored because of an unusable value or a missing companion argument (do_sample=True)
  • adding an argument to generate() naming the intended decoder, so the intent is clear in end-user code and transformers can throw an exception for calls which don't go down the expected path for whatever reason
  • specific decoding functions to replace the general generate(); these functions could throw exceptions, use Python type hints, and be more useful in code auto-complete tools
  • implementing typical decoding in TensorFlow so there's more parity between the Torch and TensorFlow code
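The first suggestion above could be sketched as a small pre-flight validation helper (entirely hypothetical, not part of transformers; the argument names mirror generate()'s): raise instead of silently ignoring an argument.

```python
def validate_generation_args(num_beams=1, do_sample=False, typical_p=None, top_k=None):
    """Hypothetical pre-flight check: raise if an argument would be silently ignored."""
    if typical_p is not None:
        if not do_sample:
            raise ValueError("typical_p was set but do_sample=False; it would be ignored.")
        if not 0.0 < typical_p <= 1.0:
            raise ValueError(f"typical_p must be in (0, 1], got {typical_p}.")
    if top_k is not None and not do_sample:
        raise ValueError("top_k was set but do_sample=False; it would be ignored.")

validate_generation_args(do_sample=True, typical_p=0.5)  # OK, valid combination
# validate_generation_args(typical_p=0.5)                # raises ValueError
# validate_generation_args(do_sample=True, typical_p=3)  # raises ValueError
```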

@mapmeld mapmeld closed this Jun 10, 2022
@patrickvonplaten (Contributor)

Thanks a lot @mapmeld - those are really nice suggestions! Also after some discussion we think it could make a lot of sense to do maybe the following:

  • If kwargs are passed to generate that don't exist, then we throw a warning so a user is well aware if something is misspelled.
  • Really like the idea of warning the user if an argument is used that cannot be activated. Wondering if there is a good approach that would not force us to add a lot of if ... statements in generate. Any ideas how this could be checked in a very concise way?

@patrickvonplaten (Contributor)

Also keen to hear suggestions from @gante :-)

@gante (Contributor) commented Jun 14, 2022

implementing typical decoding in TensorFlow so there's more similarity between Torch and TensorFlow code

(@mapmeld) Yeah, we are working on it :D TF generate should have a big release soon.

Really like the idea of warning the user if an argument is used that cannot be activated - wondering if there is a good approach that would not force us to make a lot of if .... statements in generate. Any ideas how this could be checked in a very concise way?

(@patrickvonplaten) Without ifs and elses, the cleanest solution would possibly be to hold a dictionary with all passed arguments, plus a set of accepted arguments for each generation type, and raise an exception listing all unexpected arguments (e.g. The passed arguments triggered greedy_search. However, for greedy_search, the following arguments are not accepted: top_p. Please check the documentation here [link]). We can actually implement it with a small effort -- the dictionary with all arguments is locals() at the start of the specific generation functions (e.g. greedy_search()), and the set of accepted arguments is the function signature except **model_kwargs. We can get the accepted model_kwargs from the model forward signature (it's not quite the same, but should be close enough) -- everything else that remains in **model_kwargs is an unused parameter and should raise an exception.

WDYT?
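A minimal sketch of this idea using inspect (all function names and signatures here are illustrative stand-ins; the real generation sub-methods take many more parameters): treat the model forward signature as the set of accepted model_kwargs and raise with every unexpected argument at once.

```python
import inspect

def model_forward(input_ids, attention_mask=None, position_ids=None):
    """Stand-in for the model's forward(); its signature defines accepted model_kwargs."""
    return input_ids

def _validate_kwargs(forward_fn, model_kwargs, caller):
    """Raise one exception naming every leftover kwarg the forward pass can't accept."""
    accepted = set(inspect.signature(forward_fn).parameters)
    unexpected = sorted(set(model_kwargs) - accepted)
    if unexpected:
        raise ValueError(
            f"The passed arguments triggered {caller}. However, for {caller}, "
            f"the following arguments are not accepted: {', '.join(unexpected)}."
        )

def greedy_search(input_ids, max_length=20, pad_token_id=None, **model_kwargs):
    """Stand-in for a generation sub-method; validates leftover model_kwargs up front."""
    _validate_kwargs(model_forward, model_kwargs, caller="greedy_search")
    return input_ids  # the decoding loop itself is elided

greedy_search([1, 2, 3], attention_mask=[1, 1, 1])  # OK: accepted by forward()
# greedy_search([1, 2, 3], top_p=0.9)               # raises ValueError naming top_p
```

The appeal of this shape is that nothing is hardcoded per strategy: the accepted sets fall out of the existing signatures, so they stay in sync as the functions evolve.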

@patrickvonplaten (Contributor)

In a first step I was rather thinking about just warning the user if parameters are passed in kwargs that are not used (probably misspelled).

@patrickvonplaten (Contributor)

Adding sub-generation-specific logging logic sounds very complex. I would be open to it if we find a clean, concise solution, but at the moment I'd like to avoid adding hardcoded lists of which generation parameter is relevant for which sub-generation method (also hard to maintain).

@patrickvonplaten (Contributor)

@gante the solution sounds interesting; I would need to see a PR for it to fully understand it. The problem I see is that we won't detect unnecessary generation parameters, since they are inside logits_processor and logits_warper.

@patrickvonplaten (Contributor)

Overall, also just want to say here that IMO two mistakes were made a while back:

  • We've set defaults for some values, which we should never have done IMO: max_length and top_k have defaults, which is quite counterproductive for good logging
  • We have allowed people to set generation parameters inside the config, which the method then defaults to. In hindsight this was too much "black magic" and not at all visible/understandable for (new) users.

It will be very hard to remedy these things without breaking backward compatibility, but I'm open to suggestions / comments!

@mapmeld (Contributor, Author) commented Jun 17, 2022

Would it be possible for us to talk about it in the HF Slack? I would be interested in finding a part of this where I can contribute.

@patrickvonplaten (Contributor)

Invited you :-) Let's chat on Slack

5 participants