[Proposal] Adding new encoder_no_repeat_ngram_size to generate. #9984
Conversation
Blenderbot results seemed off compared to the original ParlAI script: `https://parl.ai/projects/recipes/`. Notably, the model seems to repeat a lot of what was said during the conversation. The actual problem was that ParlAI's `no_repeat_ngram_size` applies to the `encoder_input_ids`, while HF's `no_repeat_ngram_size` applies to the previously generated ids (within the decoder). Blenderbot's conversation history lives in the `encoder` part, which explains why HF's implementation had the repetitions. This fix focused on Blenderbot (*not* Blenderbot-small) and added tests for both, because they are quite different in configuration.

This change includes:
- Adding a new `EncoderNoRepeatLogitProcessor`.
- Adding 1 new argument to `generate` (`encoder_no_repeat_ngram_size`).
- Adding 1 new config parameter, `encoder_no_repeat_ngram_size`.
- Adding 2 tests: one for the pipeline (high level, with inputs that exhibited the repeat behavior) and one low level for `EncoderNoRepeatLogitProcessor`.
- Factoring `NoRepeatLogitProcessor` so that its logic could be reused.

Further work:
- The Blenderbot conversational pipeline still does not behave correctly, as the way input is prepared within the pipeline is still incorrect (follow-up PR).
- Blenderbot allows the bot to have personas, done by prepending "your persona: XXXX" to the input; this could be explored in a follow-up PR.

@patrickvonplaten @LysandreJik
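The core idea behind encoder-side n-gram blocking can be sketched as follows. This is a minimal single-sequence sketch, not the actual transformers implementation; the function and variable names are illustrative, and `ngram_size >= 2` is assumed:

```python
def ban_encoder_ngrams(encoder_ids, generated_ids, logits, ngram_size):
    """Forbid generating any token that would complete an n-gram already
    present in the encoder input. Minimal illustrative sketch (assumes
    ngram_size >= 2); NOT the actual transformers implementation."""
    # Index every encoder n-gram by its (n-1)-token prefix.
    banned = {}
    for i in range(len(encoder_ids) - ngram_size + 1):
        prefix = tuple(encoder_ids[i : i + ngram_size - 1])
        banned.setdefault(prefix, set()).add(encoder_ids[i + ngram_size - 1])
    # If the last (n-1) generated tokens match such a prefix, ban the
    # tokens that would complete a repeated n-gram.
    current = tuple(generated_ids[-(ngram_size - 1):])
    for token in banned.get(current, ()):
        logits[token] = float("-inf")
    return logits

# "2" was just generated and the bigram "2 3" appears in the encoder
# input, so token 3 gets banned:
out = ban_encoder_ngrams([1, 2, 3, 4], [9, 2], [0.0] * 6, ngram_size=2)
# out[3] is now -inf; all other entries are untouched
```

The contrast with the decoder-side version is only the source of the banned n-grams: here they come from the (static) encoder input, so the banned-prefix table could even be precomputed once per generation.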
Looks great to me! Thanks so much for diving into this and solving the blenderbot bug!
Like the design very much as discussed offline! Super nice to be able to solve the problem in such a clean way :-)
Left a couple of nits. But overall looks good to me!
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
LysandreJik left a comment:
LGTM, this is indeed a clean fix. Do we know why our BlenderBot still behaves incorrectly compared to ParlAI?
Regarding personas, this could probably be handled directly in the ConversationalPipeline?
Before merging, please take a look at the failing tests.
I need to look deeper. By default they use FP16, and the final scores still differ by an order of magnitude (I expect they correspond to different things), but the full beam searches still look similar. I've done step-by-step debugging and the scores within the beam search are very close for many steps.
Yes, exactly my opinion.
```python
def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
    # B x num_beams
    num_hypos = scores.shape[0]
    num_beams = num_hypos // self.batch_size
```
nice - yeah that's safe!
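For context, the hypothesis layout that makes this trick safe can be sketched self-containedly. Plain Python with illustrative values; this is an assumption-labeled sketch, not the actual implementation:

```python
# Sketch (assumption): during beam search, the scores tensor holds one
# row of vocab logits per hypothesis, grouped by batch item, so the
# beam count can be recovered from the row count alone.
batch_size, num_beams, vocab_size = 2, 4, 10

# One row per (batch item, beam) pair -> batch_size * num_beams rows.
scores = [[0.0] * vocab_size for _ in range(batch_size * num_beams)]

num_hypos = len(scores)                    # B x num_beams
recovered_beams = num_hypos // batch_size  # recovers num_beams
print(recovered_beams)                     # 4

def hypo_row(batch_idx: int, beam_idx: int, num_beams: int) -> int:
    # Beam `beam_idx` of batch item `batch_idx` lives at this row.
    return batch_idx * num_beams + beam_idx
```

This is why dividing `num_hypos` by `self.batch_size` in the snippet above is safe: the row count is always an exact multiple of the batch size.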
@sgugger Can you take a look please?
@LysandreJik figured it out. It's because of some logic within ConversationalPipeline which is invalid for Blenderbot. Coming up with a follow-up PR.
sgugger left a comment:
LGTM, thanks for adding this!
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
What does this PR do?
Blenderbot results seemed off compared to the original ParlAI script:
https://parl.ai/projects/recipes/. Notably, the model seems to repeat a lot of what was said during the conversation.

The actual problem was that ParlAI's `no_repeat_ngram_size` applies to the `encoder_input_ids`, while HF's `no_repeat_ngram_size` applies to the previously generated ids (within the decoder). Blenderbot's conversation history lives in the encoder part, which explains why HF's implementation had the repetitions.

This fix focused on Blenderbot (not Blenderbot-small) and added tests for both, because they are quite different in configuration.

This change includes:
- Adding a new `EncoderNoRepeatLogitProcessor`.
- Adding 1 new argument to `generate` (`encoder_no_repeat_ngram_size`).
- Adding 1 new config parameter, `encoder_no_repeat_ngram_size`.
- Adding 2 tests: one for the pipeline (high level, with inputs that exhibited the repeat behavior) and one low level for `EncoderNoRepeatLogitProcessor`.
- Factoring `NoRepeatLogitProcessor` so that its logic could be reused.
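For illustration, the new setting sits next to the existing decoder-side one in a model config. A hedged fragment; the values here are chosen for illustration, not taken from any shipped checkpoint:

```json
{
  "no_repeat_ngram_size": 3,
  "encoder_no_repeat_ngram_size": 3
}
```

With both set, 3-grams are blocked whether they were already generated by the decoder or already present in the encoder input (i.e. the conversation history).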
Further work:
- The Blenderbot conversational pipeline still does not behave correctly, as the way input is prepared within the pipeline is still incorrect (follow-up PR).
- Blenderbot allows the bot to have personas, done by prepending "your persona: XXXX" to the input; this could be explored too in a follow-up PR.
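As a small sketch of the persona mechanism mentioned above. The exact string format Blenderbot expects is an assumption here, and the helper name is illustrative:

```python
# Hedged sketch of the ParlAI-style persona prefix described above.
# The exact format Blenderbot expects is an assumption, not confirmed.
def build_input(persona: str, utterance: str) -> str:
    # Prepend the persona line to the user's utterance.
    return f"your persona: {persona}\n{utterance}"

model_input = build_input("I love gardening.", "What do you do on weekends?")
print(model_input)
```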
Fixes # (issue)
Before submitting
- Did you read the contributor guidelines, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
@patrickvonplaten
@LysandreJik
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors which may be interested in your PR.