Better filtering of the model outputs in Trainer #8633
Merged
What does this PR do?
As discovered since merging #8530, the new model outputs sometimes (e.g. when using NVIDIA apex with the O2 optimization) lose their type and become regular dictionaries. This means we can't index into them with integers, and some rework in the internals of Trainer has become necessary. This PR does that rework, but it also takes advantage of the new dict outputs to better filter the outputs at inference.
We had several issues recently during evaluation in Trainer when using models that output past states (such as Reformer, XLNet, GPT-2). This PR introduces a new API that looks at a possible key in the config of the model to get some attributes to ignore in the outputs during evaluation (those outputs are then discarded from the predictions returned by the function Trainer.predict or passed along to metric computation in Trainer.evaluate). Since a user might have use cases where they want to ignore more keys or output those keys, a new argument is added to both Trainer.predict and Trainer.evaluate to fully control the keys ignored in those dictionaries.
If the model outputs a tuple, this is all ignored.
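
The filtering described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the code from this PR: filter_outputs is a hypothetical helper, and the names keys_to_ignore_at_inference (the config key) and ignore_keys (the new argument) are assumed here because the description does not spell them out.

```python
from types import SimpleNamespace


def filter_outputs(outputs, config=None, ignore_keys=None):
    """Hypothetical helper: return the outputs to keep, as a tuple."""
    if not isinstance(outputs, dict):
        # Tuple outputs carry no key names, so nothing is filtered.
        return tuple(outputs)
    if ignore_keys is None:
        # Fall back to the default list on the model config, if there is one.
        # The attribute name is an assumption made for this sketch.
        ignore_keys = getattr(config, "keys_to_ignore_at_inference", [])
    # Index by key, so a plain dict works as well as a typed output object.
    return tuple(v for k, v in outputs.items() if k not in ignore_keys)


# Stand-in config and outputs, with made-up values.
config = SimpleNamespace(keys_to_ignore_at_inference=["past_key_values"])
outputs = {"logits": [0.1, 0.9], "past_key_values": ["cached state"]}

default_kept = filter_outputs(outputs, config)               # config default applies
all_kept = filter_outputs(outputs, config, ignore_keys=[])   # caller keeps every key
tuple_kept = filter_outputs(([0.1, 0.9], ["cached state"]), config)  # tuples pass through
```

Passing an explicit list overrides the config default in both directions: an empty list keeps the past states, and a longer list drops additional keys.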
Fixes #8523 among others