Fix a bunch of slow tests #8634

LysandreJik · 2020-11-18T22:09:54Z

This PR fixes a bunch of slow tests.

DPR had a few issues, which this PR fixes. There was an issue where the DPRReader object was not using token_type_ids, but for some unknown reason the interaction with the underlying TFBertMainLayer that requires those crashed when using the oh-so-terrifying tf.saved_model.save.

I chose to add the token_type_ids to that model, as imo an additional feature is not a bad idea, even if it wasn't in the original model. I can imagine a bunch of reasons why users might want to have token_type_ids in that model even though it doesn't exist now.

All in all, after an hour of debugging I feel that this is the only way to have the slow tf.saved_model.save test passing on TFDPR. @patrickvonplaten @lhoestq please tell me what you think.

sgugger

LGTM.
Just so I know, who is responsible for the big MODIFY in the test_modeling_tf_dpr file?

patrickvonplaten · 2020-11-18T23:52:38Z

Good for me - thanks a lot for taking care of it!

It would probably all save us a lot of time to find once and for all a good solution for the test_saved_model_with_attentions_output and test_saved_model_with_hidden_states_output functions. I've spent way too much time trying to fix those for TFT5 as well and without finding a good solution. If you have a good idea of how to deal with this functionality/test in the future let me know @LysandreJik :-)

@sgugger - not sure where the MODIFY statements are coming from...I think we can delete it along with return_dict=True now

sgugger · 2020-11-19T00:03:35Z

@patrickvonplaten I tried but then test failed ;-)

patrickvonplaten · 2020-11-19T00:22:48Z

@patrickvonplaten I tried but then test failed ;-)

Hmm maybe @lhoestq has an idea

lhoestq · 2020-11-19T09:12:26Z

The test_modeling_dpr was added recently in #8203
Maybe @ratthachat knows why the # MODIFY are there ?
We should indeed remove them

Also I'm ok with adding token_type_ids since it's a common additional input to models based on bert

ratthachat · 2020-11-19T10:05:52Z

Hi guys, first of all I apologize if there's a problem at the MODIFY tag which is about return_dict argument.

I translated test_modeling_tf_dpr from the Pytorch's one.
If I remember correctly, I found out that there's some tests in test_modeling_tf_common.py
need return_dict=False argument.
(and when I looked at the tests, I judged that all tests just need to ensure the correct values of output,
not about return_dict argument.)
That's why I changed the config to return_dict=False as default, and left the MODIFY comments
just to note that this part was modified from the Pytorch's one.
(Again, I thought the main tests are on outputs' values)

It's my first time to write this kind of test file here, so I apologize again if I made something wrong!

patrickvonplaten · 2020-11-19T10:52:38Z

src/transformers/models/dpr/modeling_tf_dpr.py

    """

-    pooler_output: tf.Tensor
+    pooler_output: tf.Tensor = None


@LysandreJik @sgugger - for some reason, these outputs need to be initialized with something or else our much-loved tests test_compile_tf_model and test_saved_model_with_... fail. Tbh, I don't know why this is - seems to be some weird tf graph problem with positional arguments...-> @sgugger is this fine for you?

Yes, I remember that now. Had to do the same for all tf ModelOutput. This is fine indeed!

Great catch! :-)

patrickvonplaten · 2020-11-19T10:53:24Z

tests/test_modeling_tf_dpr.py

            type_vocab_size=self.type_vocab_size,
            is_decoder=False,
            initializer_range=self.initializer_range,
-            # MODIFY


Deleted all those #MODIFY statements at the cost of having to initialize the pooler_output = None as explained in the other comment

patrickvonplaten · 2020-11-19T10:56:53Z

Hi guys, first of all I apologize if there's a problem at the MODIFY tag which is about return_dict argument.

I translated test_modeling_tf_dpr from the Pytorch's one.
If I remember correctly, I found out that there's some tests in test_modeling_tf_common.py
need return_dict=False argument.
(and when I looked at the tests, I judged that all tests just need to ensure the correct values of output,
not about return_dict argument.)
That's why I changed the config to return_dict=False as default, and left the MODIFY comments
just to note that this part was modified from the Pytorch's one.
(Again, I thought the main tests are on outputs' values)

It's my first time to write this kind of test file here, so I apologize again if I made something wrong!

Absolutely no problem! I should have been more careful when reviewing your PR -> don't worry at all :-)
We also have some difficulties with those test_compile_tf_model tests in TF, so I only understand it too well why you added those return_dict=False/True statements ;-)

If you run into similar problems with TF compilation/ TF graph tests when integrating TFRAG, you can just point it out to us. It's more important to have TFRag fully work in eager mode in the beginning and then we are more then happy to help you out if you encounter problems with graph mode / compilation

ratthachat · 2020-11-19T12:39:19Z

Thanks again for your kind help @patrickvonplaten !!
Yes, as you predicted, there are similar (many more) hacks I did to make TFRag works at the moment.

When submitting PR I will make sure to list everything to you guys :)

LysandreJik · 2020-11-19T15:41:38Z

Thanks for your reviews/comments/fixes!

LysandreJik added 3 commits November 18, 2020 16:34

CI should install sentencepiece

75fe78d

Requiring TF

60b42fd

Fixing some TFDPR bugs

99ac649

LysandreJik requested review from patrickvonplaten and sgugger November 18, 2020 22:09

sgugger approved these changes Nov 18, 2020

View reviewed changes

patrickvonplaten approved these changes Nov 18, 2020

View reviewed changes

remove return_dict=False/True hack

cb3e997

patrickvonplaten reviewed Nov 19, 2020

View reviewed changes

LysandreJik merged commit f2e07e7 into master Nov 19, 2020

LysandreJik deleted the fix-slow-tests branch November 19, 2020 15:41

Fix a bunch of slow tests #8634

Fix a bunch of slow tests #8634

Uh oh!

Conversation

LysandreJik commented Nov 18, 2020

Uh oh!

sgugger left a comment

Choose a reason for hiding this comment

Uh oh!

patrickvonplaten commented Nov 18, 2020

Uh oh!

sgugger commented Nov 19, 2020

Uh oh!

patrickvonplaten commented Nov 19, 2020

Uh oh!

lhoestq commented Nov 19, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ratthachat commented Nov 19, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

patrickvonplaten Nov 19, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

sgugger Nov 19, 2020

Choose a reason for hiding this comment

Uh oh!

sgugger Nov 19, 2020

Choose a reason for hiding this comment

Uh oh!

patrickvonplaten Nov 19, 2020

Choose a reason for hiding this comment

Uh oh!

patrickvonplaten commented Nov 19, 2020

Uh oh!

ratthachat commented Nov 19, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

LysandreJik commented Nov 19, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

lhoestq commented Nov 19, 2020 •

edited

Loading

ratthachat commented Nov 19, 2020 •

edited

Loading

patrickvonplaten Nov 19, 2020 •

edited

Loading

ratthachat commented Nov 19, 2020 •

edited

Loading