Mechanism that converts startup_program initializers to BF16 #32720
luotao1 merged 3 commits into PaddlePaddle:develop
Thanks for your contribution!
The name of this function doesn't align with what it does. Please rename it.
Please add a comment explaining why you need to go through all ops instead of only looking at the ops after the current op node.
I added a comment here in the function description,
https://github.com/PaddlePaddle/Paddle/blob/4cfeb39fa5ba2c7964f14efedac7ccad9ab3e118/python/paddle/fluid/contrib/mixed_precision/fp16_utils.py#L220
Do you think I should expand it?
I was thinking more of an in-place comment explaining why you set idx = -1 and in which situations you have to search the way you do now. That would be a useful note to help future developers understand the behavior.
Actually, the comment you wrote is misleading: even when you use search_all, you are still going through the set of ops, just from the beginning. I was thinking of noting why you do it that way, for future reference.
You can also add an assert that without search_all=True the result is an empty array.
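For readers following this thread, the behavior under discussion can be sketched roughly as below. This is a simplified stand-in, not the code from fp16_utils.py: the `Op` class and the `find_post_ops` name are hypothetical, and the assumption is that the initializer op lives in the startup program while the searched ops come from the main program, so the current op never appears in the list being searched.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(eq=False)  # eq=False: ops compare by identity, like graph nodes
class Op:
    """Hypothetical stand-in for an operator: a type plus named input lists."""
    type: str
    inputs: Dict[str, List[str]] = field(default_factory=dict)


def find_post_ops(ops, cur_op, var_name, search_all=False):
    """Return the ops in `ops` that read `var_name`.

    By default only ops located after `cur_op` are inspected. With
    search_all=True the scan starts from the first op (idx = -1), which is
    needed when `cur_op` is not part of `ops` at all.
    """
    if search_all:
        idx = -1
    else:
        # If cur_op is never found, idx stays at the last position,
        # so the slice below is empty and nothing is returned.
        idx = len(ops) - 1
        for i, op in enumerate(ops):
            if op is cur_op:
                idx = i
                break
    return [
        op for op in ops[idx + 1:]
        if any(var_name in names for names in op.inputs.values())
    ]


# An initializer op (startup program) and two ops from the main program.
init_op = Op("fill_constant")
main_ops = [
    Op("matmul", {"X": ["x"], "Y": ["w"]}),
    Op("relu", {"X": ["tmp"]}),
]
```

Under these assumptions, `find_post_ops(main_ops, init_op, "w")` returns an empty list, while passing `search_all=True` finds the `matmul` op. That empty result is what the suggested assert would check.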
@luotao1 Could you start your review, please?
@lidanqing-intel Should this PR be cherry-picked to release/2.1?
…addle#32720)
* Add casting initializers for bf16 training
* Changes after review
* Correct test and add comment

…o BF16 (#32720) (#32764)
* Add casting initializers for bf16 training
* Changes after review
* Correct test and add comment
Co-authored-by: joanna.wozna.intel <[email protected]>
PR types
New features
PR changes
OPs
Describe
This PR adds a mechanism to BF16 training that converts initializers from startup_program to BF16.
The mechanism is added only to pure_bf16 mode.
Note that if you want the initializers to be converted to BF16, you need to pass the startup_program argument to the minimize function when defining the model.
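
To show what storing an initial value in BF16 means numerically, here is a minimal sketch in plain Python. It is not Paddle code: the helper names are hypothetical, and it uses simple truncation (keeping the upper 16 bits of the FP32 bit pattern), whereas a real implementation may round instead.

```python
import struct


def float32_to_bf16_bits(value: float) -> int:
    """Return the 16-bit BF16 pattern of `value` by truncating its FP32 bits."""
    fp32_bits = struct.unpack("<I", struct.pack("<f", value))[0]
    # BF16 keeps the sign, the 8 exponent bits and the top 7 mantissa bits.
    return fp32_bits >> 16


def bf16_bits_to_float32(bits16: int) -> float:
    """Expand a BF16 bit pattern back to a float by zero-filling the low bits."""
    return struct.unpack("<f", struct.pack("<I", bits16 << 16))[0]


def cast_initial_values(values):
    """Round-trip a list of initial values through BF16 (illustration only)."""
    return [bf16_bits_to_float32(float32_to_bf16_bits(v)) for v in values]


print(cast_initial_values([1.0, 0.1]))
```

Values such as 1.0 survive the conversion exactly, while 0.1 becomes 0.099609375 because only 7 mantissa bits remain.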