Design doc: Batch Normalization Operator #3748
jacquesqiao merged 4 commits into PaddlePaddle:develop from
Conversation
```python
use_global_est = False,
epsilon = 1e-6,
momentum = 0.99):
mean_cache = scope.new_var(name = 'estimated_mean', trainable = False)
```
There might be more than one `batch_norm_op` in the same topology; should we make sure that variables like `mean_cache` have unique names?
Make sure variables such as `mean_cache` are invisible to other operators; otherwise two operators may write to it.
- Yes, we need to make sure `mean_cache` has a unique name.
- `mean_cache` is defined inside `def batch_norm_layer`; could this make sure it's invisible to other operators?
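One common way to guarantee uniqueness is a process-wide counter that prefixes every cached variable name. The sketch below is an assumption for illustration, not Paddle's actual API; `unique_var_name` and the `batch_norm_%d` prefix are hypothetical names.

```python
import itertools

# Hypothetical helper: a per-process counter gives every batch_norm_layer
# call a unique prefix, so variables like 'estimated_mean' cannot collide
# when several batch_norm ops appear in one topology.
_uid = itertools.count()

def unique_var_name(base):
    """Return a globally unique variable name derived from `base`."""
    return "batch_norm_%d.%s" % (next(_uid), base)

a = unique_var_name("estimated_mean")
b = unique_var_name("estimated_mean")
# a and b differ, e.g. 'batch_norm_0.estimated_mean' vs 'batch_norm_1.estimated_mean'
```

Because the prefix is attached at layer-construction time, other operators never see a bare `estimated_mean` name they could accidentally reuse.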
paddle/operators/batch_norm_op.md (Outdated)
> `is_infer` is an attribute. Once an operator is created, its attributes cannot be changed. This suggests that we shall maintain two `batch_norm_op`s in the model: one whose `is_infer` is `True` (we call it `infer_batch_norm_op`) and one whose `is_infer` is `False` (we call it `train_batch_norm_op`). They share all parameters and variables. How to organize them is related to the Python API design, so I leave it here for further discussion.
Currently, different blocks share the same operator objects. Does this require that different blocks have their own operator objects? Could you add some details here?
We do need two distinct `batch_norm_op`s, and it seems this really requires that blocks hold their own operator objects... We shall have more discussion on it.
This design is temporarily placed on hold because it's strongly related to the Python API design.
Related: #3684
It may be easier to read here.
Some parts of the `batch_norm_op` design are strongly related to the Python API and need further discussion.
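For context on what `mean_cache` and the `momentum = 0.99` attribute from the reviewed code are for, here is a minimal numerical sketch (assumed for illustration, not Paddle code) of the standard batch-norm behavior: at train time the running statistics are updated with momentum, and at infer time they are used unchanged.

```python
momentum = 0.99
eps = 1e-6
running_mean, running_var = 0.0, 1.0  # what 'estimated_mean' etc. would cache

def train_step(batch, running_mean, running_var):
    """Update cached statistics from one mini-batch (exponential moving average)."""
    m = sum(batch) / len(batch)
    v = sum((x - m) ** 2 for x in batch) / len(batch)
    running_mean = momentum * running_mean + (1 - momentum) * m
    running_var = momentum * running_var + (1 - momentum) * v
    return running_mean, running_var

# Batch [1.0, 3.0]: batch mean = 2.0, batch var = 1.0, so the cache moves
# slightly toward them: running_mean -> 0.02, running_var -> 1.0.
running_mean, running_var = train_step([1.0, 3.0], running_mean, running_var)

def infer_normalize(x):
    """Infer-time normalization uses the cached statistics directly."""
    return (x - running_mean) / (running_var + eps) ** 0.5
```

This is why an infer-mode op must read the variables the train-mode op writes, which is the sharing question discussed above.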