Refine multi thread cpu parallel exe #11406
ReduceStrategy reduce_{ReduceStrategy::kAllReduce};
GradientScaleStrategy gradient_scale_{GradientScaleStrategy::kCoeffNumDevice};

bool share_parameter_between_cards_{false};
How are parameters shared between cards when using CUDA?
Why do we need another flag instead of using ReduceStrategy?
> How are parameters shared between cards when using CUDA?

If share_parameter_between_cards_ is true, use_cuda_ must be false and build_strategy.reduce_ must be ReduceStrategy::kReduce. There is a check for this:
https://github.com/PaddlePaddle/Paddle/pull/11406/files#diff-564dec854cf4f37015001783f71e06cbR76
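The constraint stated above can be sketched as a small standalone validation function. This is not the check from the PR itself: the `ReduceStrategy` enum below is a hypothetical Python mirror of the C++ enum named in the thread, and `validate_build_options` and its argument names are assumptions made for illustration only.

```python
from enum import Enum


class ReduceStrategy(Enum):
    # Hypothetical mirror of the C++ enum values named in the thread.
    ALL_REDUCE = 0
    REDUCE = 1


def validate_build_options(use_cuda, reduce_strategy, share_parameter_between_cards):
    """Raise ValueError if the flag combination violates the stated constraint."""
    if not share_parameter_between_cards:
        return  # no extra constraint when parameters are not shared
    if use_cuda:
        raise ValueError("share_parameter_between_cards requires use_cuda=False")
    if reduce_strategy is not ReduceStrategy.REDUCE:
        raise ValueError(
            "share_parameter_between_cards requires ReduceStrategy.REDUCE"
        )
```

Under these assumptions the flag is valid in exactly one combination of the other two settings, which is the coupling the reviewer's orthogonality question is about.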
So... why do we need the new flag rather than just using build_strategy.reduce_?
The data fields should be ORTHOGONAL.
OpRole = core.op_proto_and_checker_maker.OpRole
self._current_role = OpRole.Optimize
self._op_role_var = [var.name if isinstance(var, Variable) else var]
self._op_role_var = [
Why do we need to store both parameters and gradients?
In this case, the name of fc_0.b_0's gradient has been changed, so if we still use the gradient name fc_0.b_0@GRAD to decide which device this sgd op runs on, we will encounter errors.
So in this PR I use grad rather than GradVarName(params[0]) to look up the device it belongs to.
https://github.com/PaddlePaddle/Paddle/pull/11406/files/7cf836f4ba4353d8ba4247ca09e904098a43edf5#diff-06c27dc69562c2f50b53409969b0a9b5R435
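The failure mode described above can be illustrated with a toy lookup. This is only a sketch: the helper names, the `grad_to_device` table, and the renamed gradient `fc_0.b_0@GRAD@RENAMED` are hypothetical, and the only convention taken from the thread is that a gradient name is normally the parameter name followed by `@GRAD`.

```python
GRAD_SUFFIX = "@GRAD"


def grad_var_name(param_name):
    # Follows the naming convention visible in the thread (fc_0.b_0@GRAD).
    return param_name + GRAD_SUFFIX


def device_from_derived_name(param_name, grad_to_device):
    # Derives the gradient name from the parameter name;
    # this misses when the gradient variable was renamed.
    return grad_to_device.get(grad_var_name(param_name))


def device_from_recorded_pair(op_role_var, grad_to_device):
    # op_role_var is assumed to hold [param_name, grad_name];
    # the recorded gradient name is used directly.
    _param_name, grad_name = op_role_var
    return grad_to_device.get(grad_name)


# Hypothetical placement table: the gradient of fc_0.b_0 was renamed upstream.
grad_to_device = {"fc_0.w_0@GRAD": 0, "fc_0.b_0@GRAD@RENAMED": 1}

missing = device_from_derived_name("fc_0.b_0", grad_to_device)
found = device_from_recorded_pair(
    ["fc_0.b_0", "fc_0.b_0@GRAD@RENAMED"], grad_to_device
)
```

Here the derived name finds no device for `fc_0.b_0`, while the recorded pair does, which is the reason given for storing the gradient alongside the parameter.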
Why do we need GetNullableAttr here rather than GetAttr()?
GetAttr() would work here too.
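The difference between the two accessors can be sketched with a toy class. The sketch assumes, without confirmation from this thread, that the nullable variant returns an empty value for a missing attribute while the strict variant fails; `ToyOpDesc`, its method names, and the attribute names are hypothetical and are not the Paddle API.

```python
class ToyOpDesc:
    """Hypothetical stand-in for an op description holding named attributes."""

    def __init__(self, attrs):
        self._attrs = dict(attrs)

    def get_attr(self, name):
        # Strict lookup: a missing attribute is an error.
        if name not in self._attrs:
            raise KeyError(f"attribute {name!r} not found")
        return self._attrs[name]

    def get_nullable_attr(self, name):
        # Lenient lookup: a missing attribute yields None.
        return self._attrs.get(name)


op = ToyOpDesc({"some_attr": ["fc_0.b_0", "fc_0.b_0@GRAD"]})
strict = op.get_attr("some_attr")
lenient = op.get_nullable_attr("some_attr")
absent = op.get_nullable_attr("not_set")
```

When the attribute is always set, as the reply suggests is the case at this call site, both lookups return the same value and the strict one is sufficient.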
* refine multi-thread CPU Parallel exe
* refine multi thread CPU Parallel exe
* Refine CPU version for ParallelExecutor
* add share_parameter_between_cards_
* Fix ParallelExecutor bug
* Fix unit test
* Fix parameter opt balance
* Fix with opti (param->grad)
* Add grad to op var
* Remove shard_param_between_cards
