Enable program passes on Fleet APIs (#34955)
Conversation
Thanks for your contribution!
def apply_ir_passes(main_program, startup_program, config):
    build_strategy = config._user_defined_strategy.build_strategy._copy()
    if not paddle.fluid.core.globals()['FLAGS_apply_pass_to_program']:
You can use _global_flags() instead of paddle.fluid.core.globals().
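A minimal pure-Python sketch of the gating logic under discussion (the flags dict and the toy program/pass below are stand-ins for illustration, not Paddle's real structures): applying IR passes should be a no-op unless FLAGS_apply_pass_to_program is set.

```python
# Hypothetical stand-in for the global FLAGS registry; in Paddle this would
# come from _global_flags() rather than paddle.fluid.core.globals().
GLOBAL_FLAGS = {'FLAGS_apply_pass_to_program': False}

def apply_ir_passes(main_program, passes):
    """Return the program with passes applied, or unchanged if the flag is off."""
    if not GLOBAL_FLAGS['FLAGS_apply_pass_to_program']:
        return main_program  # feature is opt-in; leave the program untouched
    for ir_pass in passes:
        main_program = ir_pass(main_program)
    return main_program

# Toy "program" and "pass" to show the gating behavior.
program = ['forward', 'backward', 'optimize']
fuse_pass = lambda prog: prog + ['fused']

assert apply_ir_passes(program, [fuse_pass]) == program  # flag off: no-op
GLOBAL_FLAGS['FLAGS_apply_pass_to_program'] = True
assert apply_ir_passes(program, [fuse_pass]) == \
    ['forward', 'backward', 'optimize', 'fused']
```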
paddle/fluid/framework/ir/pass.cc
Outdated
#include "paddle/fluid/platform/mkldnn_helper.h"
#endif

DEFINE_bool(apply_pass_to_program, false,
    'bias': 0.0,
    'bias_after_scale': False
})
new_grad.op._set_attr(op_maker.kOpRoleAttrName(),
Use main_program._optimized_guard()?
Some operators are marked as kBackward, so main_program._optimized_guard() is not applicable here.
if(WITH_DISTRIBUTE)
    set_tests_properties(test_new_group_api PROPERTIES TIMEOUT 120)
-   set_tests_properties(test_pipeline PROPERTIES TIMEOUT 120)
+   set_tests_properties(test_pipeline PROPERTIES TIMEOUT 240)
Try not to change unit-test timeouts; first check whether the test itself can be optimized, otherwise the CI burden will grow too large.
The unit test has been split, but after splitting there is still one test, test_ir_pass_pipeline, with a 120s timeout.
paddle/fluid/platform/flags.cc
Outdated
 * Fleet APIs.
 * Note: Apply IR pass to program. Be only useful when using Fleet APIs.
 */
DEFINE_bool(apply_pass_to_program, false,
Force-pushed from 3e20dc3 to c123166
* add fleet api for program pass
* turn on apply pass for CI test
* fix disable fuse_all_optimizer bug
* try to test ci
* fix CI
* fill unspecified op role
* fix fuse_allreduce
* add ut to improve coverage
* remove useless change
* improve c++ coverage
* follow some comments
* test ir pass pipeline
* update doc
* reduce ut time again
block = self.main_program.global_block()

last_backward_op_idx = None
for i, op in enumerate(reversed(gm_block.ops)):
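A self-contained sketch of what the reverse scan in the quoted snippet computes (the op representation and backward predicate here are simplified stand-ins): find the index of the last backward op so later code can insert after it.

```python
# Find the index of the last backward op by scanning the op list in reverse.
# `ops` and `is_backward` are simplified stand-ins for Paddle's Block.ops and
# the kBackward op-role check.
def find_last_backward_op_idx(ops, is_backward):
    for i, op in enumerate(reversed(ops)):
        if is_backward(op):
            return len(ops) - 1 - i  # convert reversed index to forward index
    return -1  # no backward op found

ops = ['read', 'matmul', 'matmul_grad', 'sum_grad', 'sgd']
is_backward = lambda op: op.endswith('_grad')
assert find_last_backward_op_idx(ops, is_backward) == 3  # index of 'sum_grad'
```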
return

gm_block._insert_op(
    last_backward_op_idx,
Shouldn't this be last_backward_op_idx + 1? Insert after the last backward op and before the optimize ops. In that case the default value of last_backward_op_idx should be -1.
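The index arithmetic in the comment above can be checked with a plain-list sketch (the op names are illustrative stand-ins): inserting at last_backward_op_idx + 1 places the new op after the last backward op, and a default of -1 puts it at the front when no backward op exists.

```python
# Sketch of the suggested insertion position: last_backward_op_idx + 1 lands
# the new op after the last backward op and before the optimize ops.
def insert_after_backward(ops, new_op, last_backward_op_idx=-1):
    ops = list(ops)
    ops.insert(last_backward_op_idx + 1, new_op)
    return ops

ops = ['matmul', 'matmul_grad', 'sgd']
assert insert_after_backward(ops, 'c_allreduce_sum', 1) == \
    ['matmul', 'matmul_grad', 'c_allreduce_sum', 'sgd']
# Default -1: no backward op found, so the new op goes to index 0.
assert insert_after_backward([], 'c_allreduce_sum') == ['c_allreduce_sum']
```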
outputs={'Out': g},
attrs={
    'ring_id': ring_id,
    OP_ROLE_KEY: OpRole.Backward,
Using OpRole.Optimize might be more accurate here; placing it in backward mainly lets ParallelExecutor overlap it. It doesn't matter much in practice, though: pipeline has its own gradient merge, and the others won't use this.
Yeah, the OpRole is set too loosely at the moment. Or rather, there is no unified convention; it gets set as each need arises...

PR types
New features
PR changes
Others
Describe
Enable program passes on Fleet APIs. Related doc PR: PaddlePaddle/docs#3854
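Since the new behavior is gated behind DEFINE_bool(apply_pass_to_program, false, ...), it is opt-in. A hedged usage sketch, assuming the standard Paddle FLAGS_* environment-variable mechanism applies to this flag (the script name train.py is a placeholder, not from this PR):

```shell
# Opt in to applying IR passes to the program when using Fleet APIs.
export FLAGS_apply_pass_to_program=1

# Launch the distributed job as usual; "train.py" is a placeholder script name.
python -m paddle.distributed.launch train.py
```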