Part I: Construct runtime graph #1

LiYuRio · 2021-09-14T07:27:07Z

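Builds the runtime execution graph. To serve both auto-parallel and heterogeneous PS, users can specify whether the Program is split at "coarse" or "fine" granularity. This PR mainly implements the graph-construction logic for the coarse-grained split. Coarse-grained means block-level partitioning, which lets the runtime strictly follow the "IFIB" scheduling logic; fine-grained means op-level partitioning.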

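Split the Program by function into multiple blocks: Feed, forward, backward, communication, Optimizer, and Fetch. In the auto-parallel scenario, op_role and op_name serve as the splitting criteria for now.
Build the Runtime Graph: traverse backward starting from the last block, compute the VarNode dependencies, create the VarNodes and TaskNodes, and record the upstream and downstream dependencies inside each Node.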
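
The two steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Op`, `Role`, `TaskNode`, `VarNode`, `RuntimeGraph`, `Classify`, `Partition`, and `BuildGraph` are hypothetical stand-ins, the rule that an op name starting with `c_` marks a communication op is an assumption, and so are the choices to group only consecutive ops into a block and to link each variable to its nearest earlier producer.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are not Paddle's real types.
enum class Role { kFeed, kForward, kBackward, kComm, kOptimize, kFetch };

struct Op {
  std::string name;
  Role role;  // plays the part of op_role
  std::vector<std::string> inputs, outputs;
};

struct TaskNode {  // one coarse-grained block
  Role role;
  std::vector<Op> ops;
  std::vector<int> upstream, downstream;  // indices into RuntimeGraph::tasks
};

struct VarNode {  // one produced value and the blocks that read it
  std::string name;
  int producer;  // -1 while no producing block has been found
  std::vector<int> consumers;
};

struct RuntimeGraph {
  std::vector<TaskNode> tasks;
  std::vector<VarNode> vars;
};

// Assumed rule: a name starting with "c_" marks a communication op;
// every other op is classified by its role.
Role Classify(const Op& op) {
  return op.name.rfind("c_", 0) == 0 ? Role::kComm : op.role;
}

// Step 1: consecutive ops of the same class form one block.
std::vector<TaskNode> Partition(const std::vector<Op>& program) {
  std::vector<TaskNode> tasks;
  for (const Op& op : program) {
    Role r = Classify(op);
    if (tasks.empty() || tasks.back().role != r) {
      tasks.push_back(TaskNode{r, {}, {}, {}});
    }
    tasks.back().ops.push_back(op);
  }
  return tasks;
}

void AddUnique(std::vector<int>* v, int x) {
  if (std::find(v->begin(), v->end(), x) == v->end()) v->push_back(x);
}

// Step 2: walk the blocks from last to first and link every variable
// to the nearest earlier block that produces it.
RuntimeGraph BuildGraph(const std::vector<Op>& program) {
  RuntimeGraph g;
  g.tasks = Partition(program);
  std::map<std::string, int> pending;  // var name -> unresolved VarNode index
  for (int i = static_cast<int>(g.tasks.size()) - 1; i >= 0; --i) {
    std::set<std::string> produced, external_inputs;
    for (const Op& op : g.tasks[i].ops) {
      for (const auto& in : op.inputs)
        if (!produced.count(in)) external_inputs.insert(in);
      for (const auto& out : op.outputs) produced.insert(out);
    }
    // Outputs first: this block is the nearest producer for later readers.
    for (const auto& out : produced) {
      auto it = pending.find(out);
      if (it == pending.end()) continue;
      VarNode& var = g.vars[it->second];
      var.producer = i;
      for (int c : var.consumers) {
        assert(c > i);  // readers were visited earlier, so they run later
        AddUnique(&g.tasks[i].downstream, c);
        AddUnique(&g.tasks[c].upstream, i);
      }
      pending.erase(it);
    }
    // Then inputs: they wait for a producer among the remaining blocks.
    for (const auto& in : external_inputs) {
      auto it = pending.find(in);
      if (it == pending.end()) {
        g.vars.push_back(VarNode{in, -1, {i}});
        pending[in] = static_cast<int>(g.vars.size()) - 1;
      } else {
        AddUnique(&g.vars[it->second].consumers, i);
      }
    }
  }
  return g;
}
```

Calling `BuildGraph` on a flat op list returns the blocks with their `upstream` and `downstream` indices filled in. A `VarNode` whose `producer` stays `-1` is a variable that no block in the list writes before it is read, such as a parameter initialized elsewhere.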

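To be determined:

The name of the executor
The relationship between the two kinds of source nodes, Feed and reader
Compatibility with the PS scenario, with an example that actually runs.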

gongweibao · 2021-09-17T08:01:48Z

paddle/fluid/framework/event_based_executor.cc

+void EventBasedExecutor::Compile(const ProgramDesc& program,
+                                 const std::string& grain) {
+  if (grain == "coarse") {
+    CompileCoarseGrainGraph(program);


gongweibao · 2021-09-17T08:31:10Z

paddle/fluid/framework/event_based_executor.h

+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#pragma once


Put this under fleet, for example at the same level as sectionworker.


compile phase for async executor

af16215

LiYuRio force-pushed the dev_executor branch from 26d5923 to af16215 Compare September 15, 2021 03:45

gongweibao reviewed Sep 17, 2021


FeixLiu pushed a commit that referenced this pull request Nov 29, 2021

Added Eager Dygraph AutoCodeGen dependencies #1 (PaddlePaddle#37574)

fcd44b5

FeixLiu pushed a commit that referenced this pull request Nov 29, 2021

Added performance tests for Eager Dygraph #1 (PaddlePaddle#37638)

7df301f

LiYuRio pushed a commit that referenced this pull request Feb 21, 2022

infershaped autogen (PR #1), test=develop (PaddlePaddle#39405)

b3e049f

LiYuRio pushed a commit that referenced this pull request Feb 21, 2022

Fixed get_tensor method for EagerTensor (PaddlePaddle#39414)

9722994

* Enabled Eager OpTest #1 * Enabled Eager OpTest #1 * Fixed get_tensor method for EagerTensor

3 participants
