Make fuse_optimizer_op_pass also work when the model contains sparse gradients. by chengduoZH · Pull Request #18664 · PaddlePaddle/Paddle

chengduoZH · 2019-07-17T05:23:14Z

Make fuse_optimizer_op_pass also work when the model contains sparse gradients.
Polish the code which are used by fuse_optimizer_op_pass support.

test=develop

gongweibao · 2019-07-18T06:33:27Z

paddle/fluid/framework/ir/coalesce_grad_tensor_pass.cc

+        result.Get<details::ParamsAndGrads>(details::kParamsAndSparseGrads);
+
+    for (auto &param_grad : params_grads) {
+      if (IsSupportedVarType(GetTypeOfVar(vars_info, param_grad.second))) {


IsLodTensorVartype or IsDenseGradVarType

gongweibao · 2019-07-18T06:49:05Z

paddle/fluid/framework/ir/fuse_optimizer_ops_pass/fuse_optimizer_op_pass.cc

+    if (node->Op()->Type() == fuse_op_type) {
+      auto grad_name = node->Op()->Input(kGrad);
+      PADDLE_ENFORCE_EQ(grad_name.size(), static_cast<size_t>(1));
+      if (GettypeOfVar(vars_info, grad_name[0]) == proto::VarType::LOD_TENSOR) {


IsDenseGradVarType

gongweibao · 2019-07-18T08:26:24Z

paddle/fluid/framework/ir/fuse_optimizer_ops_pass/fuse_optimizer_op_pass.cc

  const std::string prefix(details::kFusedVarNamePrefix);
-  // NOTE: the fused_var_name should be unique.
  for (auto &var_name : aux_var_names) {
+    // NOTE: the fused_var_name should be unique.


How to guarantee it?

Line 81 is used to check this.

test=develop

chengduoZH · 2019-07-18T08:34:54Z

paddle/fluid/framework/ir/fuse_optimizer_ops_pass/fuse_optimizer_op_pass.cc

  const std::string prefix(details::kFusedVarNamePrefix);
-  // NOTE: the fused_var_name should be unique.
  for (auto &var_name : aux_var_names) {
+    // NOTE: the fused_var_name should be unique.


Line 81 is used to check this.

chengduoZH · 2019-07-18T08:37:12Z

paddle/fluid/framework/ir/pass.cc

                 "The VarDescs of persistable variable are not consistency.");
-  PADDLE_ENFORCE(graph == native_graph,
-                 "Pass::Apply() cannot delete the passed graph and shouldn't "
-                 "return a new graph.(For the need of pybind11)");


This check is unnecessary.

chengduoZH · 2019-07-18T08:44:28Z

paddle/fluid/framework/ir/coalesce_grad_tensor_pass.cc

+        result.Get<details::ParamsAndGrads>(details::kParamsAndSparseGrads);
+
+    for (auto &param_grad : params_grads) {
+      if (IsSupportedVarType(GetTypeOfVar(vars_info, param_grad.second))) {


chengduoZH · 2019-07-18T08:44:36Z

paddle/fluid/framework/ir/fuse_optimizer_ops_pass/fuse_optimizer_op_pass.cc

+    if (node->Op()->Type() == fuse_op_type) {
+      auto grad_name = node->Op()->Input(kGrad);
+      PADDLE_ENFORCE_EQ(grad_name.size(), static_cast<size_t>(1));
+      if (GettypeOfVar(vars_info, grad_name[0]) == proto::VarType::LOD_TENSOR) {


gongweibao · 2019-07-18T09:03:59Z

paddle/fluid/framework/ir/fuse_optimizer_ops_pass/fuse_optimizer_op_pass.cc

-  if (result.Has(details::kParamsAndGrads)) {
-    auto &params_grads =
-        result.Get<details::ParamsAndGrads>(details::kParamsAndGrads);
+  if (result.Has(details::kParamsAndDenseGrads)) {


This nested if is too long.

gongweibao

LGTM

test=develop

chengduoZH · 2019-07-22T01:59:58Z

paddle/fluid/framework/details/scope_buffered_ssa_graph_executor.cc

+        }
+      }
+    }
+  }


Define the fused variables in the local execution scope.
Because for some model, there may be more than one program, and those programs may share some parameters, for the previous strategy, the gradients of the shared parameters of those programs are also shared, But this is somewhat problematic, so we should define those fused variables of gradients in the local execution scope.

Copy these line to Comments may be better
And which is the unit test?

test_graph.py

gongweibao

LGTM

test=develop

gongweibao

LGTM

Xreki

LGTM

chengduozh added 2 commits July 17, 2019 13:13

support sparse gradients

8927dc6

test=develop

polish code

e5414b1

test=develop

chengduoZH requested review from Xreki and gongweibao July 17, 2019 08:22

chengduozh added 3 commits July 17, 2019 17:14

update alloc_continuous_space_for_grad_pass

e3fbe6e

test=develop

use coalesce_tensor

8ca4734

test=develop

polish unit test

de2dc0a

test=develop

chengduoZH force-pushed the support_sparse_gradient branch 2 times, most recently from 7bdea4e to 4a73988 Compare July 17, 2019 12:51

open fuse_all_optimizer_ops

419f342

test=develop

chengduoZH force-pushed the support_sparse_gradient branch from 4a73988 to 419f342 Compare July 18, 2019 05:22

chengduoZH changed the title ~~Support sparse gradients for fuse_optimizer_op_pass~~ Make fuse_optimizer_op_pass also work when the model contains sparse gradients. Jul 18, 2019

gongweibao requested changes Jul 18, 2019

View reviewed changes

follow comments

3d011e7

test=develop

chengduoZH commented Jul 18, 2019

View reviewed changes

gongweibao reviewed Jul 18, 2019

View reviewed changes

chengduoZH force-pushed the support_sparse_gradient branch from 864393f to 3d011e7 Compare July 18, 2019 09:45

gongweibao previously approved these changes Jul 18, 2019

View reviewed changes

polish code

3e59e2f

test=develop

chengduoZH dismissed gongweibao’s stale review via 3e59e2f July 21, 2019 01:17

chengduoZH force-pushed the support_sparse_gradient branch from 095c018 to 464b882 Compare July 21, 2019 04:56

disable fuse optimization op in test_buffer_shared_inplace_pass

126d0a0

test=develop

chengduoZH force-pushed the support_sparse_gradient branch from 464b882 to 126d0a0 Compare July 21, 2019 07:42

disable fuse_optimization

4458082

test=develop

chengduoZH commented Jul 22, 2019

View reviewed changes

gongweibao previously approved these changes Jul 22, 2019

View reviewed changes

fix fuse all reduce

be6afc5

test=develop

chengduoZH dismissed gongweibao’s stale review via be6afc5 July 22, 2019 12:15

gongweibao approved these changes Jul 23, 2019

View reviewed changes

Xreki approved these changes Jul 23, 2019

View reviewed changes

chengduoZH merged commit fd3aad6 into PaddlePaddle:develop Jul 23, 2019

Conversation

chengduoZH commented Jul 17, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

gongweibao left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

gongweibao left a comment

Choose a reason for hiding this comment

Uh oh!

gongweibao left a comment

Choose a reason for hiding this comment

Uh oh!

Xreki left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

chengduoZH commented Jul 17, 2019 •

edited

Loading