cache runtime_context#16002

Merged
luotao1 merged 15 commits intoPaddlePaddle:developfrom
luotao1:runtime_context
Mar 19, 2019
Conversation

@luotao1
Contributor

@luotao1 luotao1 commented Mar 1, 2019

RuntimeContext relates the input/output names of an Operator to the corresponding variables in the Scope. Since an Operator's input/output names do not change during execution, the RuntimeContext can be created only at the first iteration and reused afterwards, saving the lookup time.

In the inference of PyramidDNN (a small model):

|         | 2450 v2 (ms per sample) | 2620 v3 (ms per sample) |
|---------|-------------------------|-------------------------|
| before  | 0.305032                | 0.228627                |
| after   | 0.239653                | 0.190147                |
| speedup | 21%                     | 16%                     |

@luotao1 luotao1 requested a review from chengduoZH March 5, 2019 02:01
@chengduoZH chengduoZH requested a review from panyx0718 March 5, 2019 02:06

/// Find whether a variable is in the current scope.
/// Return false if cannot find.
bool HasLocalVar(const std::string& name) const;
Contributor

where is this used?

Contributor Author

@luotao1 luotao1 Mar 5, 2019

I want to use scope.parent()->HasLocalVar(kLocalExecScopeName) to replace scope->FindVar(kLocalExecScopeName) in this PR, but the speedup is not noticeable. I will remove it later.

Contributor Author

Removed the unused HasLocalVar function.

// in the execution, RuntimeContext could be created only at the first
// iteration of the execution to save the elapsed time.
// Note that the Scope should not be the local scope, since local scope
// would be cleaned regularly.
Contributor

Making this the default is dangerous, because there is no restriction preventing the global scope from changing.

Contributor Author

@luotao1 luotao1 Mar 13, 2019

ca34c90 adds runtime_context_cache_pass to do it, and makes this default to false in the inference analysis_config. @panyx0718 @Superjomn

Contributor Author

enable_ir_optim was printed twice before; see lines 74 and 75.

@luotao1 luotao1 requested a review from Superjomn March 13, 2019 09:59
@luotao1
Contributor Author

luotao1 commented Mar 15, 2019

This fails on TensorRT because two extra runtime_context_cache_pass runs are appended at the end, for the same reason as #16175. Thus, this PR sets enable_runtime_context_cache_ to false by default; after #16175 is fixed, enable_runtime_context_cache_ will default to true.
http://ci.paddlepaddle.org/viewLog.html?buildId=70592&tab=buildLog&buildTypeId=Paddle_PrCi&logTab=tree&filter=all&_focus=18755

[06:53:00]	I0315 06:52:48.986371 71768 analysis_predictor.cc:367] TensorRT subgraph engine is enabled
[06:53:00]	--- Running analysis [ir_graph_build_pass]
[06:53:00]	--- Running analysis [ir_analysis_pass]
[06:53:00]	--- Running IR pass [infer_clean_graph_pass]
[06:53:00]	--- Running IR pass [identity_scale_op_clean_pass]
[06:53:00]	--- Running IR pass [tensorrt_subgraph_pass]
[06:53:00]	---  detect a sub-graph with 305 nodes
[06:53:00]	--- Running IR pass [conv_affine_channel_fuse_pass]
[06:53:00]	--- Running IR pass [conv_eltwiseadd_affine_channel_fuse_pass]
[06:53:00]	--- Running IR pass [conv_elementwise_add_act_fuse_pass]
[06:53:00]	--- Running IR pass [conv_elementwise_add2_act_fuse_pass]
[06:53:00]	--- Running IR pass [conv_elementwise_add_fuse_pass]
[06:53:00]	--- Running IR pass [transpose_flatten6_concat_fuse_pass]
[06:53:00]	--- Running IR pass [transpose_flatten5_concat_fuse_pass]
[06:53:00]	--- Running IR pass [transpose_flatten4_concat_fuse_pass]
[06:53:00]	--- Running IR pass [transpose_flatten3_concat_fuse_pass]
[06:53:00]	--- Running IR pass [runtime_context_cache_pass]
[06:53:00]	--- Running IR pass [runtime_context_cache_pass]
[06:53:00]	--- Running analysis [ir_params_sync_among_devices_pass]
[06:53:00]	I0315 06:52:49.501626 71768 ir_params_sync_among_devices_pass.cc:41] Sync params from CPU to GPU

@luotao1
Contributor Author

luotao1 commented Mar 15, 2019

[12:10:22]	[Step 1/1] + APPROVALS=FALSE
[12:10:22]	[Step 1/1] + echo 'current pr 16002 got approvals: FALSE'
[12:10:22]	[Step 1/1] + '[' FALSE == FALSE ']'
[12:10:22]	[Step 1/1] + '[' paddle/fluid/framework/operator.h == paddle/fluid/API.spec ']'
[12:10:22]	[Step 1/1] + echo 'You must have panyx0718 approval for the api change! paddle/fluid/framework/operator.h'
[12:10:22]	[Step 1/1] + exit 1
[12:10:22]	[Step 1/1] current pr 16002 got approvals: FALSE
[12:10:22]	[Step 1/1] You must have panyx0718 approval for the api change! paddle/fluid/framework/operator.h

@luotao1
Contributor Author

luotao1 commented Mar 15, 2019

@panyx0718 Could you start the review again? The latest commit passes all CI checks except the API-change check.

VLOG(3) << "Applies Runtime Context Cache strategy.";
for (const Node* n : graph->Nodes()) {
  if (n->IsOp()) {
    n->Op()->SetAttr(kEnableRuntimeContext, true);
Contributor

kEnableCacheRuntimeContext?

Contributor Author

Done


/// RuntimeContext is used to relate input/output names of Operator with
/// the corresponding variables in Scope.
/// If an Op has attribute kEnableRuntimeContext, it means that in a same Scope,
Contributor

in a name scope?

Contributor Author

Done

@Superjomn
Contributor

Superjomn commented Mar 18, 2019

Making the contexts members of the operator is much simpler; there is no need to make them attributes, or the operators will have too many attributes, with the runtime attributes mixed with the algorithms'.

Currently, the operator interface is over-implemented/designed. For inference it should be much simpler; use a MACRO to wrap them as members?

Do we really need to change the runtime context or infer-shape context and suffer that complexity?

@luotao1
Contributor Author

luotao1 commented Mar 18, 2019

Making the contexts members of the operator is much simpler

I tried this at first, but it fails on the ParallelExecutor unit tests and the distributed async unit tests.

  • In ParallelExecutor, since it creates local scopes and the local scopes are cleaned regularly, we should use scope.FindVar(details::kLocalExecScopeName) to detect whether a scope is a local scope.
    784826a
  • After fixing the ParallelExecutor unit tests, I tried to fix the distributed async unit tests. However, if the context is a member of the operator, the distributed async unit tests fail randomly. As discussed with @typhoonzero, the distributed async unit tests need to create the runtime_context in a thread.

for inference, it should be much simpler, use MACRO to wrap them with the members

Do you mean using #ifdef ON_INFER with the context members?
I tried this as well, but it fails on the TensorRT unit tests. The error is "Attribute 'subgraph' is required! at [/paddle/paddle/fluid/framework/attribute.h:276]".
Do you mean using #ifdef ON_INFER and #ifndef PADDLE_WITH_TENSORRT? But then there is no speedup in an inference library built with TensorRT.

Do we really need to change the runtime context or infer shape context? suffer that complexity?

The reason for changing the runtime context is that caching it gives an obvious speedup for small inference models. Besides, we will cache the kernel choice in #16004.

@panyx0718 @Superjomn What do you think about using runtime_context_cache_pass versus the MACRO approach (#ifdef ON_INFER and #ifndef PADDLE_WITH_TENSORRT)?

@luotao1 luotao1 merged commit dbb92ee into PaddlePaddle:develop Mar 19, 2019
@luotao1 luotao1 deleted the runtime_context branch March 19, 2019 01:31
@luotao1 luotao1 restored the runtime_context branch March 19, 2019 14:01
@luotao1 luotao1 deleted the runtime_context branch May 10, 2019 09:28