use static variable to do cache instead of thread local in thread frequent switching case by LeoZhao-Intel · Pull Request #18428 · PaddlePaddle/Paddle

LeoZhao-Intel · 2019-07-01T03:18:10Z

This is a special case: NativeExecutor Predictor.run is called while the calling thread switches frequently, and for mkldnn the creation of the TransferScope cache is unavoidable. The current cache mechanism stores the cache in a thread-local variable, so if the execution thread changes on every iteration, a new cache is allocated each time, which results in a large memory leak.

The solution is to use a global static variable for this case, which the user has to enable explicitly. Here I reuse SetMkldnnThreadid for that.
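The leak pattern described above can be illustrated with a minimal, self-contained C++ sketch. It is not the code from transfer_scope_cache.cc: `Cache`, `ThreadLocalCache`, `StaticCache`, and `CountCaches` are hypothetical names, and the sketch assumes the per-thread cache is a heap-allocated map held through a `thread_local` pointer that is never freed.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <unordered_map>

// Hypothetical stand-in for the real transfer cache type.
using Cache = std::unordered_map<std::size_t, int>;

// Number of Cache objects allocated since the last reset.
std::atomic<int> g_caches_created{0};

// Per-thread cache: every thread that calls this allocates its own map.
// The raw pointer is never freed (deliberately, to mirror the assumed leak),
// so short-lived threads leave their maps behind.
Cache& ThreadLocalCache() {
  thread_local Cache* cache = [] {
    ++g_caches_created;
    return new Cache;
  }();
  return *cache;
}

// Process-wide cache: allocated once and shared by all threads.
// It is not synchronized, so it is only safe when one thread uses it at a time.
Cache& StaticCache() {
  static Cache* cache = [] {
    ++g_caches_created;
    return new Cache;
  }();
  return *cache;
}

// Runs `iterations` calls, each on a fresh thread, and returns how many
// caches were allocated along the way.
int CountCaches(bool use_static, int iterations) {
  assert(iterations >= 0);
  g_caches_created = 0;
  for (int i = 0; i < iterations; ++i) {
    std::thread worker([use_static, i] {
      Cache& cache = use_static ? StaticCache() : ThreadLocalCache();
      cache[static_cast<std::size_t>(i)] = i;
    });
    worker.join();  // one thread at a time, as in the single-instance case
  }
  return g_caches_created.load();
}
```

In this sketch the thread-local variant allocates one cache per thread, so the allocation count grows with the number of iterations, while the static variant allocates at most one cache no matter how many threads run.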

fix #18372 (comment)

…uent switching case to avoid memory leak test=develop

paddle/fluid/framework/transfer_scope_cache.cc

LeoZhao-Intel · 2019-07-01T03:40:36Z

The static variable plus an API call keeps this simple. If we used a single global structure instead, it would be more complicated: it would need thread synchronization and so on, and efficiency would drop.

test=develop

luotao1 · 2019-07-01T08:05:32Z

@LeoZhao-Intel, discussed with @Superjomn:

Do you have some comments for "-1", or do you have a unit test to ensure this "-1" behavior?
Do you need to add a lock for static_transfer_scope_cache? When multiple instances use "-1" mode, it will fail.

LeoZhao-Intel · 2019-07-01T09:31:47Z

> @LeoZhao-Intel, discussed with @Superjomn:
>
> Do you have some comments for "-1", or do you have a unit test to ensure this "-1" behavior?

-1 is specific to the mkldnn single-instance case with frequent thread switching. So far we don't have a unit test for it.

> Do you need to add a lock for static_transfer_scope_cache? When multiple instances use "-1" mode, it will fail.

In the mkldnn single-instance case with frequent thread switching, only one instance is supported, so yes, it will fail with multiple instances. We plan to refine this solution in the future.
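For reference, the lock that the reviewers ask about could look roughly like the following sketch. It is only an illustration under stated assumptions, not what this PR implements: `LockedCache`, `SharedCache`, and `RunConcurrently` are hypothetical names, and the cached value (`key * 2`) is a placeholder for whatever the real cache would store.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical cache whose accesses are all serialized by one mutex.
class LockedCache {
 public:
  // Returns the value for `key`, creating and storing it on a miss.
  int GetOrCreate(std::size_t key) {
    std::lock_guard<std::mutex> guard(mutex_);
    auto it = entries_.find(key);
    if (it == entries_.end()) {
      // Placeholder for the expensive creation step.
      it = entries_.emplace(key, static_cast<int>(key) * 2).first;
    }
    return it->second;
  }

  // Number of entries currently stored.
  std::size_t Size() {
    std::lock_guard<std::mutex> guard(mutex_);
    return entries_.size();
  }

 private:
  std::mutex mutex_;
  std::unordered_map<std::size_t, int> entries_;
};

// Single process-wide instance shared by every thread.
LockedCache& SharedCache() {
  static LockedCache cache;
  return cache;
}

// Several threads request the same keys at once; returns the final size.
std::size_t RunConcurrently(int num_threads, int keys_per_thread) {
  assert(num_threads > 0);
  std::vector<std::thread> workers;
  for (int t = 0; t < num_threads; ++t) {
    workers.emplace_back([keys_per_thread] {
      for (int k = 0; k < keys_per_thread; ++k) {
        SharedCache().GetOrCreate(static_cast<std::size_t>(k));
      }
    });
  }
  for (auto& worker : workers) worker.join();
  return SharedCache().Size();
}
```

The trade-off is the one raised earlier in the thread: every lookup now takes the mutex, so concurrent instances stay correct but contend with each other, which is the efficiency cost the unsynchronized static variable avoids.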

jczaja · 2019-07-01T18:32:21Z

@luotao1, @Superjomn Why does GPU execution need to store TransferScope in TLS? Why does each thread have to keep its own copy of the scope and data caches? In other words, what is wrong with having only one global data and scope cache?

luotao1 · 2019-07-02T03:42:56Z

> Why does GPU execution need to store TransferScope in TLS? Why does each thread have to keep its own copy of the scope and data caches? In other words, what is wrong with having only one global data and scope cache?

@jczaja CUDADeviceContext is thread_local, i.e. one GPU card has one CUDADeviceContext.

test=develop

paddle/fluid/framework/transfer_scope_cache.cc

LeoZhao-Intel · 2019-07-02T06:24:25Z

I guess it is similar to the mkldnn cache: to support multi-instance, thread_local is used to simplify the implementation.

… mt_memoryleak

test=develop

paddle/fluid/inference/tests/api/analyzer_bert_tester.cc

test=develop

luotao1

LGTM

luotao1 · 2019-07-03T15:01:23Z

[14:29:15]	/home/teamcity/work/e84e6e698a3f913d/paddle/fluid/inference/tests/api/analyzer_bert_tester.cc:256:32: error: no member named 'set_cur_mkldnn_session_id' in namespace 'paddle::platform'
[14:29:15]	      if (is_static) platform::set_cur_mkldnn_session_id(1);
[14:29:15]	                     ~~~~~~~~~~^

In mac_ci

paddle/fluid/inference/tests/api/analyzer_bert_tester.cc

test=develop

Since Mac don't support MKL and MKLDNN now, we should skip `TEST(Analyzer_bert, static_transfer_scope_cache)` ON Mac test=develop
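The guard described in this commit message can be sketched as follows. Only the macro name `PADDLE_WITH_MKLDNN` and the test name come from this thread; `RegisteredTests`, `HasTest`, and the `always_built_test` entry are hypothetical stand-ins, and the real test file uses a test framework rather than a list of strings.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns the names of the tests compiled into this build.
std::vector<std::string> RegisteredTests() {
  std::vector<std::string> tests;
  tests.push_back("always_built_test");  // hypothetical unguarded test
#ifdef PADDLE_WITH_MKLDNN
  // Only compiled when the build defines PADDLE_WITH_MKLDNN, so code that
  // refers to mkldnn-only symbols never reaches a compiler on other builds.
  tests.push_back("Analyzer_bert.static_transfer_scope_cache");
#endif
  return tests;
}

// Reports whether a test with the given name is part of this build.
bool HasTest(const std::string& name) {
  assert(!name.empty());
  for (const std::string& test : RegisteredTests()) {
    if (test == name) return true;
  }
  return false;
}
```

Wrapping the whole test in the macro, rather than only the mkldnn call inside it, means a build without mkldnn does not see the test at all, which avoids the "no member named" compile error reported above.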

luotao1

LGTM

luotao1 · 2019-07-29T06:42:04Z

@LeoZhao-Intel Could you revert this PR, since #18578 is merged and there are no more random CI failures on MKLDNN?

LeoZhao-Intel · 2019-07-29T06:48:28Z

> @LeoZhao-Intel Could you revert this PR, since #18578 is merged and there are no more random CI failures on MKLDNN?

OK, I will revert it soon.

…read frequent switching case (PaddlePaddle#18428)" This reverts commit ce38bb5. test=develop

LeoZhao-Intel · 2019-07-29T08:23:31Z

See new PR #18879 for the revert.

…read frequent switching case (#18428)" (#18879) This reverts commit ce38bb5. test=develop

use static variable to do cache instead of tread local in thread freq…

d91c910

…uent switching case to avoid memory leak test=develop

luotao1 reviewed Jul 1, 2019

paddle/fluid/framework/transfer_scope_cache.cc (outdated)

use marco to control code given it is specific for mkldnn

d6597b9

test=develop

luotao1 added the Intel label Jul 1, 2019

luotao1 requested a review from Superjomn July 1, 2019 06:04

refine code to support multi-instances

b4a82d7

test=develop

luotao1 reviewed Jul 2, 2019

paddle/fluid/framework/transfer_scope_cache.cc (outdated)

luotao1 reviewed Jul 2, 2019

paddle/fluid/framework/transfer_scope_cache.cc (outdated)

paddle/fluid/framework/transfer_scope_cache.cc

luotao1 mentioned this pull request Jul 2, 2019

add transfer_scope_cache unit-test #18467

Merged

LeoZhao-Intel added 2 commits July 3, 2019 21:19

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

6134460

… mt_memoryleak

align with new mkldnn session id api and const name

7dd035e

test=develop

luotao1 reviewed Jul 3, 2019

paddle/fluid/inference/tests/api/analyzer_bert_tester.cc

paddle/fluid/inference/tests/api/analyzer_bert_tester.cc

LeoZhao-Intel added 3 commits July 3, 2019 22:08

add ut for static_transfer_scope_Cache

c89c666

test=develop

correct parameter with test name

d00682f

test=develop

update comments

7d0b952

test=develop

luotao1 previously approved these changes Jul 3, 2019


luotao1 reviewed Jul 3, 2019

paddle/fluid/inference/tests/api/analyzer_bert_tester.cc (outdated)

LeoZhao-Intel dismissed luotao1’s stale review via cf3816d July 3, 2019 22:43

LeoZhao-Intel and others added 2 commits July 4, 2019 06:48

update code to pass mac_ci

cf3816d

test=develop

Add PADDLE_WITH_MKLDNN Macro to skip unit-test on MAC

b8bd5d0

Since Mac don't support MKL and MKLDNN now, we should skip `TEST(Analyzer_bert, static_transfer_scope_cache)` ON Mac test=develop

luotao1 mentioned this pull request Jul 4, 2019

[DO NOT MERGE] detect model test2 for dynamic shape #18372

Closed

luotao1 requested a review from sneaxiy July 5, 2019 05:36

LeoZhao-Intel mentioned this pull request Jul 5, 2019

clear cache when tid == -1 and cache size exceeds max capacity #18285

Closed

luotao1 approved these changes Jul 8, 2019


luotao1 merged commit ce38bb5 into PaddlePaddle:develop Jul 8, 2019

LeoZhao-Intel added a commit to LeoZhao-Intel/Paddle that referenced this pull request Jul 29, 2019

Revert "use static variable to do cache instead of thread local in th…

98e056a

…read frequent switching case (PaddlePaddle#18428)" This reverts commit ce38bb5. test=develop

LeoZhao-Intel mentioned this pull request Jul 29, 2019

Revert "use static variable to do cache instead of thread local in thread frequent switching case #18879

Merged

luotao1 pushed a commit that referenced this pull request Jul 30, 2019

Revert "use static variable to do cache instead of thread local in th…

10eeed9

…read frequent switching case (#18428)" (#18879) This reverts commit ce38bb5. test=develop
