clear cache when tid == -1 and cache size exceeds max capacity #18285
LeoZhao-Intel wants to merge 11 commits into PaddlePaddle:develop
test=develop
2. Few fix in concat/pool mkldnn kernel for key generation 3. Enable cache clearing mechanism test=develop
|
@jczaja @jianhang-liu for code review |
test=develop
|
e.g. this line |
|
Could we use LRU for cache? |
|
What do you mean by LRU? |
|
@LeoZhao-Intel Ok, so my understanding is that the pointer has to remain valid, since cache clearing could remove the stored data that this pointer refers to? |
Correct! The idea is to let the shared_ptr keep the memory alive until the mkldnn pipeline execution is done. |
|
LRU: Least recently used |
|
@luotao1 We have a long-term plan to improve this cache clearing. Removing the oldest entry is just the first step. |
|
Could you create a new PR for |
|
Could we use a similar interface like
|
1. Enable cache clearing mechanism
platform::get_cur_thread_id() == -1 means cache clearing mode is active.
In this mode, the mkldnn key is generated in plain format, without the real thread id. When the blob
size (the mkldnn blob whose first-level key is -1, see line ) exceeds the defined max capacity (see line), cache clearing is triggered and one entry is removed from the head of the blob. The blob data structure is changed to a vector so that entries can be removed from the head.
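The eviction described in item 1, together with the shared_ptr lifetime point raised in review, can be illustrated with a minimal sketch. It is not the code from this PR: `KeyBlobSketch`, `Set`, `Get` and `Size` are hypothetical names, and the sketch assumes eviction happens just before an insert that would push the blob past its capacity.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the blob stored under first-level key -1.
// A vector keeps insertion order, so the head is always the oldest entry.
class KeyBlobSketch {
 public:
  explicit KeyBlobSketch(std::size_t max_capacity)
      : max_capacity_(max_capacity) {}

  // Store data under key. If the key is new and the blob is already full,
  // the oldest entry (the head of the vector) is removed first.
  void Set(const std::string& key, std::shared_ptr<void> data) {
    for (auto& entry : entries_) {
      if (entry.first == key) {
        entry.second = std::move(data);
        return;
      }
    }
    if (max_capacity_ > 0 && entries_.size() >= max_capacity_) {
      entries_.erase(entries_.begin());
    }
    entries_.emplace_back(key, std::move(data));
  }

  // Return the cached object, or nullptr when the key is absent or evicted.
  std::shared_ptr<void> Get(const std::string& key) const {
    for (const auto& entry : entries_) {
      if (entry.first == key) return entry.second;
    }
    return nullptr;
  }

  std::size_t Size() const { return entries_.size(); }

 private:
  std::size_t max_capacity_;
  std::vector<std::pair<std::string, std::shared_ptr<void>>> entries_;
};
```

Because `Get` hands out a `shared_ptr` copy, a kernel that fetched an entry before it was evicted still owns a valid object until it drops that copy. A later `Get` for the same key returns `nullptr`, so callers have to handle a cache miss.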
2. Add new interface SetMKLDNNThreadId(int id) in AnalysisConfig
Users call this interface to set the mkldnn thread id manually; the original
AnalysisPredictor::SetMkldnnthreadid() API is not exposed to users directly. Passing id=-1
triggers cache clearing mode.
Cache clearing mode exists specifically to handle frequently changing thread ids and dynamic
shapes. It is rarely used and should not be inherited by other AnalysisPredictor instances, so the
value has to be set and cleared on each iteration. That means adding hook points in
AnalysisPredictor::Run() and ZeroCopyRun().
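One way to picture the set-and-clear hook points from item 2 is a scope guard around each run. The sketch below is an illustration under assumptions, not the PR's implementation: `g_cur_thread_id`, `get_cur_thread_id_sketch`, `ScopedThreadIdSketch` and `RunSketch` are hypothetical names, the id is assumed to be thread-local, and 0 is assumed to mean "not set manually".

```cpp
// Hypothetical thread-local id; 0 stands for "not set manually" here.
thread_local int g_cur_thread_id = 0;

int get_cur_thread_id_sketch() { return g_cur_thread_id; }

// Sets the id on construction and restores the previous value on
// destruction, so the setting cannot leak past one run.
class ScopedThreadIdSketch {
 public:
  explicit ScopedThreadIdSketch(int id) : previous_(g_cur_thread_id) {
    g_cur_thread_id = id;
  }
  ~ScopedThreadIdSketch() { g_cur_thread_id = previous_; }

  ScopedThreadIdSketch(const ScopedThreadIdSketch&) = delete;
  ScopedThreadIdSketch& operator=(const ScopedThreadIdSketch&) = delete;

 private:
  int previous_;
};

// Stand-in for a predictor's Run(): the guard is the hook point.
// Returns true when the run executed in cache clearing mode (id == -1).
bool RunSketch(int configured_id) {
  ScopedThreadIdSketch guard(configured_id);
  return get_cur_thread_id_sketch() == -1;
}
```

With a guard like this, the value is restored even if the run exits early, which matches the requirement that other predictor instances on the same thread do not inherit cache clearing mode.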
3. A few fixes in mkldnn concat/pool/conv kernels
Key generation in these 3 kernels was not aligned with the new method (PR #17965), so it needs a few
changes. This also fixes potential crashes when the mkldnn cache does not behave as
expected (i.e. caching does not always succeed).
This part is merged by PR #18393
test=develop