Code Synchronization #3

esythan · 2021-09-13T08:56:49Z

PR types

PR changes

Describe

)
* Add calculation for gru op
* Correct the types
* Remove mkldnn only
* Correct mkldnn ifdef
* Remove mkldnn ifdef
* Separate mkldnn quantizer test
* Correct Windows test
* Check different cmake fix
* Revert cmake change
* Cmake change 2
* Cmake change 3

* add maxunppol2d op, test=develop
* fix typo, test=develop
* fix unpool unitest, test=develop
* fix unpool code-example, test=develop
* fix for unpool_op_unittest,test=develop
* fix example code, test=develop
* add noqa:F401, test=develop
* fix converage, test=develop
* fix unitest for unpool, test=develop
* rename unpool2d to unpool, test=develop
* rename unpool2d to unpool, test=develop

* fix github name
* fix CI error
* fix review and CI error
* fix inf,nan error and modify unittest samples
* add unittest samples
* add unittest samples
* fix unittest error
* test=document_fix
* test=document_fix
* modify doc and add unittest samples
* fix error newline in constant
* modify doc after mentor review
* modify __all__ and doc
* modify doc

* update fft api path (PaddlePaddle#36219)
* update fft api path
* add sample code for ihfft2
Co-authored-by: chenfeiyu <[email protected]>
* fix fft axis (PaddlePaddle#36321) fix: `-1` is used when fft's axis is `0`
* use unified external error message for cufft api (PaddlePaddle#36114)
* fft: modify sample code result (PaddlePaddle#36325)
* dynamic load mkl as a fft backend when it is avaialble and requested (PaddlePaddle#36414)
* add rocm support for fft api (PaddlePaddle#36415)
* move signal apis
* move fft and signal API path (#2)
* move signal apis
* move fft.py and signal.py to paddle/, fix typos
* fix relative imports from fft.py and signal.py
* fix typos in signal.py (#3)
* move signal apis
* move fft.py and signal.py to paddle/, fix typos
* fix relative imports from fft.py and signal.py
* fix typos
* disable Cache when CUFFT_VERSION >= 10200 (#4)
* move signal apis
* move fft.py and signal.py to paddle/, fix typos
* fix relative imports from fft.py and signal.py
* fix typos
* Add LRUCache for fft plans
* add LRUCache for cuff and hipfft (#5)
* move signal apis
* move fft.py and signal.py to paddle/, fix typos
* fix relative imports from fft.py and signal.py
* fix typos
* WIP: add cache
* delete move constructor and operator= for CuFFTHandle and FFTConfig
* remove log from CuFFTHandle and FFTConfig
* add lrucache for fft rocm backend
* disable LRUCache when CUFFT_VERSION >= 10200
* disbale copy and move for hipFFTHandle; format code
Co-authored-by: Xiaoxu Chen <[email protected]>
* remove debug message of cufftHandler
* roll_op: support Tensor as input for shifts (PaddlePaddle#36727)
* fix fftshift/ifftshift on static mode
* update roll_op version
* add more test cases for fftshift/ifftshift
Co-authored-by: zhiboniu <[email protected]>
Co-authored-by: chenfeiyu <[email protected]>
Co-authored-by: LJQ❤️ <[email protected]>

…ten::DenseTensor, test=allcases (PaddlePaddle#38473)
* Added shared_ptr<Allocation> member & corresponding interfaces to Storage
* Removed original pten::Allocation from Storage and adjusted the interfaces accordingly
* Fixed issues with storage offset
* Used place to malloc allocation for TensorStorage
* [Unify Tensors PR #3]Ported framework::Tensor interfaces to pten::DenseTensor
* Fixed issues with place
* Added comments
* Moved mutable_data with stream argument to DenseTensor
* Added set_offset interface
* Fixed CI issues,test=allcases
* [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor
* Reverted changes too pten_layout() interface
* Removed friend classes

…t=allcases (PaddlePaddle#38632)
* Added shared_ptr<Allocation> member & corresponding interfaces to Storage
* Removed original pten::Allocation from Storage and adjusted the interfaces accordingly
* Fixed issues with storage offset
* Used place to malloc allocation for TensorStorage
* [Unify Tensors PR #3]Ported framework::Tensor interfaces to pten::DenseTensor
* Fixed issues with place
* Added comments
* Moved mutable_data with stream argument to DenseTensor
* Added set_offset interface
* Fixed CI issues,test=allcases
* [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor
* Removed friend class EigenTensor/EigenMatrix/EigenVector from Tensor
* Modified framework::Tensor to inherit from DenseTensor
* Reverted changes too pten_layout() interface
* Removed friend classes
* Rearranged cfunction calls from tensor.data<void>() to tensor.data()
* Fixed CI issues
* Fixed lite issues
* Fixed data() interface issues,test=allcases
* Resolved IsInitialized() issues
* Fixed ResetHolder() issues
* Fixed MKLDNN & Storage issues
* Resolved ShareBufferWith() issues
* Fixed LoD issues

…st=allcases (PaddlePaddle#38811)
* Added shared_ptr<Allocation> member & corresponding interfaces to Storage
* Removed original pten::Allocation from Storage and adjusted the interfaces accordingly
* Fixed issues with storage offset
* Used place to malloc allocation for TensorStorage
* [Unify Tensors PR #3]Ported framework::Tensor interfaces to pten::DenseTensor
* Fixed issues with place
* Added comments
* Moved mutable_data with stream argument to DenseTensor
* Added set_offset interface
* Fixed CI issues,test=allcases
* [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor
* Removed friend class EigenTensor/EigenMatrix/EigenVector from Tensor
* Modified framework::Tensor to inherit from DenseTensor
* Reverted changes too pten_layout() interface
* Removed friend classes
* Rearranged cfunction calls from tensor.data<void>() to tensor.data()
* Fixed CI issues
* Fixed lite issues
* Fixed data() interface issues,test=allcases
* Resolved IsInitialized() issues
* Fixed ResetHolder() issues
* Fixed MKLDNN & Storage issues
* Resolved ShareBufferWith() issues
* Fixed LoD issues
* Removed interfaces & members from lod_tensor,test=allcases

PaddlePaddle#39128)
* Added selected_rows and rw_lock to pten
* Renamed the unit test target to fix CI
* Removed Class SelectedRows in Fluid, changed include/cmake relationship, use pten::SelectedRows in Fluid
* Remove rw_lock.h,rw_lock_test.cc in fluid
* Use pten::RWLock and pten::AutoRDLock, fix CI
* Use pten::SelectedRows
* Use pten::SelectedRows
* Fix to pass NPU CI
* Use pten::SelectedRows, to pass NPU CI
* To fix NPU CI
* To fix NPU CI again

…Paddle#41051)
* [Refactor] refactored eager_gen.py PR #2
* [DoubleGrad PR #1] Decoupled code generation logics for Dygraph ForwardFunctions and GradNodes
* Fixed minor issue
* Adjusted logics of GenerateNodeCreationCodes and GenerateForwardDefinition
* Fixed issues
* Supported higher-order grad node generation
* [DoubleGrad PR #4] Supported higher-order GradNode generation
* Fixed yaml typo

HydrogenSulfate and others added 30 commits August 27, 2021 14:02

Update test_cross_entropy_loss.py

d1a1105

Update test_cross_entropy_loss.py

23cc214

Update test_cross_entropy_loss.py

0bf3248

Update loss.py

f2df33e

Update test_cross_entropy_loss.py

f6dc4b6

Update test_cross_entropy_loss.py

7afd7f3

Update test_cross_entropy_loss.py

ee070fb

Update loss.py

0046768

Update loss.py

52804cd

Update loss.py

0c2d6bc

Update loss.py

11e9d4e

Update loss.py

cf6e543

Update test_cross_entropy_loss.py

e838cac

fix the crash when input variable is bool type, test=develop (#35176)

ad52248

gelu/logsigmoid add AsExtra (#35198)

2006fbc

fix count_api_without_core_ops (#35170)

7272526

* fix count_api_without_core_ops, test=develop
* fix count_api_without_core_ops, test=develop
* refine, test=develop
* remove test code, test=develop
* remove test, test=develop
* modify check_api_approvals.sh, test=develop

Polish DeviceEvent interface and Remove #ifdef in InterpreterCore (#35196)

48bf7cb

* add CPUDeiveEvent
* Polish DeviceEvent code
* Add DEVICE_EVENT_LIBS

[hybrid] Fix row parallel linear bias (#35186)

1533d7e

sparse_momentum_op is used to save w@GRAD memory for gather_op (#34942)

234ce93

* sparse_momentum_op is used to save w@GRAD memory for gather_op when gather from a large parameter

add more models for model_benchmark_ci,test=document_fix (#35178)

5a72cf4

add uniform_ op and UT (#33934)

be29b8e

test=document_fix (#35222)

5dcff7c

test=document_fix (#35221)

31cd106

Abstract GenerateDeviceEventFlag to shield platforms (#35219)

20cfa8b

* Abstract GenerateDeviceEventFlag to shield platforms
* Remove get_cuda_flags

Add cpu/gpu for PR-CI-CPU-Py2 (#35174)

8f94d34

* notest;test=cpu_gpu
* notest;test=cpu_gpu
* notest;test=cpu_gpu
* notest;test=cpu_gpu
* notest;test=cpu_gpu
* notest;test=cpu_gpu
* notest;test=cpu_gpu
* fix
* fix

Set value (#34886)

37d281c

* tmp
* Tile - Assign - Crop
* Finish the set value npu kernel and test case in npu
* improve the error message
* Modify according to zhangliujie
* code review

del message;test=document_fix (#35248)

c0bdef5

[paddle-TRT]support matmul set to int8 in multihead (#34917)

0043fa8

* update ernie int8

zoooo0820 and others added 22 commits September 13, 2021 10:52

catch dimentions error when input is empty in static.nn.group_norm (#35613)

7b743ba

Revert "change '/' method from scale Op to elementwise_div Op (#33279)" (#35650)

03026ce

This reverts commit ae93d9c.

add group_norm trt converter test case (#35524)

787209f

* add group_norm trt converter test case
* update group_norm trt converter test case

Add public api for dlpack. (#35620)

48ec02f

[HybridParallel]Fix scaler bug in pipeline_parallel/model_parallel (#35556)

2bb4431

* support grad group
* fix single card condition

[NPU] add npu unit test if title has NPU key word, test=develop (#35566)

666da14

add gather_nd trt converter test cases (#35464)

42559f7

add gather trt converter test case (#35523)

75d5e3b

Support int16_t in fill_constant_op (#35619)

4b6f809

upload global scatter and global gather operators related files (#35546)

ecfe837

* upload global scatter and global gather operators related files

[Bugfix] reshape with zero input tensor (#35642)

cabc5f3

* reshape support zero-input
* add unitest
* revise error message

add flatten/flatten2 converter test cases (#35462)

fb65268

* add flatten/flatten2 converter test cases
* add fatten/flatten2 trt converter test cases

[RC22] Fix linear with matmul_op replace (#35445)

53e294c

* [RC22] Fix linear with matmul_op replace
* [RC22] Fix linear with matmul_op replace
* [RC22] Fix linear with matmul_op replace
* [RC22] Fix linear with matmul_op replace
* [RC22] Fix linear with matmul_op replace

fix cumprod docs (#35647)

1a7b3ff

* fix cumprod docs
* fix cumprod op docs; test=document_fix

add xpu_wait & new implementation replace memcpy in adam, adamw (#35437)

86a6be1

refine svd; unexpose tensor.svd; fix english document; set timeout=40 (#35635)

f521a30

Implement FunctionTraits to support two kinds of elementwise functor and remove some old codes for broadcast. (#35487)

d4f84d4

Fix CPU CI build-time count (#35677)

0460608

Fix CPU CI build-time

fix instance norm index error (#35341)

e641c63

* fix instance norm index error
* add unittest
* update
* fix

fix interpolate launch error (#35577)

5f31737

* fix interpolate launch error, test=develop
* fix area mode for interp, test=develop

Revert "Implement FunctionTraits to support two kinds of elementwise functor and remove some old codes for broadcast. (#35487)" (#35686)

40d4a29

esythan merged commit a64efe0 into esythan:develop Sep 13, 2021

esythan pushed a commit that referenced this pull request Sep 30, 2021

Merge pull request #3 from seemingwang/accessor_merge

b7b8e7c

change prototxt path for testing

96 participants
