Skip to content

Feature/anakin engine#5

Merged
Shixiaowei02 merged 3 commits intoShixiaowei02:feature/anakin-enginefrom
PaddlePaddle:feature/anakin-engine
Mar 15, 2019
Merged

Feature/anakin engine#5
Shixiaowei02 merged 3 commits intoShixiaowei02:feature/anakin-enginefrom
PaddlePaddle:feature/anakin-engine

Conversation

@Shixiaowei02
Copy link
Owner

No description provided.

@Shixiaowei02 Shixiaowei02 merged this pull request into Shixiaowei02:feature/anakin-engine Mar 15, 2019
Shixiaowei02 pushed a commit that referenced this pull request Feb 12, 2020
…1223)

* Add the first implememtation of fusion_group op PaddlePaddle#19621 (#3)

* Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc.
test=develop

* Call CUDA driver api to launch the kernel compiled by nvrtc.
test=develop

* Disable for mac and windows.
test=develop

* Refine the codes to support manually specified num_threads and workload_per_thread.
test=develop

* Refine the CUDA kernel to support large dims.
test=develop

* Add DeviceCodePool to manage all device codes.

* Add the first implementation fusion_group op.

* Add unit-test for fusion_group op.

* Add the check of result.

* Add the check of nvrtc in unit-test.
test=develop

* Add comment to explain the inputs, outputs and features of fusion_group op.
test=develop

* Disable fusion_group op for mac and windows.
test=develop

* Make the compiling of device code return status instead of hanging up.
test=develop

* Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API.

* Unify fusion_group_op's input and output names.
test=develop

* Add the check of CUDA driver library in unittest.
test=develop

* Enable generating code for a given subgraph. PaddlePaddle#21126 (#4)

* Enable generating code for a given subgraph.

* Support sorting the subgraph.

* Remove the rearange of expressions because we use the sorted subgraph directly.

* Enable generating code for a subgraph which is composed of grad ops.

* Use expression information to check the accuracy in unittest.

* Separate load and store from computation expressions.
test=develop

* Improve the loading statements in generated codes.
test=develop

* Remove unused arguments from formal list.
test=develop

* Enable the detection of subgraph of grad ops.

* Generate code for detected subgraph in fusion_group_pass.

* Add an option in BuildStrategy to enable fusion_group_pass and add unittest.
test=develop

* Fix a bug when checking whether the shape of all inputs are the same.

* Add debug information.

* Remove subgraph_detector from inference/analysis to the common framework/ir directory. (#5)

test=develop

* Call subgraph_detector in fusion_group pass.
test=develop

* Disable fusion_group when WITH_GPU is OFF.
test=develop

* Refine all PADDLE_ENFORCE message.
test=develop

* Fix the case that some inputs are not defined in grad ops, and set op_role for fused op.
test=develop

* Follow review comments.
test=develop
Shixiaowei02 pushed a commit that referenced this pull request Oct 12, 2021
* Add elementwise_sub_mkldnn_op without grad

* Add test to static_mode_white_list

* Refactor code, change license years

* Remove invalid grad implementation

* Fix element_wise_sub_op test

* Fix CI Approval error

* Remove unnecessary EltwiseSubMKLDNNGradKernel class

* Fix CI Approval 2

* Fix CI Approval 3

* Fix CI Approval Attempt #4

* Fix CI Approve Attempt #5

* Fix CI Approval Attempt #6

* Fix CI Approval Attemt #7

* Change test names containing add to sub

* Fix old tests testing add instead of sub

* Copy grad implementation from elementwise_add_mkldnn

* CI test fix attempt

* Revert "CI test fix attempt"

This reverts commit c647cacf41e6a87c715385a185de5cbf65fc8900.

* Fix CI attempt 2

* Fix elementwise_sub tests, temporary mkldnn broadcast test disable

* Add working implementation of elementwise_sub grad

* Fix build errors caused by pull

* Fix format error

* Fix format error 2

* Disable elementwise_sub_mkldnn test on GPU

* Apply fix for paddle.fluid import

* Revert changes of test_elementwise_sub and Fix mkldnn test

* Revert "Apply fix for paddle.fluid import"

This reverts commit fc3b122.

* fix bug of module 'paddle' has no attribute 'fluid' for python3.6 (PaddlePaddle#35862)

* Add changes suggested by reviewers

* Change @unittest.skipIf... to @OpTestTool.skip_if_not_cpu_bf16() to satisfy Approval CI

* Remove check_dygraph=False to satisify CI Approval

Co-authored-by: zhangbo9674 <82555433+zhangbo9674@users.noreply.github.com>
Shixiaowei02 pushed a commit that referenced this pull request Nov 15, 2021
* update fft api path (PaddlePaddle#36219)

* update fft api path
* add sample code for ihfft2

Co-authored-by: chenfeiyu <chenfeiyu@baidu.com>

* fix fft axis (PaddlePaddle#36321)

fix: `-1` is used when fft's axis is `0`

* use unified external error message for cufft api (PaddlePaddle#36114)

* fft: modify sample code result (PaddlePaddle#36325)

* dynamic load mkl as a fft backend when it is avaialble and requested (PaddlePaddle#36414)

* add rocm support for fft api (PaddlePaddle#36415)

* move signal apis

* move fft and signal API path (#2)

* move signal apis

* move fft.py and signal.py to paddle/, fix typos

* fix relative imports from fft.py and signal.py

* fix typos in signal.py (#3)

* move signal apis

* move fft.py and signal.py to paddle/, fix typos

* fix relative imports from fft.py and signal.py

* fix typos

* disable Cache when CUFFT_VERSION >= 10200 (#4)

* move signal apis

* move fft.py and signal.py to paddle/, fix typos

* fix relative imports from fft.py and signal.py

* fix typos

* Add LRUCache for fft plans

* add LRUCache for cuff and hipfft (#5)

* move signal apis

* move fft.py and signal.py to paddle/, fix typos

* fix relative imports from fft.py and signal.py

* fix typos

* WIP: add cache

* delete move constructor and operator= for CuFFTHandle and FFTConfig

* remove log from CuFFTHandle and FFTConfig

* add lrucache for fft rocm backend

* disable LRUCache when CUFFT_VERSION >= 10200

* disbale copy and move for hipFFTHandle; format code

Co-authored-by: Xiaoxu Chen <chenxx_id@163.com>

* remove debug message of cufftHandler

* roll_op: support Tensor as input for shifts (PaddlePaddle#36727)

* fix fftshift/ifftshift on static mode

* update roll_op version

* add more test cases for fftshift/ifftshift

Co-authored-by: zhiboniu <31800336+zhiboniu@users.noreply.github.com>
Co-authored-by: chenfeiyu <chenfeiyu@baidu.com>
Co-authored-by: LJQ❤️ <33169170+lijiaqi0612@users.noreply.github.com>
Shixiaowei02 pushed a commit that referenced this pull request Jan 11, 2022
…t=allcases (PaddlePaddle#38632)

* Added shared_ptr<Allocation> member & corresponding interfaces to Storage

* Removed original pten::Allocation from Storage and adjusted the interfaces accordingly

* Fixed issues with storage offset

* Used place to malloc allocation for TensorStorage

* [Unify Tensors PR #3]Ported framework::Tensor interfaces to pten::DenseTensor

* Fixed issues with place

* Added comments

* Moved mutable_data with stream argument to DenseTensor

* Added set_offset interface

* Fixed CI issues,test=allcases

* [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor

* Removed friend class EigenTensor/EigenMatrix/EigenVector from Tensor

* Modified framework::Tensor to inherit from DenseTensor

* Reverted changes too pten_layout() interface

* Removed friend classes

* Rearranged cfunction calls from tensor.data<void>() to tensor.data()

* Fixed CI issues

* Fixed lite issues

* Fixed data() interface issues,test=allcases

* Resolved IsInitialized() issues

* Fixed ResetHolder() issues

* Fixed MKLDNN & Storage issues

* Resolved ShareBufferWith() issues

* Fixed LoD issues
Shixiaowei02 added a commit that referenced this pull request Jan 24, 2022
…lePaddle#39085)

* updates callers, test=develop

* updates tensor, test=develop

* fixes errors, test=develop

* remove some dtypes, test=develop

* fix errors in the base storage modification, test=develop

* fixes a bug, test=develop

* fixes the bugs in push the whole, test=develop

* updates, test=develop

* update

* update, test=develop

* fixes the mac-py3 CI, test=develop

* remove the storage impl, test=develop

* updates some codes, test=develop

* update, test=develop

* updates pten allocation, test=develop
Shixiaowei02 added a commit that referenced this pull request Jan 24, 2022
…lePaddle#39085)

* updates callers, test=develop

* updates tensor, test=develop

* fixes errors, test=develop

* remove some dtypes, test=develop

* fix errors in the base storage modification, test=develop

* fixes a bug, test=develop

* fixes the bugs in push the whole, test=develop

* updates, test=develop

* update

* update, test=develop

* fixes the mac-py3 CI, test=develop

* remove the storage impl, test=develop

* updates some codes, test=develop

* update, test=develop

* updates pten allocation, test=develop
Shixiaowei02 pushed a commit that referenced this pull request Jan 27, 2022
…ddle#39236)

* Added selected_rows and rw_lock to pten

* Renamed the unit test target to fix CI

* Removed Class SelectedRows in Fluid, changed include/cmake relationship, use pten::SelectedRows in Fluid

* Remove rw_lock.h,rw_lock_test.cc in fluid

* Use pten::RWLock and pten::AutoRDLock, fix CI

* Use pten::SelectedRows

* Use pten::SelectedRows

* Fix to pass NPU CI

* Selected_Rows inherits from TensorBase

* Use pten::SelectedRows, to pass NPU CI

* To fix NPU CI

* To fix NPU CI again

* Use paddle/pten/core/enforce and polish code

* Use pten::DataType instead of using proto_type

* Move part of data_type to pten

* Polish Code
Shixiaowei02 pushed a commit that referenced this pull request Apr 3, 2022
…sed to paddle.grad() (PaddlePaddle#41198)

* [Refactor] refactored eager_gen.py PR #2

* [DoubleGrad PR #1] Decoupled code generation logics for Dygraph ForwardFunctions and GradNodes

* Fixed minor issue

* Adjusted logics of GenerateNodeCreationCodes and GenerateForwardDefinition

* Fixed issues

* Supported higher-order grad node generation

* [DoubleGrad PR #4] Supported higher-order GradNode generation

* [DoubleGrad #4] Bug Fixes to Double Grad Node Generation

* Fixed yaml typo

* Fixed yaml typo

* fixed minor issues

* [DoubleGrad PR #5] Enabled gradient computations for grad_tensors passed to paddle.grad()

* Fixed minor issue

* Fixed CI-Inference issue

* Fixed CI-inference issues
Shixiaowei02 pushed a commit that referenced this pull request Apr 9, 2022
…rd run (PaddlePaddle#41306)

* [Refactor] refactored eager_gen.py PR #2

* [DoubleGrad PR #1] Decoupled code generation logics for Dygraph ForwardFunctions and GradNodes

* Fixed minor issue

* Adjusted logics of GenerateNodeCreationCodes and GenerateForwardDefinition

* Fixed issues

* Supported higher-order grad node generation

* [DoubleGrad PR #4] Supported higher-order GradNode generation

* [DoubleGrad #4] Bug Fixes to Double Grad Node Generation

* Fixed yaml typo

* Fixed yaml typo

* fixed minor issues

* [DoubleGrad PR #5] Enabled gradient computations for grad_tensors passed to paddle.grad()

* Fixed minor issue

* Fixed CI-Inference issue

* Fixed CI-inference issues

* [DoubleGrad PR #7] paddle.grad() to copy backward graph before backward run

* Fixed minor issues

* Fixed issue with backward graph construction logic

* Fixed implementation issues with backward graph reconstruction

* Fixed unittest issue

* Fixed issues
Shixiaowei02 pushed a commit that referenced this pull request Apr 9, 2022
…ePaddle#41387)

* [Refactor] refactored eager_gen.py PR #2

* [DoubleGrad PR #1] Decoupled code generation logics for Dygraph ForwardFunctions and GradNodes

* Fixed minor issue

* Adjusted logics of GenerateNodeCreationCodes and GenerateForwardDefinition

* Fixed issues

* Supported higher-order grad node generation

* [DoubleGrad PR #4] Supported higher-order GradNode generation

* [DoubleGrad #4] Bug Fixes to Double Grad Node Generation

* Fixed yaml typo

* Fixed yaml typo

* fixed minor issues

* [DoubleGrad PR #5] Enabled gradient computations for grad_tensors passed to paddle.grad()

* Fixed minor issue

* Fixed CI-Inference issue

* Fixed CI-inference issues

* [DoubleGrad PR #7] paddle.grad() to copy backward graph before backward run

* Fixed minor issues

* Fixed issue with backward graph construction logic

* Fixed implementation issues with backward graph reconstruction

* Fixed unittest issue

* Fixed issues

* [DoubleGrad PR #8] Enabled triple grads for sigmoid and matmul

* Fixed issues with phi kernel

* Added triple grad test case

* Fixed minor issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants