@Caozhou1995
PR types

New features

PR changes

Others

Describe

This PR speeds up completion (work by @aoyulong) and will be merged into Paddle develop.

wuhuachaocoding and others added 30 commits September 19, 2022 14:13
* refactor mp.

* update setup.py.

* update mp_layers.py for compatibility.

* add documents for mp_layers.py

* update init.py

* update collective.py.

* update.

* update mp_ops.py

* update.

* update code style.

* update code style.
…addlePaddle#46132)

* [PHI] Support bmm and bmm_grad in xpu (PaddlePaddle#45887)

* support bmm and bmm_grad in xpu

* add error removal

* test=kunlun

* refactor code for better structure

* test=kunlun

* add fp16 kernel for bmm

* test=kunlun

* test=kunlun
)

* add unit test for sum higher level op (PaddlePaddle#45961)

* support slice op backward refuse forward and add high level unit test (PaddlePaddle#45960)

* support tile op backward refuse forward (PaddlePaddle#45942)

* support expand_v2 op backward refuse forward (PaddlePaddle#45941)

* support concat backward refuse forward (PaddlePaddle#45940)
…ram (PaddlePaddle#46194)

* [dy2static] support user to use decorator in their program (PaddlePaddle#45768)

* support deco

* fix deco ast type

* arg_str

* 1

* support callable deco

* code style

* codestyle

* test_error

* fix decos in another file

* recover conflict codes

* [BugFix] fixed a bug in decorator transformer, it can not analyze decorator with params correctly (PaddlePaddle#46055)

* fix deco call

* add raise

* add test

* add warn, fix paddle api

* fix error type

* fix coverage
…tion operators (PaddlePaddle#46184)

* [cherry-pick] extend reduce_sum,reduce_sum,eq,ne,ge,abs,pow,etc higher order operators

* add reduce_mean,reduce_sum primitive ops
* add ne_p gt_p primitive operators
* add ge_p abs_p primitive operators
* add cast primitive operators
* add pow,square prim2orig rules
* add elementwise_div orig2prim rule

* [cherry-pick] add mean,sum,ge,gt,ne,abs,etc higher-order differentiation operators(PaddlePaddle#45888)

* add reduce_mean,reduce_sum primitive ops

* add ne_p gt_p primitive operators

* add ge_p abs_p primitive operators
…ecific inputs (PaddlePaddle#46148) (PaddlePaddle#46193)

* fix return order error and duplicate results with specific inputs
* fix wrong eigen header include

* fix compile bug

* fix nan_inf_utils_detail

* fix resource_manager

* fix conv_miopen_helper
* fix static_check error when compile twice (PaddlePaddle#46140)

* [CI] fix static check in build_pr_dev (PaddlePaddle#46192)

Co-authored-by: Zhou Wei <[email protected]>
…addlePaddle#46226)

cherry-pick from PaddlePaddle#45826
LayoutAutotune supports inplace-type OPs
Adjust UseAutotune according to the review comments on Add eager layout autotune PaddlePaddle#45409
Move the LayoutAutotune check into the controller, consistent with the AMP check
…ddle#46223)

* add scope cache & reuse

* add gc scope for end of each train step

* del scope reuse for jit

* refine code

* test
…Paddle#46211)

* support cast op backward refuse forward and fix some bugs (PaddlePaddle#46173)

* support cast op backward refuse forward

* Fix the bug of high order unit test framework

* support sign op backward refuse forward (PaddlePaddle#46002)
…46206)

* fix linspace error in amp

* fix log

* fix amp error
cherry-pick : PaddlePaddle#46016, PaddlePaddle#46021, PaddlePaddle#45974

* [Sparse]Sparse add support gpu (PaddlePaddle#45974)

* [Sparse]Remove unused code (PaddlePaddle#46021)

* [Sparse] Add infer meta (PaddlePaddle#46016)
…lePaddle#46094) (PaddlePaddle#46186)

* Fix TransDataBackend Error when call unsqueeze using MKL Tensor

* Add UT

* Refine UT
…dle#46219)

* add config

* add config

* follow comments

* fix serial run
* Support matmul_v2 in Paddle-TensorRT converter.
* Fix bug of reduce_sum op. When input.numel() > INT32_MAX,
its result is wrong.

* Cherry-pick of PR 46045

* Fix bug of reduce_sum kp op.

* Fix bug of reduce_sum kp operator compilation.
If compilation device is XPU, eigen kernel should be ignored.
* [Eager] Fix ocr (PaddlePaddle#46124)

* fix linspace error in amp

* fix log

* fix amp error

* fix ocr error which caused by amp

* add more check

* rename dtype ns

* [Eager Bug fix]Fix Detection (PaddlePaddle#46147)

* fix linspace error in amp

* fix log

* fix amp error

* Revert "Simplify size op impl (PaddlePaddle#45808)"

This reverts commit c252b1d.

* fix_seg

* fix detection

Co-authored-by: Chen Weihang <[email protected]>

…ePaddle#46270)

* [Auto Parallel] Change the import way of Auto Parallel (PaddlePaddle#46115)

* fix strategy (PaddlePaddle#46256)

* [Auto Parallel] performance improvement for Sharding-DP hybrid parallelism (PaddlePaddle#46180)

* remove no need grad allreduce communication when sharding-dp

* remove no need grad allreduce communication when sharding-dp

* bugfix

* bugfix

* bugfix

Co-authored-by: Yulong Ao <[email protected]>
Co-authored-by: JZ-LIANG <[email protected]>
…lePaddle#46261)

* polish code comments

* polish data_device_transform.cc
…e#45545) (PaddlePaddle#46280)

* Move ITensor construction for Weight (persistable variable) from OpConvert to TensorRTEngine.