Zjq9409 (Owner) commented Dec 10, 2021

PR types

Function optimization

PR changes

OPs

Describe

test

tianshuo78520a and others added 30 commits November 25, 2021 14:04
* Fix static-ci
* fix program cache key

* bug fix

* fix cache problem

* remove unused code
* add new API paddle.nn.initializer.Dirac

* fix doc
* block unknown option /arch:SSE3

* modify according to zhouwei's comment
* Support multi-stream allocation for CUDA place

* Do not notify the retrying from other streams when free CUDA allocation

* Fix compile error for CPU

* Fix compile error for HIP

* Release memory for StreamSafeCUDAAllocaRetry in malloc_test

* Add FLAGS_use_stream_safe_cuda_allocator

* Fix CI error for 'set_tests_properties'

* Invalidate stream safe CUDA allocator for naive_best_fit and thread_local strategy

* Performance improvement: insert allocation pair to outstanding_events_map when free but not alloc; replace recursive_mutex with SpinLock

* FLAGS priority changes: FLAGS_use_system_allocator > FLAGS_use_stream_safe_cuda_allocator

* Performance improvement: directly delete allocation when the recorded_streams is empty in FreeImpl of StreamSafeCUDAAllocator

* Add UT for alloc interface

* Changes multi-stream interface; move retry code from AllocatorFacadePrivate to StreamSafeCUDAAllocator
* fix. test=develop

* fix. test=develop

* fix. test=develop

* fix. test=develop

* fix. test=develop

* [heterps]bug fix for _run_from_dataset

* fix heter_server.cc

* fix launch_utils.py

* fix heter_section_worker.cc

* fix. test=develop

* fix. test=develop
* [NPU] add NPU kernel for prior_box op

* [NPU] delete debug codes
* [NPU] add int64 support for argsort op

* [NPU] delete debug codes
* add scalar and scalar_array

* remove DenseTensor include from Scalar and ScalarArray

* remove inner header from scalar_array

* refactor the method of fill_constant and add some comment

* add fill_constant kernel using ScalarArray

* modify some prompt

* remove fill_constant kernel with no shape
* make third_party's cmake get source code directly 2

* modify according to zhouwei's comment

* eager needs mkldnn to compile
* Added GradTensorHolder to Eager Dygraph

* Added accumulation codes to Eager Dygraph

* Added tensor utils to Eager Dygraph

* Resolve compilation issues

* Fixed issues
* block xxhash warning of c4711

* modify according to zhouwei's comment

* fix syntax error
* fix dropout static when axis != None

* update dropout test

* add dropout test

* fix test

* Update test_dropout_op.py

* Update test_dropout_op.py

* fix testcase

* fix testcase

* Update test_dropout_op.py

* fix testcase

* fix testcase

* optimize perf

* add new test

* fix testcase
* add tdm sample

* add tdm sample in c++

* update tdm sample

* modify sample count

* fix conflict

* add set_date

* fix cmake error

* fix bug of proto

* update index_dataset proto

* update cmake

* fix error cmake

* fix cmake mkldnn

* fix cmake proto

* update cmake proto

* update cmake

* update rec

* update dataset

* update dataset

* update dataset

* update dataset

* update dataset

* update coverage

* update ci

* goback4

* fix npu ci

* add xxhash dep
reset_inplace_version removes all inplace-related records from VarBase/VariableWrapper. Its purpose is to let you use an inplace operation as if it were the non-inplace version, which can cause unexpected consequences if not used with care.

This is essentially a hack interface to satisfy one specific request.
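
As a rough illustration of why resetting inplace records needs care, here is a minimal pure-Python sketch. It does not use Paddle: `ToyTensor`, `SquareGrad`, and `square` are hypothetical names invented for this example, and the version check is only an assumed, simplified model of how an autograd engine might detect inplace writes.

```python
class ToyTensor:
    """Hypothetical stand-in for a tensor that counts in-place writes."""

    def __init__(self, value):
        self.value = value
        self.inplace_version = 0  # bumped on every in-place write

    def add_(self, delta):
        """In-place add: mutates the value and bumps the version counter."""
        self.value += delta
        self.inplace_version += 1
        return self

    def reset_inplace_version(self):
        """Toy analogue of the interface described above: forget all writes."""
        self.inplace_version = 0


class SquareGrad:
    """Backward node for y = x * x; remembers x and the version it saw."""

    def __init__(self, x):
        self.x = x
        self.saved_version = x.inplace_version

    def backward(self, grad_out=1.0):
        # Assumed safety check: refuse to run if x changed after forward.
        if self.x.inplace_version != self.saved_version:
            raise RuntimeError("tensor was modified in place after forward")
        return 2.0 * self.x.value * grad_out


def square(x):
    """Forward pass: returns the result and its backward node."""
    return x.value * x.value, SquareGrad(x)


if __name__ == "__main__":
    x = ToyTensor(3.0)
    y, node = square(x)        # forward used x = 3.0
    x.add_(1.0)                # in-place write after forward
    x.reset_inplace_version()  # hides the write from the check
    print(node.backward())     # gradient is computed from the mutated value
```

In this toy model, calling `backward` right after `add_` raises an error, while calling it after `reset_inplace_version` silently uses the mutated value (4.0 instead of the 3.0 seen in the forward pass). That is the kind of unexpected consequence the description warns about.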
…37566)

* Fix bugs when bias is none for static graph for fused_attention op.
* Support parse kernel key by multi-inputs

* optimize code according to reviewer
zyfncg and others added 17 commits December 10, 2021 10:46
* add fc_elementwise_layernorm_fuse_pass

* fix name conflict

* rebuild CI

* fix Ran Programs=0 bug
* fix pten::Copy usage error in reduce_impl

* remove in_dtype args in reduce kernel

* fix copy error

* fix copy error
* dist matmul op compatible

* modify common dist op

* modify common

* add a space
* git ignore eager_op_function_impl.h

* test=document_fix
* add as_complex and as_real op
… factory (#38011)

* add alias kernel name

* modify code as suggestions

* add alias name for matmul and remove redundant member in kernel factory
* remove outer comment when dy2stat

* remove all comment

* add unit test
* fix

* modify log

* fix batch_size
@Zjq9409 Zjq9409 closed this Dec 10, 2021
Zjq9409 pushed a commit that referenced this pull request Dec 27, 2021
…y::Allocation> for Storage (PaddlePaddle#38301)

* Added shared_ptr<Allocation> member & corresponding interfaces to Storage

* Removed original pten::Allocation from Storage and adjusted the interfaces accordingly

* Fixed issues with storage offset

* Used place to malloc allocation for TensorStorage
Zjq9409 pushed a commit that referenced this pull request Feb 15, 2022
* Enabled Eager OpTest #1

* Enabled Eager OpTest #1

* Fixed get_tensor method for EagerTensor
Zjq9409 pushed a commit that referenced this pull request Feb 15, 2022
* #1 migrate dist-related type()-> dtype()

* move datatype function from pten -> fluid/framework

* change type() in imperative into convert(dtype())

* modify xx_tensor->type into xx_tensor->dtype

* change the set_type interface and the caller

* modify xx_tensor.type into xx_tensor.dtype

* fix mutable_data(place, dtype())

* change caller of mutable_data in pten and distributed

* change the caller of mutable_data in fluid/framework

* change the caller of mutable_data in imperative directory

* mutable_data: inference

* update the call of mutable_data

* transfer MakePtenScalarArray MakePtenScalar ResetHolderWithType

* compilation passes; the next step is to remove VarType from pten

* fix all and remove VarType from pten. Succeeds on Linux; the next task is the other platforms

* fix conflict with develop

* fix compiled error

* Fix reset conversion

* fix conflict

* fix compiled problem

* fix typo

* Fix << in tensor_utils.cc

* fix type->dtype

* fix unittest

* fix tensor init constructor

* fix DataTypeSize for BFloat16

* fix code style

* fix npu compiled error

* fix npu

* compile npu successfully

* fix conflict

* fix conflict

Co-authored-by: xiongkun <xiongkun03@baidu.com>
Zjq9409 pushed a commit that referenced this pull request Mar 25, 2022
* [Refactor] refactored eager_gen.py PR #1

* [Refactor] refactored eager_gen.py PR #1

* Refactored version 2

* Added automatic code generation utils

* Fixed merge issues
Zjq9409 pushed a commit that referenced this pull request Mar 30, 2022
…rdFunctions and GradNodes (PaddlePaddle#40937)

* [Refactor] refactored eager_gen.py PR PaddlePaddle#2

* [DoubleGrad PR #1] Decoupled code generation logics for Dygraph ForwardFunctions and GradNodes

* Fixed minor issue
Zjq9409 pushed a commit that referenced this pull request Mar 30, 2022
…nCodes and GenerateForwardDefinition (PaddlePaddle#41016)

* [Refactor] refactored eager_gen.py PR PaddlePaddle#2

* [DoubleGrad PR #1] Decoupled code generation logics for Dygraph ForwardFunctions and GradNodes

* Fixed minor issue

* Adjusted logics of GenerateNodeCreationCodes and GenerateForwardDefinition

* Fixed issues

* Fixed minor issue
Zjq9409 pushed a commit that referenced this pull request Apr 7, 2022
…tion (PaddlePaddle#41051)

* [Refactor] refactored eager_gen.py PR PaddlePaddle#2

* [DoubleGrad PR #1] Decoupled code generation logics for Dygraph ForwardFunctions and GradNodes

* Fixed minor issue

* Adjusted logics of GenerateNodeCreationCodes and GenerateForwardDefinition

* Fixed issues

* Supported higher-order grad node generation

* [DoubleGrad PR PaddlePaddle#4] Supported higher-order GradNode generation

* Fixed yaml typo
Zjq9409 pushed a commit that referenced this pull request Apr 7, 2022
…addlePaddle#41121)

* [Refactor] refactored eager_gen.py PR PaddlePaddle#2

* [DoubleGrad PR #1] Decoupled code generation logics for Dygraph ForwardFunctions and GradNodes

* Fixed minor issue

* Adjusted logics of GenerateNodeCreationCodes and GenerateForwardDefinition

* Fixed issues

* Supported higher-order grad node generation

* [DoubleGrad PR PaddlePaddle#4] Supported higher-order GradNode generation

* [DoubleGrad PaddlePaddle#4] Bug Fixes to Double Grad Node Generation

* Fixed yaml typo

* Fixed yaml typo

* fixed minor issues

* Fixed minor issue
Zjq9409 pushed a commit that referenced this pull request Apr 7, 2022
…_tensors passed to paddle.grad() (PaddlePaddle#41198)

* [Refactor] refactored eager_gen.py PR PaddlePaddle#2

* [DoubleGrad PR #1] Decoupled code generation logics for Dygraph ForwardFunctions and GradNodes

* Fixed minor issue

* Adjusted logics of GenerateNodeCreationCodes and GenerateForwardDefinition

* Fixed issues

* Supported higher-order grad node generation

* [DoubleGrad PR PaddlePaddle#4] Supported higher-order GradNode generation

* [DoubleGrad PaddlePaddle#4] Bug Fixes to Double Grad Node Generation

* Fixed yaml typo

* Fixed yaml typo

* fixed minor issues

* [DoubleGrad PR PaddlePaddle#5] Enabled gradient computations for grad_tensors passed to paddle.grad()

* Fixed minor issue

* Fixed CI-Inference issue

* Fixed CI-inference issues
Zjq9409 pushed a commit that referenced this pull request Apr 7, 2022
…efore backward run (PaddlePaddle#41306)

* [Refactor] refactored eager_gen.py PR PaddlePaddle#2

* [DoubleGrad PR #1] Decoupled code generation logics for Dygraph ForwardFunctions and GradNodes

* Fixed minor issue

* Adjusted logics of GenerateNodeCreationCodes and GenerateForwardDefinition

* Fixed issues

* Supported higher-order grad node generation

* [DoubleGrad PR PaddlePaddle#4] Supported higher-order GradNode generation

* [DoubleGrad PaddlePaddle#4] Bug Fixes to Double Grad Node Generation

* Fixed yaml typo

* Fixed yaml typo

* fixed minor issues

* [DoubleGrad PR PaddlePaddle#5] Enabled gradient computations for grad_tensors passed to paddle.grad()

* Fixed minor issue

* Fixed CI-Inference issue

* Fixed CI-inference issues

* [DoubleGrad PR PaddlePaddle#7] paddle.grad() to copy backward graph before backward run

* Fixed minor issues

* Fixed issue with backward graph construction logic

* Fixed implementation issues with backward graph reconstruction

* Fixed unittest issue

* Fixed issues
Zjq9409 pushed a commit that referenced this pull request Apr 7, 2022
…atmul (PaddlePaddle#41387)

* [Refactor] refactored eager_gen.py PR PaddlePaddle#2

* [DoubleGrad PR #1] Decoupled code generation logics for Dygraph ForwardFunctions and GradNodes

* Fixed minor issue

* Adjusted logics of GenerateNodeCreationCodes and GenerateForwardDefinition

* Fixed issues

* Supported higher-order grad node generation

* [DoubleGrad PR PaddlePaddle#4] Supported higher-order GradNode generation

* [DoubleGrad PaddlePaddle#4] Bug Fixes to Double Grad Node Generation

* Fixed yaml typo

* Fixed yaml typo

* fixed minor issues

* [DoubleGrad PR PaddlePaddle#5] Enabled gradient computations for grad_tensors passed to paddle.grad()

* Fixed minor issue

* Fixed CI-Inference issue

* Fixed CI-inference issues

* [DoubleGrad PR PaddlePaddle#7] paddle.grad() to copy backward graph before backward run

* Fixed minor issues

* Fixed issue with backward graph construction logic

* Fixed implementation issues with backward graph reconstruction

* Fixed unittest issue

* Fixed issues

* [DoubleGrad PR PaddlePaddle#8] Enabled triple grads for sigmoid and matmul

* Fixed issues with phi kernel

* Added triple grad test case

* Fixed minor issue