Closed
Description
First part:
- Implement CUDNNDeviceContext, MKLDeviceContext hierarchy.
  The decorator design pattern does not fit our needs well, so other work is being done first.
- define LayoutType key: "need to add layout type to fluid" #6827, "add data layout" #6832
- define LibraryType key: "need to split concepts of Place and Library" #6770, "add library type" #6874
- define the new OpKernelType class (four keys: DataType/LayoutType/Place/LibraryType): "need to refine our design of OpKernelType" #6769, "refine OpKernelType" #6879
- refine current kernel register mechanism
- refine CUDNN related operators, change them from operators to kernels: "cudnn operators change to cudnn kernel" #6660
- Remove CUDNNPlace and MKLDNNPlace to follow the new design
- rename GPUPlace to CUDAPlace
- add multikernel python test support
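
As a rough illustration of the four-key idea in the list above, the sketch below models an OpKernelType as a hashable key and uses it to look up a registered kernel. It is a minimal Python sketch, not the framework's C++ implementation: the enum members and the `register_kernel` / `find_kernel` helpers are hypothetical names chosen for illustration, and only OpKernelType, DataType, LayoutType, Place, and LibraryType come from the task list.

```python
from dataclasses import dataclass
from enum import Enum


# The enum members below are illustrative assumptions, not the framework's values.
class DataType(Enum):
    FP32 = "fp32"
    FP64 = "fp64"


class LayoutType(Enum):
    NCHW = "NCHW"
    NHWC = "NHWC"


class Place(Enum):
    CPU = "CPUPlace"
    CUDA = "CUDAPlace"


class LibraryType(Enum):
    PLAIN = "plain"
    CUDNN = "cudnn"
    MKLDNN = "mkldnn"


@dataclass(frozen=True)
class OpKernelType:
    """Four keys identify a kernel; frozen makes instances usable as dict keys."""
    data_type: DataType
    layout: LayoutType
    place: Place
    library: LibraryType


# Hypothetical registry: op name -> {OpKernelType -> kernel function}.
_KERNELS = {}


def register_kernel(op_type, kernel_type, fn):
    """Register one kernel implementation of an operator under its kernel type."""
    _KERNELS.setdefault(op_type, {})[kernel_type] = fn


def find_kernel(op_type, kernel_type):
    """Look up the kernel registered for this exact four-key combination."""
    return _KERNELS[op_type][kernel_type]
```

With a key like this, a cuDNN implementation becomes just another kernel of the same operator that differs only in its LibraryType, which is the motivation for removing CUDNNPlace and MKLDNNPlace.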
Second part:
- refine Tensor implementation, add a layout attribute: "need to add a data member to represent the Layout of a Tensor" #6765, "add data layout" #6955
- refine Python interface and some data operators to set layout.
- share Tensor layout in most operators
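
The layout items above could look roughly like the following. This is a hedged Python sketch under assumed names: the `Tensor` dataclass, `share_layout`, and the toy `relu` operator are hypothetical stand-ins, and the real Tensor is a C++ class whose interface may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Tensor:
    """Toy tensor: a shape, flat data, and the layout attribute from the task list."""
    shape: tuple
    layout: str = "NCHW"  # assumed default, for illustration only
    data: list = field(default_factory=list)


def share_layout(src, dst):
    """Copy the layout tag from an input tensor to an output tensor."""
    dst.layout = src.layout
    return dst


def relu(x):
    """An elementwise operator does not reorder data, so it just shares the layout."""
    out = Tensor(shape=x.shape, data=[max(v, 0.0) for v in x.data])
    return share_layout(x, out)
```

Operators that do not care about element order can forward the input layout unchanged, which is why most operators only need to share it.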
Third part:
- DataTransform function interface.
- DataTransformFn register mechanism: "DataTransformFn register mechanism" #6823
- kernel hint and GetExpectedKernelType: "Impl kernel hint" #6883
- memory switch mechanism: "need to add memory switch mechanism in operator kernel switch" #6989, "add memory switch mechanism in operator kernel switch" #6991
- refine memory switch mechanism in local scope: "need to refine memory switch mechanism in local scope" #7057, "cache memory in local scope" #7058
- implement some basic DataTransformFn
  - CPU <---> CUDA: "implement CPU <---> CUDA DataTransformFn" #7050
  - layout transform, like mkldnn
- change batch norm to use new transform method
- add method to judge which inputs should be transformed refs
- Python interface for setting the kernel hint
- helper function for getting the appropriate DeviceContext: "need to get appropriate device context in switching kernel" #7065, "add helper function to get appropriate DeviceContext" #7066
- Add an MNIST example that switches kernels
- Refine current CRF operator
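
The DataTransformFn items in this part can be pictured as a registry of conversion functions keyed by a (source, destination) pair, consulted whenever an input does not match the kernel type the operator expects. The sketch below is a simplified Python illustration under stated assumptions: the place string stands in for the full OpKernelType, tensors are plain dicts, and `register_data_transform` and `prepare_input` are hypothetical names, not the framework's API.

```python
# Hypothetical registry: (source key, destination key) -> transform function.
_TRANSFORMS = {}


def register_data_transform(src, dst, fn):
    """Register a function that converts a tensor from kernel type src to dst."""
    _TRANSFORMS[(src, dst)] = fn


def cpu_to_cuda(tensor):
    """Stand-in for a host-to-device copy: returns a new tensor tagged as CUDA."""
    return {"place": "CUDA", "data": list(tensor["data"])}


def cuda_to_cpu(tensor):
    """Stand-in for a device-to-host copy: returns a new tensor tagged as CPU."""
    return {"place": "CPU", "data": list(tensor["data"])}


register_data_transform("CPU", "CUDA", cpu_to_cuda)
register_data_transform("CUDA", "CPU", cuda_to_cpu)


def prepare_input(tensor, expected):
    """Return the tensor as-is if it already matches, else a transformed copy."""
    actual = tensor["place"]
    if actual == expected:
        return tensor
    return _TRANSFORMS[(actual, expected)](tensor)
```

Because the transform returns a copy rather than mutating its input, the original tensor stays valid for other consumers. The converted copy would presumably be what the local-scope caching item refers to, though that is an assumption here.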