complete data layout transform #7440
Conversation
… data-layout-transform
| inline bool TransFromNeeded(const OpKernelType& l, const OpKernelType& r) {
|   return (!platform::places_are_same_class(l.place_, r.place_)) ||
|          (l.data_type_ != r.data_type_) || (l.data_layout_ != r.data_layout_);
I believe we are going the wrong way.
Let's start from a simple convolution case:
data -> conv2d -> batch_norm -> fc -> softmax
We may have quite different layout pipeline choices:
NHWC -> NCHW -> NCHW -> NCHW
NHWC -> NHWC -> NCHW -> NCHW
NHWC -> NHWC -> NHWC -> NCHW
....
NCHW -> NCHW -> NCHW -> NCHW
NCHW -> NHWC -> NCHW -> NCHW
....
All of them are legal, but only one of them is the best: in cuDNN, NHWC is 20% faster than the other formats.
So the answer is to select the NHWC pipeline. No matter what the input data format is, just transform the data into NHWC, then go through the data pipeline.
In the case above, I want to point out that we do not run on mixed devices, though we can run on different device types.
This means the pipeline layout can be determined once we have the device information; we only need to transform the data before or after the data pipeline.
Here "data pipeline" refers to conv2d -> batch_norm -> fc -> softmax, without the data input and output.
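The layout choice above can be illustrated with a minimal sketch. This is not Paddle's actual `TransDataLayout`; `NCHWToNHWC` is a hypothetical helper that permutes a dense NCHW buffer into NHWC so the whole pipeline can run in the faster layout:

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not Paddle's TransDataLayout): converts a dense
// NCHW buffer into NHWC by recomputing the flat index of each element.
std::vector<float> NCHWToNHWC(const std::vector<float>& in,
                              int n, int c, int h, int w) {
  std::vector<float> out(in.size());
  for (int ni = 0; ni < n; ++ni)
    for (int ci = 0; ci < c; ++ci)
      for (int hi = 0; hi < h; ++hi)
        for (int wi = 0; wi < w; ++wi)
          // NCHW flat index -> NHWC flat index
          out[((ni * h + hi) * w + wi) * c + ci] =
              in[((ni * c + ci) * h + hi) * w + wi];
  return out;
}
```

A real implementation would of course work on strided device memory rather than a `std::vector`, but the index arithmetic is the same.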
To be clear, we support multiple devices, but not mixed devices.
Yes, I agree with you: we have a data processing chain, and there should be some best practice. On the other hand, the current data transform only transforms data according to expected_kernel_type and actual_kernel_type_for_var; it does not decide which expected_kernel_type should be used. I think finding the best expected_kernel_type should be implemented in GetExpectedKernelType, by the user or by the framework, so this does not conflict with the current implementation.
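The separation described here, deciding the kernel type first and only then checking whether a transform is needed, can be sketched as follows. `KernelType` and `NeedTransform` are simplified stand-ins (plain string comparison instead of `platform::places_are_same_class`), not the real `OpKernelType`:

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for OpKernelType (illustrative only).
struct KernelType {
  std::string place, dtype, layout;
};

// Mirrors the TransFromNeeded check in this PR: a transform is required
// whenever place, data type, or layout differ between the variable's
// actual kernel type and the expected kernel type.
bool NeedTransform(const KernelType& var, const KernelType& expected) {
  return var.place != expected.place || var.dtype != expected.dtype ||
         var.layout != expected.layout;
}
```

Choosing the best `expected` value (e.g. NHWC on cuDNN) stays in `GetExpectedKernelType`; the transform machinery only compares the two types.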
paddle/framework/data_transform.cc
Outdated
| // out_ptr is used to store the result of a data transform function.
| // When one transform is done, in_ptr should be deleted and out_ptr
| // should be assigned to in_ptr for the next data transformation.
| Tensor* out_ptr = new Tensor();
Using new and delete inside DataTransform looks awful. Furthermore, the memory optimizer cannot track these allocations.
How about creating 3 Tensors (one for each transform channel) in the scope?
Thanks for this comment~~
This data transform is a runtime process and will not appear in the ProgramDesc, so the memory optimizer can never reach here; it is like inner logic inside an Operator.
I think we should follow two principles here:
- inner tensors should not be visible from outside
- inner tensors should be freed as soon as possible

I will find a better way to do this.
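One way to satisfy both principles without raw new/delete is to keep two stack-allocated buffers and swap pointers after each step, so intermediates never escape the function and are freed automatically. `Tensor` and `ScaleInPlaceChain` below are hypothetical stand-ins, not Paddle's types; the doubling step is a placeholder for a real layout/dtype/device transform:

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical Tensor stand-in to illustrate the double-buffer idea.
struct Tensor { std::vector<float> data; };

// Chains three placeholder transforms (each doubles every element)
// through two local buffers, swapping in/out pointers between steps.
void ScaleInPlaceChain(const Tensor* input, Tensor* output) {
  Tensor buf_a = *input;  // working input
  Tensor buf_b;           // working output
  Tensor* in_ptr = &buf_a;
  Tensor* out_ptr = &buf_b;

  for (int step = 0; step < 3; ++step) {  // e.g. layout, dtype, device
    out_ptr->data.clear();
    for (float v : in_ptr->data) out_ptr->data.push_back(v * 2.0f);
    std::swap(in_ptr, out_ptr);  // result becomes next step's input
  }
  *output = *in_ptr;  // after the final swap, in_ptr holds the result
}
```

Both buffers are destroyed when the function returns, so nothing leaks and nothing is visible from outside.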
paddle/framework/data_transform.cc
Outdated
|                        kernel_type_for_var.data_layout_)) {
|   TransDataLayout(kernel_type_for_var, expected_kernel_type, *in_ptr,
|                   out_ptr);
|   free_tmp_tensor(&input_tensor, in_ptr);
free_tmp_tensor looks awful too...
dzhwinter left a comment
Basically LGTM. Please fix the flaws in Next PR.
| }
| PADDLE_ENFORCE_NOT_NULL(out, "out should not be null");
| PADDLE_ENFORCE(transformed, "no transform is done, please check!");
Comments should start with an uppercase letter and be in written English:
"No transform is applied"
| Tensor in;
| in.ShareDataWith(input_tensor);
| Tensor out;
Where is the data type transform?
| namespace paddle {
| namespace framework {
|
| static void PassTensorData(Tensor* from, Tensor* to) {
fix: #7436