
PaddingRNN model memory optimize #16144

Merged
sneaxiy merged 5 commits into PaddlePaddle:develop from sneaxiy:rnn_mem_opt on Mar 14, 2019

Conversation

@sneaxiy (Collaborator) commented Mar 11, 2019

Refine the cross_entropy and expand ops to save memory. About 12 GB of GPU memory would be saved.

sneaxiy added 2 commits March 11, 2019 04:04
test=develop
@sneaxiy sneaxiy requested a review from chengduoZH March 11, 2019 04:06
framework::slice_ddim(label_dims, 0, rank - 1),
"Input(X) and Input(Label) shall have the same shape "
"except the last dimension.");
}
Contributor:

Why make a different check here?

Collaborator Author:

It is just copied from CrossEntropyOp.

public:
using framework::OperatorWithKernel::OperatorWithKernel;

void InferShape(framework::InferShapeContext* ctx) const override {
Contributor:

Most of the logic of InferShape is the same as CrossEntropyOp::InferShape; you could share it through a base class and inherit from it.

Collaborator Author:

Done.
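The refactor agreed on above (move the shared shape checks into a base class so the new op inherits them instead of duplicating them) can be sketched in self-contained C++. All names here (`InferContext`, the `LastDimOfOutput` hook, `CrossEntropyOp2`) are illustrative stand-ins, not Paddle's actual API:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified stand-in for framework::InferShapeContext.
struct InferContext {
  std::vector<int64_t> x_dims;
  std::vector<int64_t> label_dims;
  std::vector<int64_t> out_dims;  // filled in by InferShape
};

class CrossEntropyOpBase {
 public:
  virtual ~CrossEntropyOpBase() = default;

  // Shared logic: ranks must match, X and Label must agree on every
  // dimension except the last, and the output keeps X's leading dims.
  void InferShape(InferContext* ctx) const {
    assert(ctx->x_dims.size() == ctx->label_dims.size());
    for (std::size_t i = 0; i + 1 < ctx->x_dims.size(); ++i) {
      assert(ctx->x_dims[i] == ctx->label_dims[i]);
    }
    ctx->out_dims = ctx->x_dims;
    ctx->out_dims.back() = LastDimOfOutput(ctx);
  }

 protected:
  // The only piece that differs between the two ops.
  virtual int64_t LastDimOfOutput(const InferContext*) const { return 1; }
};

// The memory-optimized variant inherits the whole check unchanged
// instead of copy-pasting InferShape.
class CrossEntropyOp2 : public CrossEntropyOpBase {};
```

The derived op only overrides the hook when its output shape differs, which is the shape the eventual `cross_entropy_op_base.h` split takes.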


template <typename T>
struct CrossEntropyForwardFunctor {
CrossEntropyForwardFunctor(const T *x, T *y, const int64_t *label,
Contributor:

Why don't you reuse CrossEntropyKernel?
If platform::ForRange is faster, we should replace CrossEntropyKernel with platform::ForRange.

Collaborator Author:

It is not about speed. I just made a separate op here to avoid confusion with the original cross entropy op.
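The functor-plus-`ForRange` pattern under discussion can be sketched as a minimal CPU-only example. The field names (`label_`, `ignore_index_`, `feature_size_`) follow the diff excerpt above; the serial `ForRange` stand-in and the exact loss formula are assumptions for illustration, not the PR's actual kernel:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// One loss value per row: y[i] = -log(x[i][label[i]]), with rows whose
// label equals ignore_index contributing zero loss.
template <typename T>
struct CrossEntropyForwardFunctor {
  const T* x_;            // probabilities, shape [n, feature_size]
  T* y_;                  // output losses, shape [n]
  const int64_t* label_;  // shape [n]
  int64_t ignore_index_;
  int64_t feature_size_;

  void operator()(int64_t i) const {
    int64_t lbl = label_[i];
    y_[i] = (lbl == ignore_index_)
                ? static_cast<T>(0)
                : -std::log(x_[i * feature_size_ + lbl]);
  }
};

// Serial stand-in for platform::ForRange: apply the functor to each index.
// On GPU, the same functor body would run once per thread index instead.
template <typename Functor>
void ForRange(int64_t limit, Functor f) {
  for (int64_t i = 0; i < limit; ++i) f(i);
}
```

Because the functor reads `x_` and `label_` elementwise and writes only the per-row loss, no intermediate tensor the size of `x` needs to be materialized, which is where the memory saving comes from.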

const int64_t *label_;
int64_t ignore_index_;
int64_t feature_size_;
};
Contributor:

The above code should be placed in cross_entropy.h.

Collaborator Author:

Done. Moved to cross_entropy_op_base.h.

namespace ops = paddle::operators;
REGISTER_OPERATOR(expand, ops::ExpandOp, ops::ExpandOpMaker,
-                 paddle::framework::DefaultGradOpDescMaker<true>);
+                 ops::ExpandGradOpDescMaker);
Contributor:

We should take care of compatibility. The saved training model may be unavailable if we replace DefaultGradOpDescMaker<true> with ops::ExpandGradOpDescMaker directly.

Collaborator Author:

Compatibility is OK. The saved training model may contain extra variable names, but it does not matter.
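The reason a hand-written grad-op maker saves memory can be sketched with a toy model (plain C++, not Paddle's API): `DefaultGradOpDescMaker<true>` conservatively wires every forward input, output, and output gradient into the grad op, forcing them all to stay alive; a dedicated `ExpandGradOpDescMaker` would list only what the backward pass actually reads. The variable names and the exact input set below are illustrative assumptions:

```cpp
#include <cassert>
#include <set>
#include <string>

using VarSet = std::set<std::string>;

// Toy model of DefaultGradOpDescMaker<true>: the grad op depends on
// every forward input, every forward output, and every output gradient,
// so all of them must be retained through the backward pass.
VarSet DefaultGradInputs(const VarSet& fwd_inputs, const VarSet& fwd_outputs,
                         const VarSet& out_grads) {
  VarSet all = fwd_inputs;
  all.insert(fwd_outputs.begin(), fwd_outputs.end());
  all.insert(out_grads.begin(), out_grads.end());
  return all;
}

// Toy model of a dedicated ExpandGradOpDescMaker: expand_grad needs only
// X (for the original shape) and Out@GRAD, so the forward output Out can
// be freed as soon as the forward pass is done with it.
VarSet ExpandGradInputs(const VarSet& fwd_inputs,
                        const VarSet& /*fwd_outputs*/,
                        const VarSet& out_grads) {
  VarSet needed = fwd_inputs;
  needed.insert(out_grads.begin(), out_grads.end());
  return needed;
}
```

The compatibility concern above follows from the same picture: an old saved model may still name `Out` as a grad-op input, but an unused extra name is harmless, while the new maker never requires a variable the old graph lacks.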

test=develop
@@ -0,0 +1,137 @@
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved.
Contributor:

2016 -> 2019

sneaxiy added 2 commits March 13, 2019 11:03
test=develop
test=develop
@chengduoZH (Contributor) left a comment:

LGTM

@sneaxiy merged commit 0b49e43 into PaddlePaddle:develop on Mar 14, 2019
2 participants