
refine softmax fwd on CPU #17522

Merged
tensor-tang merged 5 commits into PaddlePaddle:develop from tensor-tang:softmax_fwd
May 23, 2019
Conversation

Contributor

@tensor-tang tensor-tang commented May 21, 2019

Improves softmax fwd performance ~6x on 2620 v3.

Before: [benchmark image]

After: [benchmark image]

test=develop
@tensor-tang tensor-tang requested a review from luotao1 May 21, 2019 06:32
@luotao1 luotao1 requested review from a user and jczaja May 21, 2019 06:49
Contributor

luotao1 commented May 21, 2019

> If the reason is that the input is too small for vsExp, then how about moving vsExp out of the batch loop? vsExp takes most of the time, so perhaps we could first compute the max and subtraction in a loop over all batches, then run vsExp on all the data at once, followed by a loop executing the sums and scales?

@jczaja Do you mean update this PR?

Contributor

@jczaja jczaja left a comment

@luotao1 That was just a suggestion on how to implement the softmax op to deal with small inputs (while batches are bigger). It is not directly related to this PR.
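The restructuring jczaja suggests above (per-row max/subtract first, then one large vectorized exp over the whole buffer, then per-row sums and scales) could be sketched in NumPy as below. This is an illustrative sketch, not the PR's actual C++ code; `np.exp` stands in for MKL's vsExp, and the function name is made up:

```python
import numpy as np

def softmax_batched_exp(x):
    """Softmax over the last axis, restructured as suggested in the
    thread: do the max/subtract per row first, then one large exp call
    over all rows at once (so the vectorized exp sees a big input even
    when each row is small), then the per-row sum and scale.
    Hypothetical sketch, not PaddlePaddle's implementation."""
    # Pass 1: per-row max and subtraction (numerical stability).
    shifted = x - x.max(axis=-1, keepdims=True)
    # Pass 2: one exp over the whole buffer instead of one per batch.
    e = np.exp(shifted)
    # Pass 3: per-row sum and scale.
    return e / e.sum(axis=-1, keepdims=True)
```

The point of the restructuring is amortization: a single large vsExp call over `batch * row_size` elements is cheaper than many small calls of `row_size` elements each.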

jczaja previously approved these changes May 21, 2019

@jczaja jczaja left a comment

LGTM

Contributor

jczaja commented May 21, 2019

@tensor-tang Just one more question: this code is to be run for inference only, i.e. it is enabled when PaddlePaddle is built for inference, and it omits the ValueClip that is needed in softmax fwd training. But the performance results shown are from training, which should not include this optimization?

@tensor-tang
Contributor Author

This is used for training actually.
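For context on the ValueClip mentioned above, a value clip in the training path would look roughly like the sketch below: after subtracting the row max, very negative values are clamped before exp. The clip threshold here is an illustrative constant chosen for this sketch, not necessarily the one PaddlePaddle uses:

```python
import numpy as np

CLIP_MIN = -64.0  # illustrative threshold, not Paddle's actual value

def softmax_with_value_clip(x):
    """Softmax fwd with a value clip of the kind mentioned for the
    training path: clamp (x - max) from below before exp so extremely
    negative inputs do not produce denormal/underflowed exponentials.
    Hypothetical sketch only."""
    shifted = x - x.max(axis=-1, keepdims=True)
    clipped = np.maximum(shifted, CLIP_MIN)  # the "ValueClip" step
    e = np.exp(clipped)
    return e / e.sum(axis=-1, keepdims=True)
```

Since `x - max` is always <= 0, the clip only affects elements far below the row maximum, whose softmax probability is effectively zero either way.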


Contributor

@luotao1 luotao1 left a comment


LGTM

@tensor-tang tensor-tang merged commit 0600b37 into PaddlePaddle:develop May 23, 2019
@tensor-tang tensor-tang deleted the softmax_fwd branch May 23, 2019 11:04


3 participants