
refine softmax fwd on CPU #17522

Merged
tensor-tang merged 5 commits into PaddlePaddle:develop from tensor-tang:softmax_fwd
May 23, 2019
Conversation

Contributor

@tensor-tang tensor-tang commented May 21, 2019

Improves softmax fwd performance ~6x on 2620 v3.

Before: [benchmark image]

After: [benchmark image]

test=develop
@tensor-tang tensor-tang requested a review from luotao1 May 21, 2019 06:32
@luotao1 luotao1 requested review from a user and jczaja May 21, 2019 06:49
Contributor

luotao1 commented May 21, 2019

> If the reason is that the input is too small for vsExp, then how about moving vsExp out of the batch loop? vsExp takes most of the time, so perhaps we could first compute the max and subtraction in a loop over all batches, then run vsExp on all the data at once, followed by a loop executing the sums and scales?

@jczaja Do you mean update this PR?

Contributor

@jczaja jczaja left a comment

@luotao1 That was just a suggestion on how to implement the softmax op to deal with small inputs (while batches are bigger). It is not directly related to this PR.
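The restructuring jczaja suggests above (per-row max/subtract first, then one large vectorized exp over the whole buffer, then per-row sums and scales) could be sketched in NumPy as below. This is an illustrative sketch, not the PR's actual C++ code; `np.exp` stands in for MKL's vsExp, and the function name is made up:

```python
import numpy as np

def softmax_batched_exp(x):
    """Softmax over the last axis, restructured as suggested in the
    thread: do the max/subtract per row first, then one large exp call
    over all rows at once (so the vectorized exp sees a big input even
    when each row is small), then the per-row sum and scale.
    Hypothetical sketch, not PaddlePaddle's implementation."""
    # Pass 1: per-row max and subtraction (numerical stability).
    shifted = x - x.max(axis=-1, keepdims=True)
    # Pass 2: one exp over the whole buffer instead of one per batch.
    e = np.exp(shifted)
    # Pass 3: per-row sum and scale.
    return e / e.sum(axis=-1, keepdims=True)
```

The point of the restructuring is amortization: a single large vsExp call over `batch * row_size` elements is cheaper than many small calls of `row_size` elements each.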

jczaja previously approved these changes May 21, 2019

@jczaja jczaja left a comment

LGTM

Contributor

jczaja commented May 21, 2019

@tensor-tang Just one more question: this code is to be run for inference only, i.e. it is enabled when PaddlePaddle is built for inference, and it omits the ValueClip that is needed in softmax fwd training. But the performance results shown are from training, which should not include this optimization?

@tensor-tang
Contributor Author

This is used for training actually.
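For context on the ValueClip mentioned above, a value clip in the training path would look roughly like the sketch below: after subtracting the row max, very negative values are clamped before exp. The clip threshold here is an illustrative constant chosen for this sketch, not necessarily the one PaddlePaddle uses:

```python
import numpy as np

CLIP_MIN = -64.0  # illustrative threshold, not Paddle's actual value

def softmax_with_value_clip(x):
    """Softmax fwd with a value clip of the kind mentioned for the
    training path: clamp (x - max) from below before exp so extremely
    negative inputs do not produce denormal/underflowed exponentials.
    Hypothetical sketch only."""
    shifted = x - x.max(axis=-1, keepdims=True)
    clipped = np.maximum(shifted, CLIP_MIN)  # the "ValueClip" step
    e = np.exp(clipped)
    return e / e.sum(axis=-1, keepdims=True)
```

Since `x - max` is always <= 0, the clip only affects elements far below the row maximum, whose softmax probability is effectively zero either way.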


Contributor

@luotao1 luotao1 left a comment


LGTM

@tensor-tang tensor-tang merged commit 0600b37 into PaddlePaddle:develop May 23, 2019
@tensor-tang tensor-tang deleted the softmax_fwd branch May 23, 2019 11:04


3 participants