[PHI] Fixed logsumexp precision problem by Enigmatisms · Pull Request #72681 · PaddlePaddle/Paddle

Enigmatisms · 2025-05-13T02:55:50Z

PR Category

Operator Mechanism

PR Types

Bug fixes

Description

Fixes a precision problem in paddle.logsumexp that appears even when the tensor is not particularly large. The following example reproduces it:

import numpy as np
import paddle
import torch

def test_logsumexp(config):
    # config = [shape, dtype, axis, keepdim]
    ts = paddle.randn(config[0], dtype=config[1])
    vp = paddle.logsumexp(ts, axis=config[2], keepdim=config[3])
    tst = torch.from_numpy(ts.numpy())

    keepdim = False if config[3] is None else config[3]
    np.testing.assert_allclose(
        torch.logsumexp(tst, dim=config[2], keepdim=keepdim).numpy(), vp.numpy(), rtol=1e-6, atol=1e-6)

def test_single():
    # This case is 100% correct.
    config = [[2, 131072, 4, 5], "float32", (-1,), False]
    test_logsumexp(config)

    # This case fails the precision check (31.25% of the values are wrong).
    config = [[2, 262144, 4, 5], "float32", (-1,), False]
    test_logsumexp(config)
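
As a side note, the fraction of mismatched values can also be measured without torch. The block below is a minimal NumPy sketch, not part of the PR: `logsumexp_ref` and `mismatch_fraction` are hypothetical helper names, and the reference is computed in float64 with the usual max-shift trick.

```python
import numpy as np

def logsumexp_ref(x, axis=-1, keepdims=False):
    """Max-shifted log-sum-exp reference, computed in float64."""
    x = np.asarray(x, dtype=np.float64)
    m = np.max(x, axis=axis, keepdims=True)
    # If the maximum is +/-inf, shift by 0 instead so (x - m) never yields NaN.
    m = np.where(np.isfinite(m), m, 0.0)
    out = np.log(np.sum(np.exp(x - m), axis=axis, keepdims=True)) + m
    return out if keepdims else np.squeeze(out, axis=axis)

def mismatch_fraction(actual, expected, rtol=1e-6, atol=1e-6):
    """Fraction of elements where |actual - expected| > atol + rtol * |expected|."""
    ok = np.isclose(actual, expected, rtol=rtol, atol=atol)
    return 1.0 - float(np.mean(ok))
```

Assuming `ts` and `vp` from the test, `mismatch_fraction(vp.numpy(), logsumexp_ref(ts.numpy(), axis=-1))` would give the share of outputs outside tolerance. The tolerances may need loosening when comparing a float32 result against a float64 reference.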

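Testing showed that in the example above, when the shape dimension is 131072 the computed grid size is 32768, but when that dimension doubles the grid size does not double (it is 55296 rather than 65536). The cause is the GetNumBlocks function in the code modified here. That function essentially computes the maximum number of blocks from the number of threads available across all SMs on the GPU, which is partly a performance consideration, but its logic is incorrect for this operator: as the tensor grows, the grid size does not grow proportionally, threads do not take on more work through coarsening, and the block size does not change either, so part of the tensor never takes part in the computation.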

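[Deprecated] Remove the GetNumBlocks function so that the grid size has no upper bound, and rely on the scheduler. This works, but it is rather brute-force.
Current: fixed the bug in the function that occurs when the grid size is too large (in practice, only one line needed to change).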
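
The effect of a capped grid can be sketched with a small pure-Python simulation. This is an illustration under stated assumptions, not the kernel from the PR: the "tile" unit of work and the function `tiles_processed` are hypothetical, the real mapping of elements to blocks is not shown in the description, and the grid-stride loop is one common way to stay correct under a cap, not necessarily the one-line change made here. Only the two grid sizes (65536 expected, 55296 observed) come from the description, and the sketch does not try to reproduce the 31.25% mismatch figure.

```python
def tiles_processed(num_tiles, max_blocks, grid_stride):
    """Simulate a 1-D launch in which each block handles one tile per iteration.

    num_tiles   : tiles needed to cover the tensor (hypothetical unit of work)
    max_blocks  : cap on the grid size, or None for no cap
    grid_stride : if True, each block keeps looping with a stride of the grid size
    Returns (grid size used, number of distinct tiles visited).
    """
    grid = num_tiles if max_blocks is None else min(num_tiles, max_blocks)
    visited = [False] * num_tiles
    for block in range(grid):
        tile = block
        while tile < num_tiles:
            visited[tile] = True
            if not grid_stride:
                break  # the block exits after its first tile
            tile += grid
    return grid, sum(visited)

NEEDED = 65536  # grid size the description expected for the larger shape
CAP = 55296     # grid size that was actually computed

print(tiles_processed(NEEDED, CAP, grid_stride=False))   # capped, no loop: tiles are skipped
print(tiles_processed(NEEDED, None, grid_stride=False))  # no cap: every tile gets a block
print(tiles_processed(NEEDED, CAP, grid_stride=True))    # capped, blocks loop: every tile visited
```

In this toy model the capped launch without a loop visits only 55296 of the 65536 tiles, while both removing the cap and looping inside each block visit all of them.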

Pcard-89620

paddle-bot · 2025-05-13T02:55:55Z

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

* [PHI] Debug for logsumexp, bug source found
* [PHI] Removed GetNumBlocks func to get correct logsumexp
* [PHI] Removed redundant debug VLOG
* [PHI] Elegant grid bounded solution

* refine forrange (#72360)
* refine forrange
* refine forrange
* reduce support big tensor (#71970)
* reduce support big tensor
* [PHI] Fix gridDim limit for reduce kernel (#72507)
* [API] isclose support bigtensor (#72516)
* isclose support bigtensor
* refine
* [API] isnan isinf isfinite support bigtensor (#72517)
* isnan isinf isfinite support bigtensor
* refine
* [PHI] Fix cum kernel for big tensor (#72562)
* [PHI] Preliminary fix for elementwise broadcast int32 shape overflow (#72584)
* [PHI] Align linalg.solve kernel with torch (#72608)
* Update strided copy kernel (#72662)
* [PHI] Fix grid sample kernel for big tensor (#72628)
* [PHI] Fix argsort big tensor bug (#72712)
* [PHI] Fixed argsort big tensor bug
* [PHI] Fixed shape mismatch problem.
* [PHI] Fix contiguous kernel for big tensor (#72705)
* [PHI] Fix flatten and split kernel for big tensor (#72634)
* [PHI] Fix out-of-bound issue of paddle.take_along_axis (#72757)
* [PHI] fix paddle.diag with big tensor (#72638)
* [API] fix paddle.cross with big tensor (#72652)
* [PHI] Fix paddle.where api for big tensor (#72717)
* [PHI] Fix bincount kernel for big tensor (#72706)
* fix bincount kernel for big tensor
* use HostAlloc to alloc memory
* add cpu test case
* [PHI] Fix full_like kernel for big tensor (#72831)
* [API] Fix int overflow and float16 support for paddle.frac (#72815)
* [PHI] Align paddle.inner with torch in matmul logic (#72843)
* [PHI] Fix paddle.var & paddle.std float16 overflow (#72650)
* [PHI] Fix logsumexp precision problem (#72681)
* [PHI] Debug for logsumexp, bug source found
* [PHI] Removed GetNumBlocks func to get correct logsumexp
* [PHI] Removed redundant debug VLOG
* [PHI] Elegant grid bounded solution
* [Accuracy diff No.55-56、76-77] Fix accuracy diff for var&std API (#72879)
* [Accuracy diff No.21] Fix accuracy diff for heaviside API (#72894)

---------

Co-authored-by: Shuhao Liang <50269654+lshpku@users.noreply.github.com>
Co-authored-by: Qianyue He <46109954+Enigmatisms@users.noreply.github.com>
Co-authored-by: Lei Ding <69283446+Dmovic@users.noreply.github.com>
Co-authored-by: ggggxm <66855582+ggggxm@users.noreply.github.com>
Co-authored-by: xkkkkkk23 <xiekeke@baidu.com>
Co-authored-by: Zx <zhangxiao35@baidu.com>
Co-authored-by: huangjiyi <43315610+huangjiyi@users.noreply.github.com>
Co-authored-by: ooo oo <106524776+ooooo-create@users.noreply.github.com>

Enigmatisms added 2 commits May 13, 2025 02:53

[PHI] Debug for logsumexp, bug source found

563efd6

[PHI] Removed GetNumBlocks func to get correct logsumexp

98d90b7

Enigmatisms added 2 commits May 13, 2025 11:33

[PHI] Removed redundant debug VLOG

71c5072

[PHI] Elegant grid bounded solution

7a1bd63

lshpku approved these changes May 16, 2025

lshpku merged commit 3a7b37e into PaddlePaddle:develop May 16, 2025
49 of 50 checks passed

lshpku merged 4 commits into PaddlePaddle:develop from Enigmatisms:precision_fix1
