[Accuracy diff No.78、142、143] Improve get_numpy_tensor for rpow and pow #528

zrr1999 · 2025-08-18T13:24:07Z

Affected APIs

paddle.Tensor.__pow__
paddle.Tensor.pow
paddle.pow
paddle.Tensor.__rpow__

Approach

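These APIs compute $y = a^b$, so there are three cases to consider, depending on the input types:

Both a and b are tensors: restrict a to positive values, because a negative base raised to some non-integer powers (for example 1/2 or 3/4) has no real value. This may need further changes later.
Only b is a tensor: restrict the randomly generated data according to the result of Derivation A.
Only a is a tensor: restrict the randomly generated data according to the result of Derivation B (when the constant b is not an integer, additionally prevent a from being negative).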
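
The three cases above can be sketched as a small dispatch function. This is only an illustration: the function name `classify_pow_inputs` and the returned labels are hypothetical and are not taken from the PR's code.

```python
def classify_pow_inputs(base_is_tensor, exponent_is_tensor, const=None):
    """Pick the sampling restriction for y = a**b (hypothetical labels)."""
    if base_is_tensor and exponent_is_tensor:
        # Both operands are random: keep the base positive so that
        # non-integer exponents stay in the real numbers.
        return "positive_base"
    if exponent_is_tensor:
        # Constant base, random exponent: bound the exponent.
        return "bounded_exponent"
    if base_is_tensor:
        # Constant exponent, random base: bound the base, and forbid
        # negative bases when the exponent is not an integer.
        if const is not None and not float(const).is_integer():
            return "bounded_nonnegative_base"
        return "bounded_base"
    raise ValueError("at least one operand must be a tensor")
```

For example, `classify_pow_inputs(True, False, const=0.5)` selects the non-negative variant, because 0.5 is not an integer.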

Cases remaining after merge

paddle.pow(Tensor([2, 3, 4],"float32"), Tensor([],"float32"), )
paddle.pow(Tensor([20, 1],"float32"), Tensor([],"float32"), )
paddle.pow(Tensor([20000, 1],"float32"), Tensor([],"float32"), )
paddle.pow(Tensor([20600, 1],"float32"), Tensor([],"float32"), )
paddle.pow(Tensor([4, 3, 2],"float32"), Tensor([4, 3, 2],"float16"), )
paddle.pow(Tensor([4, 3, 2],"float64"), Tensor([4, 3, 2],"float16"), )
paddle.pow(Tensor([4, 3, 2],"float64"), Tensor([4, 3, 2],"float32"), )
paddle.pow(Tensor([5, 9, 7],"float64"), Tensor([7],"float64"), )
paddle.pow(Tensor([],"float32"), Tensor([209],"float32"), )

Derivation A

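Let MAX be the largest value the dtype can represent and B the constant base; without loss of generality, assume B > 1.
Then
y = B^x
dy/dx = y*lnB
To avoid overflow, we need y < MAX and dy/dx < MAX.

Setting y = MAX gives
B^x = MAX
x*lnB = lnMAX
x = lnMAX/lnB

Setting dy/dx = MAX gives B^x*lnB = MAX, so
x = ln(MAX/lnB)/lnB

Note:
When 0 < B < 1, the same limit can be obtained by using 1/B. In that case y and the magnitude of dy/dx are largest when x takes its minimum value, and the minimum and maximum of x are negatives of each other, so this is equivalent to replacing B with its reciprocal.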
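
A minimal sketch of this bound, assuming a float32 target and a symmetric sampling range [-limit, limit] for x. The names `exponent_limit` and `FLOAT32_MAX` are illustrative and are not taken from the PR.

```python
import math

# Largest finite float32 value, (2 - 2**-23) * 2**127, held as a Python float.
FLOAT32_MAX = (2.0 - 2.0 ** -23) * 2.0 ** 127


def exponent_limit(base, max_value=FLOAT32_MAX):
    """Upper bound on |x| so that base**x and its derivative stay below max_value."""
    if base <= 0 or base == 1:
        raise ValueError("this sketch only covers base > 0 and base != 1")
    if base < 1:
        # For 0 < base < 1 the extremes occur at the most negative x,
        # which is equivalent to using the reciprocal base.
        base = 1.0 / base
    ln_b = math.log(base)
    x_value = math.log(max_value) / ln_b        # from base**x = MAX
    x_grad = math.log(max_value / ln_b) / ln_b  # from base**x * ln(base) = MAX
    return min(x_value, x_grad)
```

For bases below e the value bound is the tighter one, and for bases above e the gradient bound is tighter, so taking the minimum covers both.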

Derivation B

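Let MAX be the largest value the dtype can represent and B the constant exponent; assume x != 0, B != 1, and B != 0 (these cases are handled specially).
Then
y = x^B
dy/dx = Bx^(B-1)
To avoid overflow, we need y < MAX and dy/dx < MAX.
Setting y = MAX gives
x^B = MAX
x = MAX^(1/B)
Setting dy/dx = MAX gives
x = (MAX/B)^(1/(B-1))

To simplify the implementation and improve performance, a fixed value is used when B is less than 2.

Because the gradient limit is cumbersome to compute, relax it: it is enough to find a smaller upper bound that is easy to compute.
For B > 1 and MAX/B > 1 we have 1/(B-1) > 1/B, so (MAX/B)^(1/(B-1)) > (MAX/B)^(1/B)
Also MAX^(1/B) > (MAX/B)^(1/B), so it suffices to require x < (MAX/B)^(1/B).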
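
The three bounds can be compared numerically with a short sketch, assuming float32 and B >= 2. The name `base_limits` is hypothetical, and the fixed value used for B < 2 is not given in this description, so that branch is deliberately left out.

```python
# Largest finite float32 value, (2 - 2**-23) * 2**127, held as a Python float.
FLOAT32_MAX = (2.0 - 2.0 ** -23) * 2.0 ** 127


def base_limits(exponent, max_value=FLOAT32_MAX):
    """Return (value_bound, grad_bound, relaxed_bound) for y = x**exponent."""
    if exponent < 2:
        # The PR description uses a fixed value here; it is not modeled.
        raise ValueError("this sketch only covers exponent >= 2")
    value_bound = max_value ** (1.0 / exponent)                      # x**B = MAX
    grad_bound = (max_value / exponent) ** (1.0 / (exponent - 1.0))  # B*x**(B-1) = MAX
    relaxed_bound = (max_value / exponent) ** (1.0 / exponent)       # cheaper common bound
    return value_bound, grad_bound, relaxed_bound


for b in (2.0, 3.0, 7.5):
    value_bound, grad_bound, relaxed_bound = base_limits(b)
    # The relaxed bound sits below both exact bounds.
    print(b, relaxed_bound < value_bound, relaxed_bound < grad_bound)
```

Sampling x below the relaxed bound keeps both y and dy/dx under MAX, at the cost of a somewhat narrower range than the exact bounds would allow.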

Notes

Precision issues related to cast and astype have already been fixed in other PRs.

paddle-bot · 2025-08-18T13:24:12Z

Thanks for your contribution!

wanghuancoder

LGTM

improve pow

5de6c19

zrr1999 changed the title ~~Improve get_numpy_tensor for rpow and pow~~ [Accuracy diff No.78、142、143] Improve get_numpy_tensor for rpow and pow Aug 19, 2025

luotao1 mentioned this pull request Aug 19, 2025

[Open Source Task] Paddle CPU/GPU Kernel accuracy issue rollout PaddlePaddle/Paddle#72667

Open

zrr1999 mentioned this pull request Aug 20, 2025

Improve PowKernel and PowGradKernel for GPU PaddlePaddle/Paddle#74638

Merged

zrr1999 added 2 commits August 20, 2025 13:03

add is_base_arg and int(const) != const

d945a31

update

823c0e1

wanghuancoder approved these changes Aug 22, 2025

View reviewed changes

wanghuancoder merged commit f5cccef into PFCCLab:main Aug 22, 2025

This was referenced Sep 11, 2025

LinspaceKernel uses the dtype of 'self' as the type of 'step' when tensor is floating PaddlePaddle/Paddle#75238

Merged

CallScalarFunction uses the dtype of 'self' as the type of 'other' when opotype is 'div' PaddlePaddle/Paddle#75237

Merged
