LinspaceKernel uses the dtype of 'self' as the type of 'step' when tensor is floating #75238
Your PR was submitted successfully. Thank you for contributing to this open-source project!
Codecov Report
✅ All modified and coverable lines are covered by tests.
Coverage diff (develop vs. #75238; no baseline value is reported for develop):
Coverage: 100.00%
Files: 2
Lines: 5
Branches: 0
Hits: 5
Misses: 0
Partials: 0
Force-pushed from d4e74bb to 1b2f256
A-nnonymous
left a comment
Need some modifications
double step = (static_cast<double>(stop_data - start_data)) / (num - 1);
// step should be of StepT type
StepT step =
    (static_cast<StepT>(stop_data) - static_cast<StepT>(start_data)) /
When StepT is an integral type, should we add a check that the range stop - start actually contains num - 1 valid integer values?
StepT should never be an integer type here. If T is an integer type, it should be converted to double via `using StepT = std::conditional_t<std::is_integral_v<T>, double, T>;`
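To make the type selection above concrete, here is a minimal host-side sketch, not the PR's actual kernel. The names `StepTypeFor` and `LinspaceRef` are hypothetical and only illustrate how a step type derived with `std::conditional_t` behaves for integral and floating-point `T`.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical alias (not Paddle's code): integral T gets a double step,
// floating-point T keeps its own type for the step.
template <typename T>
using StepTypeFor = std::conditional_t<std::is_integral_v<T>, double, T>;

static_assert(std::is_same_v<StepTypeFor<int>, double>);
static_assert(std::is_same_v<StepTypeFor<float>, float>);

// Hypothetical CPU reference: out[i] = start + i * step, computed in the
// step type and cast back to T (truncating toward zero for integral T).
template <typename T>
std::vector<T> LinspaceRef(T start, T stop, std::int64_t num) {
  std::vector<T> out;
  if (num <= 0) return out;
  out.resize(static_cast<std::size_t>(num));
  if (num == 1) {
    out[0] = start;
    return out;
  }
  using S = StepTypeFor<T>;
  const S step =
      (static_cast<S>(stop) - static_cast<S>(start)) / static_cast<S>(num - 1);
  for (std::int64_t i = 0; i < num; ++i) {
    out[static_cast<std::size_t>(i)] =
        static_cast<T>(static_cast<S>(start) + step * static_cast<S>(i));
  }
  return out;
}
```

In this sketch, `LinspaceRef<int>(0, 10, 5)` uses a step of 2.5 and yields 0, 2, 5, 7, 10 after the truncating casts, so a fractional step never has to be representable as an integer.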
}

template <typename T, typename StepT>
__global__ void LinspaceKernelInnerForInt(
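Managing linspace through two differently named kernels is not recommended. Consider using C++ template specialization so that both kernels are managed under one unified name.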
done
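One way the suggestion in this thread could look is sketched below in plain host-side C++ rather than CUDA. The names `LinspaceElem` and `LinspaceUnified` are hypothetical, and the difference between the two code paths (arithmetic in `T` versus arithmetic in `double`) is an assumption for illustration. The merged PR may have unified the kernels differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Primary template (hypothetical): floating-point T, arithmetic stays in T.
template <typename T, typename Enable = void>
struct LinspaceElem {
  using StepType = T;
  static T At(T start, StepType step, std::int64_t i) {
    return start + step * static_cast<T>(i);
  }
};

// Partial specialization (hypothetical): integral T, arithmetic in double,
// then a truncating cast back to T.
template <typename T>
struct LinspaceElem<T, std::enable_if_t<std::is_integral_v<T>>> {
  using StepType = double;
  static T At(T start, StepType step, std::int64_t i) {
    return static_cast<T>(static_cast<double>(start) +
                          step * static_cast<double>(i));
  }
};

// Single entry point: callers use one name, the specialization is chosen
// from T at compile time.
template <typename T>
std::vector<T> LinspaceUnified(T start, T stop, std::int64_t num) {
  using Elem = LinspaceElem<T>;
  using StepType = typename Elem::StepType;
  std::vector<T> out;
  if (num <= 0) return out;
  out.resize(static_cast<std::size_t>(num));
  const StepType step =
      num > 1 ? (static_cast<StepType>(stop) - static_cast<StepType>(start)) /
                    static_cast<StepType>(num - 1)
              : StepType(0);
  for (std::int64_t i = 0; i < num; ++i) {
    out[static_cast<std::size_t>(i)] = Elem::At(start, step, i);
  }
  return out;
}
```

The call site stays identical for every dtype, for example `LinspaceUnified<int>(0, 9, 4)` and `LinspaceUnified<double>(0.0, 1.0, 3)`, which is the property the reviewer asked for.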
bool isIntegral =
    (t == DataType::UINT8 || t == DataType::INT8 || t == DataType::UINT16 ||
     t == DataType::INT16 || t == DataType::UINT32 || t == DataType::INT32 ||
     t == DataType::UINT64 || t == DataType::INT64);
Please confirm whether Paddle supports extra-wide integer types such as INT128.
So far I have not found anything INT128-related in the kernels. If it gets added later, I will update this check.
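If wider integer types are ever added, keeping the check in one helper would limit the change to a single place. The sketch below assumes a stand-in `DataType` enum that lists only the integer names from the diff plus two floating-point placeholders. Both the enum and `IsIntegralDType` are hypothetical and are not Paddle's real definitions.

```cpp
// Hypothetical stand-in for the framework's dtype enum.
enum class DataType {
  UINT8, INT8, UINT16, INT16, UINT32, INT32, UINT64, INT64,
  FLOAT32, FLOAT64
};

// Hypothetical helper: a future INT128 entry would only need one more case.
constexpr bool IsIntegralDType(DataType t) {
  switch (t) {
    case DataType::UINT8:
    case DataType::INT8:
    case DataType::UINT16:
    case DataType::INT16:
    case DataType::UINT32:
    case DataType::INT32:
    case DataType::UINT64:
    case DataType::INT64:
      return true;
    default:
      return false;
  }
}

static_assert(IsIntegralDType(DataType::INT64), "INT64 is integral");
static_assert(!IsIntegralDType(DataType::FLOAT32), "FLOAT32 is not integral");
```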
wanghuancoder
left a comment
LGTM
A-nnonymous
left a comment
LGTM, except performance for now
…nsor is floating (PaddlePaddle#75238) * align LinspaceKernel * update meta * update gpu kernel * fix LinspaceKernelInner * improve kernel
* CallScalarFunction uses the dtype of 'self' as the type of 'other' when op type is 'div' (#75237)
* LinspaceKernel uses the dtype of 'self' as the type of 'step' when tensor is floating (#75238)
  * align LinspaceKernel
  * update meta
  * update gpu kernel
  * fix LinspaceKernelInner
  * improve kernel
* fix CudaSigmoidGradFunctor and CudaSiluGradFunctor (#75341)
* Softplus accuracy and torch alignment 1 (#75363)
* [Precision Depth Alignment] paddle.tan reverse calculation: dx = dout *(1 + tan(x)^2) (#75335)
  * Tan reverse calculation: dx = dout *(1 + tan(x)^2)
* [Precision Depth Alignment] Add support for CUDNN to paddle.nn.functional.grid_sample to align with torch accuracy. (#75355)
  * accuracy_stable_grid_sample
  * fix
* correlation supports big tensor (#75383)
  * fix
  * fix test
  * fix
* paddle.tanh Grad and torch alignment (float16) (#75454)
* [Precision Depth Alignment] paddle.sin and paddle.cos aligns with torch precision. (#75503)
  * accuracy_stable_sin
  * accuracy_stable_cos
* [Depth Alignment] Divide (#75379)
  * fix
  * fix
  * fix
  * fix
  * fix
* [Precision Depth Alignment] fix precision for float16 of paddle.tan backward (#75525)
  * fix precision for float16 of paddle.tan backward
  * fix else branch of CudaTanGradFunctor
* [Precision Depth Alignment] fix precision for paddle.expm1 (#75549)
  * accuracy_stable_expm1
  * fix
* Bigtensor investigation and fixes [Paddle/paddle/phi/kernels/funcs] (#75523)
  * fix
  * fix
* [Precision Depth Alignment] fix beta and threshold of paddle.nn.functional.softplus to double (#75426)
  * fix beta and threshold of Softplus to double
  * fix test_softplus_activation_fuse_pass v1
  * fix test_activation_zero
  * fix float of SoftplusDoubleGradKernel to double
  * add op_patches for softplus
  * add yaml for ops/yaml/legacy
  * fix infershape/operator for FLOAT64
  * fix
  * add SoftPlusOpTranscriber
  * fix
  * fix
  * fix1
  * fix2
  * fix coverage
  * fix coverage2
* fix (#75605)
* [Depth Alignment] dot (#75717)
  * fix
  * fix
  * fix dcu
* [Precision Depth Alignment] paddle.log aligns with torch precision (#75799)
  * accuracy_stable_log
  * accuracy_stable_log
  * fix
  * fix
  * fix
  * fix
  * fix5
* [Precision Depth Alignment] fix eps of paddle.logit from float to double (#75816)
  * accuracy_stable_logit
  * add LogitOpTranscriber
  * fix coverage
  * fix 0yaml
* [Precision Depth Alignment] paddle.log_sigmoid (#75898)
  * accuracy_stable_log_sigmoid
  * fix test_activation_stride_op.py
* [Precision Depth Alignment] Modify the negative_slope parameter of the paddle.nn.functional.leaky_relu API to double (#75547)
* [big tensor] Paddle/paddle/phi/kernels/funcs gpuBigtensor (#75856)
  * fix funcs
  * gpu
  * fix
  * fix
  * update PADDLE_ENFORCE messages
  * fix cpu error
  * fix dcu
  * fix dcu
  * fix
* [Fix] log sigmoid complex (#75953)
  * feature: Add specialized LogSigmoidFunctor and CudaLogSigmoidFunctor for complex numbers
    This commit introduces specialized implementations of LogSigmoidFunctor and CudaLogSigmoidFunctor to handle complex number inputs. The new implementations utilize direct formulas for improved accuracy and stability in calculations involving complex types.
  * refactor: Optimize LogSigmoidFunctor and CudaLogSigmoidFunctor for complex types by caching exp(-x) to reduce redundant computations. This change enhances performance while maintaining accuracy in calculations.
  * refactor: modified the formula in LogSigmoidFunctor to make it numerically stable
---------
Co-authored-by: Zhan Rongrui
Co-authored-by: 正在学习
Co-authored-by: Bvicii
PR Category
Operator Mechanism
PR Types
Performance
Description
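Affected API:
paddle.linspace
Aligns all cases (85 in total)
Changes
Additional notes
In the GPU kernel, when T is an integer type, using float instead of double gives better performance and aligns with PyTorch.
Performance test code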
TODO:
Pcard-67164