[XPU] AdamW support multi_precision #61694

houj04 · 2024-02-07T09:27:15Z

PR types

New features

PR changes

OPs

Description

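Update the XHPC dependency to the latest version, because the newly added XDNN function adamw_v2 is required.
Previously, AdamW on XPU did not support multi_precision set to true; it is now supported.
- For KL2: implemented by composing existing operators. When mixed precision is used, the optimizer reads master_param, converts the data type of grad, and writes the result to master_param_outs.
- For KL3: a separate function was written, following paddle/phi/kernels/gpu/adamw_kernel.cu as closely as possible.
Synced the type registrations at the end of the GPU operator implementation to XPU.
Synced the GPU unit tests to XPU, including the basic computation tests and the two mixed-precision classes TestAdamWOpMultiPrecisonWithMainGrad and TestAdamWOpMultiPrecison.
Running the unit tests exposed a type registration problem, so the type registrations of reduce_mean_grad and reduce_mean_grad were modified to add float16.
Also fixed a few minor typos on the Python side.
Also slightly relaxed the comparison threshold of one FA unit test under bfloat16.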
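
The multi_precision flow described above (read master_param, convert grad, write the result to the master output) can be illustrated with a minimal NumPy sketch. This is a generic AdamW step under stated assumptions, not the XPU kernel or the XDNN adamw_v2 API: the function name `adamw_multi_precision_step` and its argument layout are hypothetical, and float16 stands in for bfloat16 because NumPy has no native bfloat16 type.

```python
import numpy as np


def adamw_multi_precision_step(param, grad, master_param, moment1, moment2,
                               beta1_pow, beta2_pow, lr=1e-3, beta1=0.9,
                               beta2=0.999, epsilon=1e-8, coeff=0.01):
    """One AdamW step that keeps the arithmetic in float32 (hypothetical helper).

    param / grad are low precision (float16 here as a stand-in for bfloat16);
    master_param, moment1 and moment2 are float32. beta1_pow and beta2_pow are
    beta1**t and beta2**t for the current step t.
    """
    # Convert the low-precision gradient to float32 before any arithmetic.
    g = grad.astype(np.float32)

    # Decoupled weight decay, applied to the float32 master copy.
    p = master_param * (1.0 - lr * coeff)

    # Update the first and second moment estimates.
    m1 = beta1 * moment1 + (1.0 - beta1) * g
    m2 = beta2 * moment2 + (1.0 - beta2) * g * g

    # Bias-corrected estimates.
    m1_hat = m1 / (1.0 - beta1_pow)
    m2_hat = m2 / (1.0 - beta2_pow)

    # The float32 result plays the role of master_param_outs.
    master_param_out = p - lr * m1_hat / (np.sqrt(m2_hat) + epsilon)

    # The low-precision parameter is only a cast of the master copy.
    param_out = master_param_out.astype(param.dtype)

    return (param_out, master_param_out, m1, m2,
            beta1_pow * beta1, beta2_pow * beta2)


# Usage: a single step on a tiny parameter.
param = np.ones(4, dtype=np.float16)
master = param.astype(np.float32)
grad = np.full(4, 0.5, dtype=np.float16)
zeros = np.zeros(4, dtype=np.float32)
param_out, master_out, m1, m2, b1p, b2p = adamw_multi_precision_step(
    param, grad, master, zeros, zeros, beta1_pow=0.9, beta2_pow=0.999)
```

Keeping the update in float32 and casting only at the end avoids losing small updates to low-precision rounding, which is the usual motivation for a master copy.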

paddle-bot · 2024-02-07T09:27:19Z

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

GuoxiaWang

LGTM for typo fix

lj970926

LGTM

lj970926 · 2024-02-22T10:26:14Z

test/xpu/test_adamw_op_xpu.py

+    def test_main(self):
+        xpu_version = core.get_xpu_device_version(0)
+        if xpu_version != core.XPUVersion.XPU3:
+            return


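Why does this target only KL3? It looks like KL2 is supported as well.

Because the newly written adamw_v2 only has a KL3 implementation in XDNN. The thinking at the time was to treat KL3 as the reference and not maintain KL2, so this was written for KL3 only.
The version implemented by operator composition may also support KL2, but I have not specifically tested it. I will run it shortly and see.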

lj970926 · 2024-02-22T10:27:52Z

test/xpu/test_adamw_op_xpu.py

+        param = paddle.randn(shape).astype(paddle.bfloat16)
+        master_weight = param.astype(paddle.float32)
+        grad = paddle.randn(shape).astype(paddle.bfloat16)
+        main_grad = grad.astype(paddle.bfloat16)


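What is main_grad used for here? Its data and type look identical to grad.

Thanks for pointing this out. I will check shortly; it looks like some changes made for debugging that I forgot to revert.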

lj970926

LGTM

chenwhql

LGTM for PADDLE_ENFORCE

[XPU] AdamW support multi_precision

f0ffef7

houj04 added 5 commits February 21, 2024 11:00

Merge branch 'develop' into 20240207-adamw-amp

cf2a71c

Merge branch 'develop' into 20240207-adamw-amp

4fe7f76

Merge branch 'develop' into 20240207-adamw-amp

fa6dfb5

Merge branch 'develop' into 20240207-adamw-amp

c17835b

[XPU] use xdnn api adamw_v2

6f92801

GuoxiaWang reviewed Feb 22, 2024

Galaxy1458 previously approved these changes Feb 22, 2024

Merge branch 'develop' into 20240207-adamw-amp

a319bc1

HarperCy approved these changes Feb 22, 2024

lj970926 approved these changes Feb 22, 2024

houj04 added 3 commits February 23, 2024 10:28

Merge branch 'develop' into 20240207-adamw-amp

d8cfa12

Merge branch 'develop' into 20240207-adamw-amp

3abf26b

update for KL2

5424a51

houj04 dismissed Galaxy1458’s stale review via 5424a51 February 23, 2024 04:58

lj970926 approved these changes Feb 23, 2024

chenwhql approved these changes Feb 23, 2024

Galaxy1458 approved these changes Feb 23, 2024

QingshuChen approved these changes Feb 23, 2024

QingshuChen merged commit 23fdbd1 into PaddlePaddle:develop Feb 23, 2024

houj04 added the XPU label Sep 4, 2024
