Conversation

@Avin0323 (Contributor) commented Apr 25, 2021

PR types

Others

PR changes

Others

Describe

relu now supports the bfloat16 data type.

Tested with V100 + CUDA 11.0 + cuDNN 8.0:

  • float32 computation: (screenshot)
  • bfloat16 computation, converted to float32: (screenshot)

The relative error is below 0.01.
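The relative-error claim can be reproduced offline with a minimal NumPy sketch (this emulates bfloat16 by truncating float32 to its upper 16 bits; it is not the benchmark script above, which ran the actual CUDA kernel):

```python
import numpy as np

def float32_to_bf16_bits(x):
    # Keep only the upper 16 bits of each float32: that is bfloat16 (truncated).
    return (x.astype(np.float32).view(np.uint32) >> 16).astype(np.uint16)

def bf16_bits_to_float32(bits):
    # Shift the bfloat16 bit pattern back into the high half of a float32.
    return (bits.astype(np.uint32) << 16).view(np.float32)

rng = np.random.default_rng(0)
x = rng.standard_normal(4096).astype(np.float32)

ref = np.maximum(x, 0.0)                                   # float32 relu
out = np.maximum(bf16_bits_to_float32(float32_to_bf16_bits(x)), 0.0)

rel_err = np.abs(out - ref) / (np.abs(ref) + 1e-8)
print("max relative error:", rel_err.max())                # stays below 0.01
```

Truncation to 8 significand bits bounds the relative error by 2^-7 ≈ 0.0078, consistent with the measured 0.01 bound.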

@paddle-bot-old

Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.


/* =========================== relu register ============================ */
#ifdef PADDLE_WITH_HIP
REGISTER_ACTIVATION_GPU_KERNEL(relu, Relu, ReluGPUFunctor, ReluGradGPUFunctor);
@AshburnLee (Contributor) commented Apr 26, 2021:

Before this change, REGISTER_ACTIVATION_GPU_KERNEL was also executed on CUDA. After the change, is it only executed on HIP?

@Avin0323 (Contributor, author) replied Apr 26, 2021:

The logic of the REGISTER_ACTIVATION_GPU_KERNEL macro is: (screenshot)

bfloat16 currently fails to compile on the ROCM platform, so support on ROCM is deferred for now.

    # set delta as np.float16, will automatic convert to float32, float64
    delta = np.array(delta).astype(np.float16)
elif tensor_to_check_dtype == core.VarDesc.VarType.BF16:
    tensor_to_check_dtype = np.float32
Contributor:

According to L127~L132, this is supposed to convert paddle data types to the corresponding numpy data types, so core.VarDesc.VarType.BF16 should map to np.uint16.

Contributor (author):

BF16 is also a floating-point type. To keep the unit test code easier to read, float32 rather than uint16 is used as the type for checking results.

@AshburnLee
(Contributor)

Could you provide the unit test results in the Describe section?

@Avin0323 Avin0323 changed the title [WIP]relu supports bfloat16 data type relu supports bfloat16 data type Apr 26, 2021
@Avin0323
(Contributor, author)

> Could you provide the unit test results in the Describe section?

done

create_test_act_fp16_class(TestHardSwish)


#------------------ Test BF16 ----------------------
Contributor:

Please delete this comment line.

Contributor (author):

done

                    "core is not compiled with CUDA")
class TestActBF16(parent):
    def init_dtype(self):
        self.dtype = np.uint16
Contributor:

If dtype is set to np.uint16, how do we guarantee that a kernel for the int16 type is not being used? Could the string "bfloat16" be used instead? Right now setting dtype=uint16 actually means BF16 everywhere.

Contributor (author):

The test framework currently uses np.uint16 to test the bfloat16 type; it is converted to bfloat16 inside paddle, so setting dtype=np.uint16 does mean BF16 in practice.
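The convention described here (np.uint16 arrays holding raw bfloat16 bit patterns) can be illustrated as follows; the helper names are a sketch, not the framework's actual utilities. Values that are exactly representable in bfloat16 survive the round trip unchanged, while the dtype paddle sees is uint16:

```python
import numpy as np

def float32_to_uint16(x):
    # Store bfloat16 bit patterns in a uint16 array (the op-test convention).
    return (np.asarray(x, dtype=np.float32).view(np.uint32) >> 16).astype(np.uint16)

def uint16_to_float32(bits):
    # Expand the stored bit patterns back into float32 values.
    return (np.asarray(bits, dtype=np.uint16).astype(np.uint32) << 16).view(np.float32)

x = np.array([1.0, -2.5, 0.15625], dtype=np.float32)  # all exact in bfloat16
bits = float32_to_uint16(x)
print(bits.dtype)                 # uint16 on the numpy side, BF16 inside paddle
print(uint16_to_float32(bits))    # decodes back to the original values
```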

ops::ReluGradGradFunctor<plat::float16>>);
ops::ReluGradGradFunctor<plat::float16>>,
ops::ActivationDoubleGradKernel<plat::CUDADeviceContext,
ops::ReluGradGradFunctor<plat::bfloat16>>);
Contributor:

Has double grad been verified? If not, don't register it yet.

Contributor (author):

It is verified in the unit tests.

Contributor:

Please consider simplifying the registration code in a follow-up PR.

ops::ActivationGradGPUKernel<plat::CUDADeviceContext,
ops::ReluGradGPUFunctor<plat::float16>>,
ops::ActivationGradGPUKernel<plat::CUDADeviceContext,
ops::ReluGradGPUFunctor<plat::bfloat16>>);
Contributor:

relu now calls two CUDA kernels, ActivationKernelVec and ActivationGradKernelVec. The core computation is in the functor:

__device__ __forceinline__ typename CudaVecType<T>::type Compute(
    const typename CudaVecType<T>::type in) {
  // relu forward : out = max(x, 0)
  return in > zero_ ? in : zero_;
}

CudaVecType<bfloat16> is not specialized, so CudaVecType<bfloat16>::type is still plain bfloat16. For bfloat16's overload of operator>, the comparison first casts both operands to float, right? Please confirm:

HOSTDEVICE inline bool operator>(const bfloat16& a, const bfloat16& b) {
  return static_cast<float>(a) > static_cast<float>(b);
}

float16 does overload these operators.
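The cast-to-float detail matters: bfloat16 bit patterns cannot be compared as raw unsigned integers, because the sign bit makes every negative number's pattern larger than every positive one's. A small sketch, reusing the truncate-to-upper-16-bits encoding as a stand-in for paddle's bfloat16 type:

```python
import numpy as np

def to_bits(v):
    # bfloat16 bit pattern of a float32 scalar, as a Python int.
    return int(np.float32(v).view(np.uint32) >> 16)

def to_float(bits):
    # Decode a bfloat16 bit pattern back to a Python float.
    return float(np.uint32(bits << 16).view(np.float32))

a, b = to_bits(-1.0), to_bits(0.0)
print(a > b)                      # True for the raw patterns...
print(to_float(a) > to_float(b))  # ...but False once decoded, as the overload does
```

This is exactly why the operator> overload routes the comparison through float rather than comparing the stored representations.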

Contributor (author):

After merging the latest code, this part has changed and the specialized form is no longer used.

ops::MeanCUDAGradKernel<paddle::platform::CUDADeviceContext, plat::float16>,
ops::MeanCUDAGradKernel<paddle::platform::CUDADeviceContext,
plat::float16>);
plat::bfloat16>);
Contributor:

Why do fill_constant and mean need bfloat16 kernels registered? Is it because the unit test framework uses them? Apart from the op under test, could every other op in the test framework run in fp32? fill_constant is harmless, but mean may introduce a fairly large error.

Contributor (author):

The bfloat16 registrations for fill_constant and mean have been removed; a conversion-to-fp32 step was added instead.


paddle-bot-old bot commented May 4, 2021

Sorry to inform you that 7a26ab6's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.

@Xreki (Contributor) left a comment:

I verified the test programs constructed by the op unit test framework:

  • the program for the double-grad test: (screenshot)

  • the program for the bfloat16 backward test, in which only relu and relu_grad compute in bfloat16: (screenshot)

ops::ReluGradGradFunctor<plat::float16>>);
ops::ReluGradGradFunctor<plat::float16>>,
ops::ActivationDoubleGradKernel<plat::CUDADeviceContext,
ops::ReluGradGradFunctor<plat::bfloat16>>);
Contributor:

Please consider simplifying the registration code in a follow-up PR.

ops::CastOpKernel<paddle::platform::CUDADeviceContext,
paddle::platform::complex64>,
ops::CastOpKernel<paddle::platform::CUDADeviceContext,
paddle::platform::complex128>);
Contributor:

Consider simplifying the registration code in a follow-up PR.

Contributor (author):

OK.

numpy_tensor = np.array(tensor).astype(np.uint16)
numpy_tensor = numpy_tensor.flatten()
return struct.unpack('<f', struct.pack('<I', numpy_tensor[i] << 16))[0]
Contributor:

Is this converting uint16 to bf16? It could be wrapped into a fairly general function in a follow-up.
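A vectorized form of the suggested general helper might look like this (the name matches the convert_uint16_to_float used later in this PR; the implementation here is an illustrative sketch that avoids the per-element struct pack/unpack):

```python
import numpy as np

def convert_uint16_to_float(in_list):
    # Reinterpret uint16-stored bfloat16 bit patterns as float32 values
    # by shifting each pattern into the high half of a uint32.
    arr = np.asarray(in_list, dtype=np.uint16)
    return (arr.astype(np.uint32) << 16).view(np.float32)

bits = np.array([0x3F80, 0xC000, 0x0000], dtype=np.uint16)
print(convert_uint16_to_float(bits))   # [ 1. -2.  0.]
```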

Contributor (author):

OK.

                             no_grad_set)
fp32_grads = []
for grad in dygraph_grad:
    if grad.dtype == np.uint16:
Contributor:

Semantically this is bfloat16, so it would be better for the check to literally be on bfloat16 as well, since some ops may genuinely support uint16 computation. Or at least add some comments.

Same below.

Contributor (author):

OK.

for grad in dygraph_grad:
    if grad.dtype == np.uint16:
        grad = convert_uint16_to_float(grad)
        max_relative_error = 0.03
Contributor:

Was it set this large because a smaller value did not pass?

Contributor (author):

For now this is kept consistent with the bf16 tolerance settings used on CPU.

@luotao1 (Contributor) left a comment:

LGTM for skip unittest

@Xreki Xreki merged commit bcd40f2 into PaddlePaddle:develop May 18, 2021