Init mixed precision training interface#16856
Conversation
409a90b to
4636605
Compare
test=develop
|
|
||
|
|
||
| class OptimizerWithMixedPrecison(object): | ||
| def __init__(self, optimizer, init_loss_scaling, use_dynamic_loss_scaling): |
There was a problem hiding this comment.
Need comments for input arguments. Same as below.
python/paddle/fluid/initializer.py
Outdated
| out_dtype = VarDesc.VarType.FP32 | ||
| out_var = block.create_var( | ||
| name=unique_name.generate(".".join( | ||
| ['truncated_gaussian_random', 'tmp'])), |
There was a problem hiding this comment.
Maybe use var.name as part of prefix for unique_name.generate a new name.
| inputs={"X": out_var}, | ||
| outputs={"Out": var}, | ||
| attrs={"in_dtype": out_var.dtype, | ||
| "out_dtype": var.dtype}) |
There was a problem hiding this comment.
Whether the other initializers need to do this FP32 weight creating and casting?
There was a problem hiding this comment.
Yes. They all need. And all initializers have been made to support float16.
| }) | ||
|
|
||
|
|
||
| def copy_to_master_param(p, block): |
There was a problem hiding this comment.
Need comments for all these APIs.
| master_params_grads = [] | ||
| tmp_role = main_prog._current_role | ||
| OpRole = core.op_proto_and_checker_maker.OpRole | ||
| main_prog._current_role = OpRole.Backward |
There was a problem hiding this comment.
You can use _backward_role_guard here.
| new_p = framework.Parameter( | ||
| block=block, | ||
| shape=v.shape, | ||
| dtype=core.VarDesc.VarType.FP32, |
There was a problem hiding this comment.
Why the dtype must be FP32 here? Is it possible that the dtype is fp64?
There was a problem hiding this comment.
In principle, the master parameter can be float64. But there are some hard-coded implementations, and the fp64 support seems not to be that straightforward. So we are going to support only float32 temporarily because it is more commonly used. Maybe we can go back and consider fp64 someday in the future.
| startup_master_param = startup_prog.global_block()._clone_variable( | ||
| master_param) | ||
| startup_p = startup_prog.global_block().var(p.name) | ||
| cast_fp16_to_fp32(startup_p, startup_master_param, startup_prog) |
There was a problem hiding this comment.
You should check that p.type is not fp32.
There was a problem hiding this comment.
Unified the two cast functions.
| startup_master_param = startup_prog.global_block()._clone_variable( | ||
| master_param) | ||
| startup_p = startup_prog.global_block().var(p.name) | ||
| cast_fp16_to_fp32(startup_p, startup_master_param, startup_prog) |
There was a problem hiding this comment.
You should check that p.type is not fp32 here.
15b00d6 to
9456cef
Compare
test=develop
5f2faba to
c2fa295
Compare
test=develop
60bdfb7 to
b2d80ea
Compare
| # fp16 -> fp32 | ||
| append_cast_op(startup_p, startup_master_param, startup_prog) | ||
| # cast fp16 gradients to fp32 before apply gradients | ||
| if g.name.find("batch_norm") > -1: |
There was a problem hiding this comment.
Can we add attribute to these operator desc instead of hard code?
There was a problem hiding this comment.
As a next step, we can optimize away such hard-coded logic.
paddle/fluid/API.spec
Outdated
| paddle.fluid.contrib.multi_download (ArgSpec(args=['client', 'hdfs_path', 'local_path', 'trainer_id', 'trainers', 'multi_processes'], varargs=None, keywords=None, defaults=(5,)), ('document', '100927be598ed8f9eaa1f3ef1b23568a')) | ||
| paddle.fluid.contrib.multi_upload (ArgSpec(args=['client', 'hdfs_path', 'local_path', 'multi_processes', 'overwrite', 'sync'], varargs=None, keywords=None, defaults=(5, False, True)), ('document', '183f34c83d30dbe16e09e8716c41958a')) | ||
| paddle.fluid.contrib.extend_with_decoupled_weight_decay (ArgSpec(args=['base_optimizer'], varargs=None, keywords=None, defaults=None), ('document', 'a1095dfd4ec725747f662d69cd7659d4')) | ||
| paddle.fluid.contrib.decorate (ArgSpec(args=['optimizer', 'init_loss_scaling', 'use_dynamic_loss_scaling'], varargs=None, keywords=None, defaults=(1.0, False)), ('document', '089f0c8d7c03bd3d0edc3ac83dbe41fd')) |
There was a problem hiding this comment.
The name(decorate) is not explicit.
|
|
||
| class OptimizerWithMixedPrecison(object): | ||
| """ | ||
| Optimizer class with mixed-precision training. |
There was a problem hiding this comment.
You should introduce the implementation of OptimizerWithMixedPrecison in detail and give an example here.
| @@ -0,0 +1,301 @@ | |||
| # Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved. | |||
There was a problem hiding this comment.
This file is duplicated?
Can add arguments to the exists unittest instead of a new one?
There was a problem hiding this comment.
Because this interface is in contrib, we'd better use a separate unit test file.
| out_dtype = VarDesc.VarType.FP32 | ||
| out_var = block.create_var( | ||
| name=unique_name.generate(".".join( | ||
| ['constant_init', var.name, 'tmp'])), |
There was a problem hiding this comment.
Where are these vars used?
There was a problem hiding this comment.
In the startup program
1598ea7 to
e3b4499
Compare
| and maintenance of master parameters, scaling of loss, etc. | ||
|
|
||
| Args: | ||
| optimizer (Optimizer): A common Optimizer object. |
|
|
||
| # Ensure the data type of learning rate vars is float32 (same as the | ||
| # master parameter dtype) | ||
| if isinstance(optimizer._learning_rate, float): |
There was a problem hiding this comment.
What will happen when the optimizer._learning_rate is not float?
| """ | ||
|
|
||
| def __init__(self, optimizer, init_loss_scaling, use_dynamic_loss_scaling): | ||
| self._optimizer = optimizer |
There was a problem hiding this comment.
Please check the type of optimizer.
| # fp16 -> fp32 | ||
| append_cast_op(startup_p, startup_master_param, startup_prog) | ||
| # cast fp16 gradients to fp32 before apply gradients | ||
| if g.name.find("batch_norm") > -1: |
There was a problem hiding this comment.
Agreed with @gongweibao, maybe you should get the op that generates g.
| return scaled_loss, optimize_ops, master_params_grads | ||
|
|
||
|
|
||
| def decorate(optimizer, init_loss_scaling=1.0, use_dynamic_loss_scaling=False): |
There was a problem hiding this comment.
How about extend_optimizer_with_mixed_precison?
| Args: | ||
| optimizer(Optimizer): A common Optimizer. | ||
| init_loss_scaling(float): The initial loss scaling factor. | ||
| use_dynamic_loss_scaling(bool): Whether to use dynamic loss scaling. |
There was a problem hiding this comment.
Please give the default value.
|
|
||
| Returns: | ||
| A list of (param, grad), which is a tuple of a parameter and its | ||
| gradient respectively, and the scaled loss. |
There was a problem hiding this comment.
Please explain the master_params?
chengduoZH
left a comment
There was a problem hiding this comment.
The code should be polished.
chengduoZH
left a comment
There was a problem hiding this comment.
The code should be polished.
shanyi15
left a comment
There was a problem hiding this comment.
approve for under contrib menu
# The first commit's message is: remove ut test_dist_word2vec in mac ci, will fix it in private, test=develop (PaddlePaddle#17066) # This is the 2nd commit message: Fleet unify distributed training (PaddlePaddle#16791) * implement distributed transpiler with fleet # This is the 3rd commit message: ParallelDyGraph with GPU collective mode (PaddlePaddle#16827) implement dygraph.parallel.DataParallel to hook reduce op. # This is the 4th commit message: Init mixed precision training interface (PaddlePaddle#16856) * Init mixed precision training interface * Add fp16 test script test=develop * All initializers support float16 test=develop * Code cleanup & add more code annotations test=develop * Update API spec test=develop * Add usage example in doc test=develop # This is the 5th commit message: fix reference_count_pass,test=develop (PaddlePaddle#17060) test=develop # This is the 6th commit message: Speedup roi_perspective_transform op by caching the information of linear interpolation in forward (PaddlePaddle#17090) * Cache the information of linear interpolation in forward and use it in backward. test=develop * Fix cuda kernel. test=develop # This is the 7th commit message: remove unnecessary prepare_data (PaddlePaddle#17080) test=develop # This is the 8th commit message: fix interpolate cu. 
test=develop (PaddlePaddle#17101) # This is the 9th commit message: test=develop, double backward leaky_relu (PaddlePaddle#17067) backward of backward: leaky_relu # This is the 10th commit message: fix fuse optimizer ops (PaddlePaddle#17102) test=develop # This is the 11th commit message: truncated_gaussian_random supported in distributed training, test=develop (PaddlePaddle#17091) # This is the 12th commit message: Detailed coordinate description for yolov3 loss (PaddlePaddle#17007) * Detailed coordinate description for yolov3 loss test=develop * modified api.spec test=develop * modified loss name * fix api.spec test=develop * polish description test=develop * modified api.spec test=develop # This is the 13th commit message: fix test_weight_decay (PaddlePaddle#17109) test=develop # This is the 14th commit message: Path flag (PaddlePaddle#17105) * fix python/paddle/fluid/__init__.py detecting problems
* refine_dropout_mem,test=develop * # This is a combination of 14 commits. # The first commit's message is: remove ut test_dist_word2vec in mac ci, will fix it in private, test=develop (#17066) # This is the 2nd commit message: Fleet unify distributed training (#16791) * implement distributed transpiler with fleet # This is the 3rd commit message: ParallelDyGraph with GPU collective mode (#16827) implement dygraph.parallel.DataParallel to hook reduce op. # This is the 4th commit message: Init mixed precision training interface (#16856) * Init mixed precision training interface * Add fp16 test script test=develop * All initializers support float16 test=develop * Code cleanup & add more code annotations test=develop * Update API spec test=develop * Add usage example in doc test=develop # This is the 5th commit message: fix reference_count_pass,test=develop (#17060) test=develop # This is the 6th commit message: Speedup roi_perspective_transform op by caching the information of linear interpolation in forward (#17090) * Cache the information of linear interpolation in forward and use it in backward. test=develop * Fix cuda kernel. test=develop # This is the 7th commit message: remove unnecessary prepare_data (#17080) test=develop # This is the 8th commit message: fix interpolate cu. 
test=develop (#17101) # This is the 9th commit message: test=develop, double backward leaky_relu (#17067) backward of backward: leaky_relu # This is the 10th commit message: fix fuse optimizer ops (#17102) test=develop # This is the 11th commit message: truncated_gaussian_random supported in distributed training, test=develop (#17091) # This is the 12th commit message: Detailed coordinate description for yolov3 loss (#17007) * Detailed coordinate description for yolov3 loss test=develop * modified api.spec test=develop * modified loss name * fix api.spec test=develop * polish description test=develop * modified api.spec test=develop # This is the 13th commit message: fix test_weight_decay (#17109) test=develop # This is the 14th commit message: Path flag (#17105) * fix python/paddle/fluid/__init__.py detecting problems
Simple Usage:
See test_image_classification_fp16.py for details.