[cherry pick]split minimize and add unscale_ for GradScaler #35927

zhangbo9674 · 2021-09-22T11:48:39Z

PR types

New features

PR changes

APIs

Describe

1. Split `GradScaler::minimize()` into `GradScaler::step()` + `GradScaler::update()`:

GradScaler::minimize():

    scaler = paddle.amp.GradScaler(init_loss_scaling=1024)

    with paddle.amp.auto_cast():
        output = model(data)
        loss = mse(output, label)

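    scaled = scaler.scale(loss)          # scale the loss
    scaled.backward()                    # backward pass on the scaled loss
    scaler.minimize(optimizer, scaled)   # update parameters and the loss scaling in one call
    optimizer.clear_grad()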

GradScaler::step() + GradScaler::update():

    scaler = paddle.amp.GradScaler(init_loss_scaling=1024)

    with paddle.amp.auto_cast():
        output = model(data)
        loss = mse(output, label)

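    scaled = scaler.scale(loss)   # scale the loss
    scaled.backward()             # backward pass on the scaled loss
    scaler.step(optimizer)        # update parameters
    scaler.update()               # update the loss scaling ratio
    optimizer.clear_grad()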

`minimize()` and `step()` + `update()` are two ways to apply the parameter gradient update under AMP. With Paddle 2.0 optimizers, we recommend `step()` + `update()`.
If the optimizer is a Paddle 1.0 optimizer, only `minimize()` can be used.
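
The relationship between the two call patterns can be illustrated with a toy model in plain Python. This is a sketch, not Paddle's implementation: `ToyOptimizer`, `ToyScaler` and `run` are hypothetical stand-ins, the optimizer is plain SGD on a list of floats, and `update()` only counts calls instead of adjusting the loss scale. It shows only that `minimize()` does the work of `step()` followed by `update()`.

```python
class ToyOptimizer:
    """Hypothetical stand-in for an optimizer: plain SGD on a list of floats."""

    def __init__(self, params, lr=0.1):
        self.params = list(params)
        self.grads = [0.0] * len(self.params)
        self.lr = lr

    def step(self):
        self.params = [p - self.lr * g for p, g in zip(self.params, self.grads)]


class ToyScaler:
    """Hypothetical stand-in for a loss scaler with a fixed scale."""

    def __init__(self, init_loss_scaling=1024.0):
        self.scale_value = init_loss_scaling
        self.updates = 0

    def step(self, optimizer):
        # Undo the loss scaling on the gradients, then apply the optimizer.
        optimizer.grads = [g / self.scale_value for g in optimizer.grads]
        optimizer.step()

    def update(self):
        # A real scaler would adjust its loss scale here; the toy only counts.
        self.updates += 1

    def minimize(self, optimizer, scaled_loss=None):
        # Single-call pattern: the same work as step() followed by update().
        # scaled_loss is accepted only to mirror the call shape; it is unused.
        self.step(optimizer)
        self.update()


def run(use_minimize):
    opt = ToyOptimizer([1.0, 2.0])
    scaler = ToyScaler(init_loss_scaling=1024.0)
    opt.grads = [1024.0, 2048.0]  # pretend these came from a scaled backward pass
    if use_minimize:
        scaler.minimize(opt)
    else:
        scaler.step(opt)
        scaler.update()
    return opt.params, scaler.updates
```

In this toy, `run(True)` and `run(False)` return the same parameters and the same update count, which is the equivalence the split relies on.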

2. Add `GradScaler::unscale_(optimizer)`:

    scaler = paddle.amp.GradScaler(init_loss_scaling=1024)

    with paddle.amp.auto_cast():
        output = model(data)
        loss = mse(output, label)

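    scaled = scaler.scale(loss)   # scale the loss
    scaled.backward()             # backward pass on the scaled loss
    scaler.unscale_(optimizer)    # unscale the gradients (called on the scaler, not on the loss)
    scaler.step(optimizer)        # update parameters
    scaler.update()               # update the loss scaling ratio
    optimizer.clear_grad()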

This API unscales the parameter gradients, that is, it multiplies them by 1/(loss scaling ratio).
If `unscale_()` has not been called, `minimize()` or `step()` calls it internally; if it has already been called, they do not unscale again.
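
The once-only unscaling described above can be sketched in plain Python. This is a toy model under stated assumptions, not Paddle's code: `ToyScaler`, its `_unscaled` flag, the reset of that flag in `update()`, and the bare gradient list are all hypothetical and chosen for illustration.

```python
class ToyScaler:
    """Hypothetical scaler that tracks whether gradients were already unscaled."""

    def __init__(self, init_loss_scaling=1024.0):
        self.scale_value = init_loss_scaling
        self._unscaled = False  # hypothetical per-iteration flag

    def unscale_(self, grads):
        # Multiply each gradient by 1 / (loss scaling ratio), in place.
        inv = 1.0 / self.scale_value
        for i, g in enumerate(grads):
            grads[i] = g * inv
        self._unscaled = True

    def step(self, grads):
        # Unscale only if the caller has not already done so this iteration.
        if not self._unscaled:
            self.unscale_(grads)
        # ... the optimizer update would run here on the unscaled gradients ...

    def update(self):
        # Assumption: the flag is cleared here, ready for the next iteration.
        self._unscaled = False


def demo(call_unscale_first):
    grads = [2048.0, -512.0]  # pretend these are gradients of a loss scaled by 1024
    scaler = ToyScaler(1024.0)
    if call_unscale_first:
        scaler.unscale_(grads)
    scaler.step(grads)
    scaler.update()
    return grads
```

Both `demo(True)` and `demo(False)` leave the gradients divided by 1024 exactly once. Calling `unscale_` explicitly is typically useful when the gradients need to be inspected or modified at their true magnitude before the optimizer step.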

3. Docs review:

GradScaler:

step+update:

unscale_:

Chinese documentation PR link:
PaddlePaddle/docs#3897

* split minimize() to step() + update()
* add unscale and step for grad_scaler
* add unittest
* refine code in minimize
* delete step in loss_scaler
* fix example bug
* refine comment
* refine unittest
* add unittest

paddle-bot-old · 2021-09-22T11:48:50Z

Thanks for your contribution!
Please wait for the CI results first. See Paddle CI Manual for details.

TCChenlong

LGTM

zhiqiu

LGTM

[cherry pick]split minimize and add unscale_ for GradScaler (PaddlePaddle#35927)

TCChenlong previously approved these changes Sep 23, 2021

lanxianghit previously approved these changes Sep 23, 2021

add update for hybrid parallel

d1906e4

zhangbo9674 dismissed stale reviews from lanxianghit and TCChenlong via d1906e4 September 23, 2021 10:41

TCChenlong approved these changes Sep 26, 2021

zhiqiu approved these changes Sep 26, 2021

lanxianghit approved these changes Sep 26, 2021

lanxianghit merged commit e262125 into PaddlePaddle:release/2.2 Sep 26, 2021

YuanRisheng added a commit to YuanRisheng/Paddle that referenced this pull request Sep 26, 2021

Merge pull request #1 from PaddlePaddle/release/2.2

f58e33c

[cherry pick]split minimize and add unscale_ for GradScaler (PaddlePaddle#35927)

zhangbo9674 deleted the cp/split_minimize_add_unscale branch March 2, 2023 02:56
