
remove autocast() block; eval in fp16 #167

Merged
taoleicn merged 1 commit into 3.0.0-dev from 3.0.0-dev-tao on Mar 3, 2021

Conversation

@taoleicn (Contributor) commented on Mar 3, 2021

This PR removes the torch.amp.autocast(enabled=False) block in sru.ops.elementwise_recurrence_gpu. This simplifies the code and makes SRU compatible with both PyTorch native AMP and Apex.
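To illustrate what the removed block did: wrapping an op in a nested autocast(enabled=False) region forces it back to fp32 even when the caller runs under AMP. A minimal CPU sketch of the pattern, using bfloat16 autocast to stand in for CUDA fp16 (the tensor names here are illustrative, not from the SRU code):

```python
import torch

x = torch.randn(4, 8)
w = torch.randn(8, 8)

with torch.autocast("cpu", dtype=torch.bfloat16):
    # Old pattern (what the removed block did): opt this op out of AMP,
    # forcing the computation back to fp32.
    with torch.autocast("cpu", enabled=False):
        y_fp32 = x @ w   # runs in fp32

    # New behavior: the op simply inherits the ambient autocast dtype.
    y_amp = x @ w        # runs in bfloat16 under autocast
```

Removing the nested block means the recurrence kernel sees whatever dtype the surrounding AMP context chooses, which is what makes it work uniformly under both native AMP and Apex.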

The amp_recurrence_fp16 option controls whether the elementwise recurrence kernel runs in fp16. After some testing, we believe it can default to True, which gives an additional speed-up and reduces the memory footprint.
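A hypothetical sketch of how such a flag can gate the kernel's dtype (this is not the real SRU implementation; the function and its logic are illustrative only):

```python
import torch

def elementwise_recurrence(u, amp_recurrence_fp16=True):
    """Hypothetical sketch: gate the recurrence input dtype on a flag."""
    # When the flag is on and the tensor lives on a CUDA device, cast the
    # recurrence inputs to fp16; otherwise keep the caller's dtype.
    if amp_recurrence_fp16 and u.is_cuda:
        u = u.to(torch.float16)
    # ... the elementwise recurrence computation would run here ...
    return u

u = torch.randn(4, 8)            # CPU tensor, so the fp16 cast is skipped
out = elementwise_recurrence(u)  # dtype stays float32 on CPU
```

Defaulting the flag to True lets the kernel take the cheaper fp16 path whenever the hardware supports it, without callers having to opt in.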

In addition, we made minor changes to the SRUpp experimental code to run dev evaluation in fp16.

@taoleicn taoleicn requested a review from hpasapp March 3, 2021 00:23
@taoleicn (Contributor, Author) commented on Mar 3, 2021

Added torch.clear_autocast_cache to avoid OOM-related bugs:
huggingface/transformers#8403
pytorch/pytorch#48049
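For context, autocast keeps a cache of half-precision copies of weights for reuse within a region; the linked issues describe this cache growing until it causes OOM. A minimal sketch of the fix, again using CPU bfloat16 autocast in place of CUDA fp16 (the toy model here is an assumption, not SRUpp code):

```python
import torch

model = torch.nn.Linear(16, 16)

# Evaluate under autocast; autocast may cache downcast copies of parameters.
with torch.no_grad(), torch.autocast("cpu", dtype=torch.bfloat16):
    out = model(torch.randn(2, 16))   # output is bfloat16 under autocast

# Clear the cached half-precision weight copies so they cannot accumulate
# across repeated evaluation runs (the OOM scenario in the linked issues).
torch.clear_autocast_cache()
```

Calling torch.clear_autocast_cache() outside the autocast region is safe and simply frees any cached casts.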
