
Add conv requantize squash #18754

Merged
luotao1 merged 3 commits into PaddlePaddle:develop from wozna:squash_wozna
Aug 13, 2019

Conversation

@wozna

@wozna wozna commented Jul 23, 2019

This squash improves the accuracy of inference on the GoogLeNet model on ImageNet data.

FP32: avg top1 accuracy: 0.7050

Using INT8 and MKL-DNN, accuracy increases from
INT8: avg top1 accuracy: 0.7017
to
INT8: avg top1 accuracy: 0.7022

This is a reopen of #18676.

test=develop

test=develop
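The idea behind the squash can be illustrated with a small numpy sketch. This is not PaddlePaddle's actual pass (which rewrites the inference graph in C++); the function names and scale values below are made up for illustration. A requantize op that rescales the INT8 output of a conv from scale s1 to s2 can be folded into the conv by having the conv quantize directly with s2, which also removes one rounding step:

```python
import numpy as np

def quantize(x, scale):
    # quantize FP32 values to INT8 with a given scale
    return np.clip(np.round(x * scale), -128, 127).astype(np.int8)

def requantize(x_int8, scale_in, scale_out):
    # rescale INT8 data from one quantization scale to another
    return np.clip(np.round(x_int8.astype(np.float32) * scale_out / scale_in),
                   -128, 127).astype(np.int8)

rng = np.random.default_rng(0)
conv_out_fp32 = rng.uniform(-1, 1, size=8).astype(np.float32)

s1, s2 = 100.0, 50.0
# unfused: conv emits INT8 at scale s1, a separate requantize op rescales to s2
unfused = requantize(quantize(conv_out_fp32, s1), s1, s2)
# squashed: conv emits INT8 directly at scale s2; no requantize op remains
squashed = quantize(conv_out_fp32, s2)
```

The squashed path rounds once instead of twice, which is consistent with the small top1 accuracy improvement reported in the description.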
@bingyanghuang
Contributor

@wozna Please add test=develop to your latest commit to trigger the CI.

test=develop
@wozna wozna force-pushed the squash_wozna branch 2 times, most recently from fc53c15 to c486c21 on July 29, 2019 07:16
@bingyanghuang
Contributor

@wojtuss Please help review this PR.

wojtuss previously approved these changes Jul 30, 2019

@wojtuss wojtuss left a comment


LGTM

@wozna wozna mentioned this pull request Jul 30, 2019
@luotao1
Contributor

luotao1 commented Jul 30, 2019

I will double-check it ASAP.

Contributor

@Sand3r- Sand3r- left a comment


@wozna Thank you for the thorough test coverage and for the changes themselves. Please consider the suggestions provided below.

@luotao1
Contributor

luotao1 commented Jul 30, 2019

before:

I0730 18:03:55.057801 45052 tester_helper.h:462] --- Performance summary ---
I0730 18:03:55.057843 45052 tester_helper.h:463] FP32: avg fps: 33.5400, avg latency: 29.8152 ms
I0730 18:03:55.057854 45052 tester_helper.h:466] INT8: avg fps: 68.2523, avg latency: 14.6515 ms
I0730 18:03:55.067595 45052 tester_helper.h:447] --- Accuracy summary ---
I0730 18:03:55.067620 45052 tester_helper.h:448] Accepted top1 accuracy drop threshold: 0.01. (condition: (FP32_top1_acc - INT8_top1_acc) <= threshold)
I0730 18:03:55.067639 45052 tester_helper.h:451] FP32: avg top1 accuracy: 0.7050
I0730 18:03:55.067646 45052 tester_helper.h:453] INT8: avg top1 accuracy: 0.7008

after:

I0730 20:00:23.235113 294204 tester_helper.h:462] --- Performance summary ---
I0730 20:00:23.235152 294204 tester_helper.h:463] FP32: avg fps: 34.1651, avg latency: 29.2696 ms
I0730 20:00:23.235173 294204 tester_helper.h:466] INT8: avg fps: 82.6219, avg latency: 12.1033 ms
I0730 20:00:23.244776 294204 tester_helper.h:447] --- Accuracy summary ---
I0730 20:00:23.244801 294204 tester_helper.h:448] Accepted top1 accuracy drop threshold: 0.01. (condition: (FP32_top1_acc - INT8_top1_acc) <= threshold)
I0730 20:00:23.244819 294204 tester_helper.h:451] FP32: avg top1 accuracy: 0.7050
I0730 20:00:23.244824 294204 tester_helper.h:453] INT8: avg top1 accuracy: 0.7003

from 70.08 -> 70.03; machine is Intel(R) Xeon(R) Gold 6271 CPU @ 2.60GHz
develop commit: cfcb96d
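The acceptance condition printed by tester_helper.h can be checked by hand against the numbers above (a standalone sketch; the variable names are mine, not the harness's):

```python
# acceptance condition from the log:
# (FP32_top1_acc - INT8_top1_acc) <= threshold
fp32_top1 = 0.7050   # FP32 avg top1 accuracy from the "after" run
int8_top1 = 0.7003   # INT8 avg top1 accuracy from the "after" run
threshold = 0.01     # accepted top1 accuracy drop threshold

accepted = (fp32_top1 - int8_top1) <= threshold
print(accepted)  # True: the 0.0047 drop stays within the 0.01 threshold
```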

@bingyanghuang
Contributor

Based on LuoTao's benchmark, performance increases from 68.2523 to 82.6219 fps, and accuracy only drops from 0.7008 to 0.7003. I think this PR is good to merge. What do you think, @wojtuss?

@wozna
Author

wozna commented Jul 31, 2019

I checked it on Intel(R) Xeon(R) Gold 6248 CPU @ 2.50GHz
and I got the same results.

before

I0731 11:36:49.323259 42280 tester_helper.h:462] --- Performance summary ---
I0731 11:36:49.323282 42280 tester_helper.h:463] FP32: avg fps: 582.5014, avg latency: 1.7167 ms
I0731 11:36:49.323288 42280 tester_helper.h:466] INT8: avg fps: 1010.2657, avg latency: 0.9898 ms
I0731 11:36:49.323566 42280 tester_helper.h:447] --- Accuracy summary ---
I0731 11:36:49.323572 42280 tester_helper.h:448] Accepted top1 accuracy drop threshold: 0.01 (condition: (FP32_top1_acc - INT8_top1_acc) <= threshold)
I0731 11:36:49.323578 42280 tester_helper.h:451] FP32: avg top1 accuracy: 0.7050
I0731 11:36:49.323582 42280 tester_helper.h:453] INT8: avg top1 accuracy: 0.7017

after

I0731 11:27:40.606056 39099 tester_helper.h:462] --- Performance summary ---
I0731 11:27:40.606074 39099 tester_helper.h:463] FP32: avg fps: 600.3001, avg latency: 1.6658 ms
I0731 11:27:40.606081 39099 tester_helper.h:466] INT8: avg fps: 1146.9660, avg latency: 0.8719 ms
I0731 11:27:40.606254 39099 tester_helper.h:447] --- Accuracy summary ---
I0731 11:27:40.606259 39099 tester_helper.h:448] Accepted top1 accuracy drop threshold: 0.01. (condition: (FP32_top1_acc - INT8_top1_acc) <= threshold)
I0731 11:27:40.606263 39099 tester_helper.h:451] FP32: avg top1 accuracy: 0.7050
I0731 11:27:40.606267 39099 tester_helper.h:453] INT8: avg top1 accuracy: 0.7022

@luotao1
Contributor

luotao1 commented Jul 31, 2019

Could you provide:

  • commit_id
  • cmake command

Besides, I run with OMP_NUM_THREADS=1.

@wozna
Author

wozna commented Jul 31, 2019

  • cmake command
    /Paddle/build/paddle/fluid/inference/tests/api/test_analyzer_int8_image_classification "ARGS" "--infer_model=/Paddle/build/third_party/inference_demo/int8v2/googlenet/model" "--infer_data=/data/PaddlePaddle/1G/imagenet/val.bin" "--warmup_batch_size=100" "--batch_size=50" "--paddle_num_threads=28" "--iterations=1000"

Besides, I run with OMP_NUM_THREADS=1.
I will try with this config.
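For anyone reproducing the single-thread setup luotao describes, the run can presumably be launched as below. This is a command fragment, not verified output: the binary path and flags are taken from the commands quoted in this thread, and the data path (written here with $HOME) is build- and machine-specific.

```shell
# single-threaded INT8 GoogLeNet accuracy test, as in luotao1's runs;
# binary and flags follow the commands quoted earlier in this thread
OMP_NUM_THREADS=1 ./paddle/fluid/inference/tests/api/test_analyzer_int8_image_classification \
    --infer_model=third_party/inference_demo/int8v2/googlenet/model \
    --infer_data=$HOME/.cache/paddle/dataset/int8/download/int8_full_val.bin \
    --batch_size=1 \
    --paddle_num_threads=1
```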

@bingyanghuang
Contributor

I did the test on our CLX 6248 with commit 233746d; the command line is:

./paddle/fluid/inference/tests/api/test_analyzer_int8_image_classification --infer_model=third_party/inference_demo/int8v2/googlenet/model --infer_data=/~/.cache/paddle/dataset/int8/download/int8_full_val.bin --batch_size=1 --paddle_num_threads=1

Got the following results.

  • Before
I0801 08:46:15.091809 262332 tester_helper.h:462] --- Performance summary ---
I0801 06:41:59.026803 249389 tester_helper.h:463] FP32: avg fps: 43.5602, avg latency: 22.9567 ms
I0801 06:41:59.026811 249389 tester_helper.h:466] INT8: avg fps: 93.9150, avg latency: 10.6479 ms
I0801 06:41:59.038902 249389 tester_helper.h:447] --- Accuracy summary ---
I0801 06:41:59.038915 249389 tester_helper.h:448] Accepted top1 accuracy drop threshold: 0.01. (condition: (FP32_top1_acc - INT8_top1_acc) <= threshold)
I0801 06:41:59.038921 249389 tester_helper.h:451] FP32: avg top1 accuracy: 0.7050
I0801 06:41:59.038925 249389 tester_helper.h:453] INT8: avg top1 accuracy: 0.7008
  • After
I0801 08:46:15.091809 262332 tester_helper.h:462] --- Performance summary ---
I0801 08:46:15.091835 262332 tester_helper.h:463] FP32: avg fps: 44.4318, avg latency: 22.5064 ms
I0801 08:46:15.091846 262332 tester_helper.h:466] INT8: avg fps: 117.5707, avg latency: 8.5055 ms
I0801 08:46:15.101172 262332 tester_helper.h:447] --- Accuracy summary ---
I0801 08:46:15.101187 262332 tester_helper.h:448] Accepted top1 accuracy drop threshold: 0.01. (condition: (FP32_top1_acc - INT8_top1_acc) <= threshold)
I0801 08:46:15.101194 262332 tester_helper.h:451] FP32: avg top1 accuracy: 0.7050
I0801 08:46:15.101199 262332 tester_helper.h:453] INT8: avg top1 accuracy: 0.7003

Same conclusion as luotao. @wozna, wojtek is planning to investigate the server configuration issue and figure out why we got different results.

Contributor

@Sand3r- Sand3r- left a comment


The code looks good to me. 👍 Good job @wozna. That's nearly a 25% speedup for this topology.

@bingyanghuang
Contributor

@luotao1 Please start a review.

@bingyanghuang
Contributor

Based on luotao's benchmark on 6271:

INT8 performance gets about a 21% speedup, while INT8 accuracy drops by 0.0005. Since 0.0005 is minor compared with the large performance gain, we decided to merge this PR. @Sand3r- @wojtuss @luotao1

@luotao1
Contributor

luotao1 commented Aug 13, 2019

Got it.

@luotao1 luotao1 merged commit 492a00f into PaddlePaddle:develop Aug 13, 2019
@wozna wozna deleted the squash_wozna branch February 24, 2023 15:42