create test for quantized resnet50#16399

Merged
luotao1 merged 2 commits into PaddlePaddle:develop from sfraczek:sfraczek/analyzer_int8_resnet50_test
Mar 29, 2019
Conversation

@sfraczek

No description provided.

Contributor

TIMEOUT 1200 needs 20 minutes?

Author

Once I randomly ran into a timeout problem, and I found somewhere that the default is 600, so I doubled it. I didn't realize it was in seconds 😆

Author

I will leave it at the default.

Contributor

Why not use:

inference_analysis_api_test(test_analyzer_int8_resnet50 ${INT8_RESNET50_INSTALL_DIR} analyzer_int8_resnet50_tester.cc SERIAL)

Author
@sfraczek Mar 22, 2019

How would I pass --infer_data=${INT8_RESNET50_INSTALL_DIR}/data.txt with it?
It's a legacy thing; I had problems passing some parameters. I can do as you point out now :)

Contributor

Why call SetPasses manually here?

Author

The default order of passes wasn't working

Author

I tried without it and it works now. I can remove this.

Author
@sfraczek Mar 22, 2019

No, actually the result goes wrong without this: [screenshot]

Contributor

The default order of passes wasn't working

If the default order of passes isn't working, shouldn't you fix the default order? Otherwise it is confusing for users, since they should not have to care about the order.

Author

I will see what can be done about this.


@luotao1, @sfraczek, I have fixed the order of the mkldnn passes in #16448 and removed the use of SetPasses() here.


@luotao1, @sfraczek, sorry, I have abandoned #16448 and reverted the changes here, as it does not work well in all cases.

@sfraczek
Author

Hi @luotao1, this build failed because it requires the quantization core PR to work. How should we proceed?

Contributor

Does resnet50_int8v2_data.txt.tar.gz contain 50000 images?

Author
@sfraczek Mar 25, 2019

No, it has only 100. I set the same data for calibration as for testing.

Contributor

So, how do we test for 50000 images?

Author
@sfraczek Mar 25, 2019

We can create a file like this one but with 5000 images and change the code a little to do that.

Contributor

You can create resnet50_int8v2_full_data.txt.tar.gz with 50000 images, and change the code a little.

Contributor

Lines 23-34: GetPaddleDType could be put in tester_helper.h.

Author

OK, good idea. Then I can also remove it from analyzer_bert_tester.cc.

Contributor

Why is batch_size 100 here? The default batch_size is 1.

Author
@sfraczek Mar 25, 2019

I used 100 images as the calibration dataset and the same 100 as the test set, both with batch size 100. I didn't consider what batch size I should use for prediction, because the result should be equivalent. For quantization we only support one iteration, so I had to use a batch bigger than 1. Should I change something here? Is it important for a simple test?

Contributor

It is not important for this simple test. But we need to test 50000 images with batch_size=1 on a local machine. Thus, please enhance the test (maybe in a new PR).

Author

OK, should I close this one then and open a new one with 50000 images?
Should I use the first 100 of them for calibration and reuse them during testing of all 50000 images?

Contributor

No, you could go ahead with this PR, since 50000 images are too large for our CI to download. Thus, we will only test 50000 images on the local machine.


Contributor

top1 acc quantized → top1 INT8 accuracy
top1 acc reference → top1 FP32 accuracy

Author

Ok.

Contributor

This TEST(Analyzer_int8_resnet50, quantization) only runs CompareQuantizedNativeAndAnalysis; how could we do profiling?

Author

I will add the profiler.

@luotao1
Contributor

luotao1 commented Mar 25, 2019

How should we proceed

@sfraczek You should proceed to merge core PR at first.

@sfraczek
Author

sfraczek commented Mar 25, 2019

@luotao1 I have a problem with both of my CMake commands:

inference_download_and_uncompress(${INT8_RESNET50_INSTALL_DIR} "http://paddle-inference-dist.cdn.bcebos.com/int8" "resnet50_int8v2_data.txt.tar.gz")
inference_download_and_uncompress(${INT8_RESNET50_INSTALL_DIR} "http://paddle-inference-dist.cdn.bcebos.com/int8" "resnet50_int8_model.tar.gz" )

After I run cmake and make, the files are nowhere to be found. Do you know why that may be happening?

@luotao1
Contributor

luotao1 commented Mar 25, 2019

The CDN has some problems today. The file http://paddle-inference-dist.cdn.bcebos.com/int8/resnet50_int8v2_data.txt.tar.gz exists.

@sfraczek
Author

But I can download it with wget. I had the same problem last week.

@luotao1
Contributor

luotao1 commented Mar 25, 2019

We fixed it temporarily in #16423; you can update like #16423 does (change cdn to bj).

@sfraczek sfraczek changed the title create test for quantized resnet50 [wip] create test for quantized resnet50 Mar 25, 2019
@sfraczek
Author

sfraczek commented Mar 25, 2019

For now I have merged the quantization code because I want to see if CI will pass.
@luotao1 I think there is currently a problem with setting any default order of passes. This is because of the separation of mkldnn passes and cpu passes. For example, in mobilenet, among other passes, we need the three passes below in mixed order:

  1. depthwise_conv_mkldnn_pass
  2. conv_bn_fuse_pass
  3. conv_relu_mkldnn_fuse_pass

I don't want to make a hasty decision on how to resolve this.

@luotao1
Contributor

luotao1 commented Mar 26, 2019

For example in mobilenet, among other passes, we need the below three passes in mixed order

This is confusing for users and hard to use; could you fix the default order?

@lidanqing-vv
Contributor

lidanqing-vv commented Mar 26, 2019

Hi @luotao1, I have two questions:

  1. Does this PR need to be merged before the 29th?
  2. We decided to store the images and labels in a binary file, which is smaller. For 50000 images and labels, the binary file size is 29G. We plan to implement the test like the other analyzer tests, in input_slots_all, as in analyzer_transformer_tester.cc line 195. Loading 29G into RAM, do you think it will be OK for the Baidu validation team? Thanks!

@luotao1
Contributor

luotao1 commented Mar 26, 2019

Does this PR need to be merged before 29th?

Yes, it should be merged before the 29th. You could use the current resnet50_int8v2_data.txt.tar.gz, which is OK for the CI test.

We decided to store the image and labels in a binary file which is smaller. For 50 000 images and label, the binary file size is 29G. We plan to implement the test as other analyzer_test in input_slots_all like in analyzer_transformer_tester.cc line 195. Load 29G into ram, do you think it will be ok for Baidu validation team?

  1. The ILSVRC2012_img_val.tar.gz used in int8 v1 is 6.3G; the binary file (with tar.gz) is 29G for int8 v2?
  2. 29G is too big to upload to and download from the CDN. Could you provide a Python script for converting it from ILSVRC2012_img_val.tar.gz?
  3. Loading 29G into RAM: do you mean loading the whole dataset at first? That needs a lot of memory. Our test machine is:
free -g
             total       used       free     shared    buffers     cached
Mem:           376         38        337          0          0         33
-/+ buffers/cache:          4        371
Swap:            0          0          0

@sfraczek
Author

sfraczek commented Mar 26, 2019

@luotao1 that archive contains image files and val_list.txt, and they read and process the dataset here: https://github.intel.com/AIPG/paddle/blob/7c5319ba121a6d73aeba0f06ce158680b160dcc2/python/paddle/fluid/contrib/tests/test_calibration.py#L67

So I can just copy a reader (and a transformer using OpenCV) from our C-API app, or we can save a file that is roughly 4 times smaller than before: we currently save the values as floats, but we could save them as uchar, convert them to float later, and finish the remaining preprocessing, which is mean subtraction and division by stddev.

@Sand3r-
Contributor

Sand3r- commented Mar 26, 2019

@luotao1

For example in mobilenet, among other passes, we need the below three passes in mixed order

This is confused for users, and hard to use, could you fix them in default order?

I have refined the default pass order by adding the conv batch norm passes to the list of mkl-dnn passes. That way:

  • The correct order of pass execution is preserved and the fusions behave as intended
  • There is no need for the user to manually set passes for their application

@sfraczek
Author

  1. @luotao1 I've edited the code to read the data.bin file, and meanwhile @lidanqing-intel has created the data.bin file with the 100 test images. It fixes the 2% accuracy diff on mobilenet because the dataset is now the same as @hshen14's. I will soon push the fixes; please accept the data.bin from @lidanqing-intel and put it on the server so I can modify cmake with the correct path.

  2. I've replaced the fake dataset with the small data.bin too.

@luotao1
Contributor

luotao1 commented Mar 27, 2019

Please update this PR after #16396 and #16490 are merged.

@sfraczek sfraczek changed the title [wip] create test for quantized resnet50 create test for quantized resnet50 Mar 27, 2019
@sfraczek
Author

sfraczek commented Mar 27, 2019

It is good for CI now. It works with the small dataset, and the profiler should work with a custom batch_size on 50000 images.
However, reporting the accuracy of the full 50000-image dataset will require some further work, because the quantizer test currently checks it only for one batch (100).

@luotao1
Contributor

luotao1 commented Mar 28, 2019

[22:07:51]	149/609 Test #165: test_analyzer_int8_resnet50 .....................***Exception: SegFault  2.71 sec
[22:07:51]	[==========] Running 2 tests from 1 test case.
[22:07:51]	[----------] Global test environment set-up.
[22:07:51]	[----------] 2 tests from Analyzer_int8_resnet50
[22:07:51]	[ RUN      ] Analyzer_int8_resnet50.quantization
[22:07:51]	/paddle/paddle/fluid/inference/tests/api/analyzer_int8_image_classification_tester.cc:114: Failure
[22:07:51]	Failed
[22:07:51]	Couldn't open file: /root/.cache/inference_demo/int8/data.bin
[22:07:51]	
[22:07:51]	        Start 167: test_analyzer_bert
[22:07:52]	150/609 Test #166: test_analyzer_int8_mobilenet ....................***Exception: SegFault  2.72 sec
[22:07:52]	[==========] Running 2 tests from 1 test case.
[22:07:52]	[----------] Global test environment set-up.
[22:07:52]	[----------] 2 tests from Analyzer_int8_resnet50
[22:07:52]	[ RUN      ] Analyzer_int8_resnet50.quantization
[22:07:52]	/paddle/paddle/fluid/inference/tests/api/analyzer_int8_image_classification_tester.cc:114: Failure
[22:07:52]	Failed
[22:07:52]	Couldn't open file: /root/.cache/inference_demo/int8/data.bin

test=develop
@sfraczek
Author

@luotao1 what is the current address of the file? I changed it to http://paddle-inference-dist.bj.bcebos.com/int8/imagenet_val_100.bin.tar.gz; I thought it was shared with you.

@luotao1
Contributor

luotao1 commented Mar 28, 2019

https://paddle-inference-dist.bj.bcebos.com/int8/imagenet_val_100.tar.gz @sfraczek

@lidanqing-vv
Contributor

Hi @luotao1, where should I put the Python script for preprocessing and generating the bin file?

@luotao1
Contributor

luotao1 commented Mar 28, 2019

Please put it in the same location as #16515.

@luotao1
Contributor

luotao1 commented Mar 28, 2019

126: ---  detected 12 subgraphs
126: Fused graph 16
126: --- Running IR pass [depthwise_conv_mkldnn_pass]
126: --- Running IR pass [conv_bn_fuse_pass]
126: --- Running IR pass [conv_eltwiseadd_bn_fuse_pass]
126: --- Running IR pass [conv_bias_mkldnn_fuse_pass]
126: --- Running IR pass [conv3d_bias_mkldnn_fuse_pass]
126: --- Running IR pass [conv_relu_mkldnn_fuse_pass]
126: ---  detected 16 subgraphs
126: --- Running IR pass [conv_elementwise_add_mkldnn_fuse_pass]
126: Fused graph 0
126: --- Running IR pass [depthwise_conv_mkldnn_pass]
126: --- Running IR pass [conv_bn_fuse_pass]
126: --- Running IR pass [conv_eltwiseadd_bn_fuse_pass]
126: --- Running IR pass [conv_bias_mkldnn_fuse_pass]
126: --- Running IR pass [conv3d_bias_mkldnn_fuse_pass]
126: --- Running IR pass [conv_relu_mkldnn_fuse_pass]
126: --- Running IR pass [conv_elementwise_add_mkldnn_fuse_pass]
126: Fused graph 0
126: --- Running IR pass [cpu_quantize_placement_pass]
126: --- Running IR pass [depthwise_conv_mkldnn_pass]
126: --- Running IR pass [conv_bn_fuse_pass]
126: --- Running IR pass [conv_eltwiseadd_bn_fuse_pass]
126: --- Running IR pass [conv_bias_mkldnn_fuse_pass]
126: --- Running IR pass [conv3d_bias_mkldnn_fuse_pass]
126: --- Running IR pass [conv_relu_mkldnn_fuse_pass]
126: --- Running IR pass [conv_elementwise_add_mkldnn_fuse_pass]
126: Fused graph 0
126: --- Running IR pass [cpu_quantize_placement_pass]

There are duplicated passes, a problem similar to #16174. We can fix it after May 29th.

@lidanqing-vv
Contributor

lidanqing-vv commented Mar 28, 2019

please put in the same location with #16515

Thanks, I will put it in the same location. But could I push this file to this PR, #16399?

@luotao1
Contributor

luotao1 commented Mar 28, 2019

please do not push the script to this PR. @lidanqing-intel

@lidanqing-vv
Contributor

please do not push the script to this PR. @lidanqing-intel

OK, I will not; it would cause a CI restart.

const PaddlePredictor::Config *qconfig,
const std::vector<std::vector<PaddleTensor>> &inputs) {
PrintConfig(config, true);
std::vector<PaddleTensor> analysis_outputs, quantized_outputs;
Contributor

Could you add some logs like LOG(INFO) << "FP32 start..." before line 470 and LOG(INFO) << "INT8 start..." before line 471?

Author

ok

Author

Can we do that in the next PR?

Contributor
@luotao1 left a comment

I will merge this PR first; please refine #16532 later.

@luotao1 luotao1 merged commit 5b24002 into PaddlePaddle:develop Mar 29, 2019