create test for quantized resnet50 #16399
luotao1 merged 2 commits into PaddlePaddle:develop from sfraczek:sfraczek/analyzer_int8_resnet50_test
Conversation
TIMEOUT 1200 — does it really need 20 minutes?
I once ran into a random timeout problem, and I found somewhere that the default is 600, so I doubled it. I didn't realize it was in seconds 😆
why not use
`inference_analysis_api_test(test_analyzer_int8_resnet50 ${INT8_RESNET50_INSTALL_DIR} analyzer_int8_resnet50_tester.cc SERIAL)`
How do I pass it `--infer_data=${INT8_RESNET50_INSTALL_DIR}/data.txt`?
It's a legacy thing. I had problems passing some parameters. I can do as you point out now :)
why call SetPasses manually here?
The default order of passes wasn't working
I tried without it and it works now. I can remove this.
> The default order of passes wasn't working

If the default order of passes wasn't working, should you fix the default order? Otherwise, it is confusing for users, since they should not care about the order.
I will see what can be done about this.
Hi @luotao1, this build failed because it requires the quantization core PR to work. How should we proceed?
Does resnet50_int8v2_data.txt.tar.gz contain 50000 images?
No, it has only 100. I set the same data for calibration as for testing.
So, how do we test for 50000 images?
We can create a file like this one but with 50000 images and change the code a little to do that.
you can create resnet50_int8v2_full_data.txt.tar.gz with 50000 images, and change the code a little.
lines 23-34: GetPaddleDType could be put in tester_helper.h
OK, good idea. Then I can also remove it from analyzer_bert_tester.cc.
why is batch_size 100 here? The default batch_size is 1.
I used 100 images as the calibration dataset and the same 100 for the test set, both with batch size 100. I didn't consider what batch size to use for prediction because the result should be equivalent. For quantization we only support one iteration, so I had to use a batch size bigger than 1. Should I change something here? Is it important for a simple test?
it is not important for this simple test. But we need to test 50000 images at batch_size=1 on a local machine. Thus, please enhance the test (maybe in a new PR).
OK, should I close this one then and open a new one with 50000 images?
Should I use the first 100 of them for calibration and reuse them during testing of all 50000 images?
No, you could go ahead with this PR, since 50000 images are too large for our CI to download. Thus, we will only test 50000 images on the local machine.
top1 acc quantized → top1 INT8 accuracy
top1 acc reference → top1 FP32 accuracy
This TEST(Analyzer_int8_resnet50, quantization) is only for CompareQuantizedNativeAndAnalysis; how can we run the profiler?
@sfraczek You should proceed to merge the core PR first.
@luotao1 I have a problem with both my CMake instructions: after I run cmake and make, the files are nowhere to be found. Do you know why that may be happening?
The CDN has some problem today. The file http://paddle-inference-dist.cdn.bcebos.com/int8/resnet50_int8v2_data.txt.tar.gz exists.
But I can download it with wget. I had the same problem last week.
For now I have merged the quantization code because I want to see if CI would pass.
I don't want to make a hasty decision on how to resolve this.
This is confusing for users and hard to use. Could you fix the default order?
Hi @luotao1, I have two questions:
Yes, it should be merged before the 29th. You could use the current
@luotao1 that archive contains image files and val_list.txt, and they read and process the dataset here: https://github.intel.com/AIPG/paddle/blob/7c5319ba121a6d73aeba0f06ce158680b160dcc2/python/paddle/fluid/contrib/tests/test_calibration.py#L67. So I can just copy a reader (and transformer with opencv) from our C-API app, or we can save a file that is roughly 4 times smaller than previously (because we save the images as floats now, but we could save them as uchar, convert them to float later, and finish the remaining preprocessing, which is mean subtraction and division by stddev).
I have refined the default pass order, by adding conv batch norm passes to the list of mkl-dnn passes. That way:
test=develop
It is good for CI now. It works with the small dataset, and the profiler should work with a custom batch_size on 50000 images.
test=develop
@luotao1, what is the current address of the file? I changed it to http://paddle-inference-dist.bj.bcebos.com/int8/imagenet_val_100.bin.tar.gz. I thought it was shared with you.
Hi @luotao1, where should I put the Python script for preprocessing and generating the bin file?
please put it in the same location as #16515
There are duplicated passes, a similar problem to #16174. We can fix it after May 29th.
please do not push the script to this PR. @lidanqing-intel |
OK, I will not; it would cause a CI restart.
const PaddlePredictor::Config *qconfig,
    const std::vector<std::vector<PaddleTensor>> &inputs) {
  PrintConfig(config, true);
  std::vector<PaddleTensor> analysis_outputs, quantized_outputs;
Could you add some logs, like LOG(INFO) << "FP32 start..."; before line 470 and LOG(INFO) << "INT8 start..."; before line 471?
