fc_op slows on multi-instance inference #17153

Description

@luotao1

When profiling multi-instance inference on pyramid_dnn, we found that fc_op slows down significantly when multiple instances run concurrently.

Commands:

  • one instance:
./paddle/fluid/inference/tests/api/test_analyzer_pyramid_dnn --infer_model=third_party/inference_demo/pyramid_dnn/model/ --infer_data=third_party/inference_demo/pyramid_dnn/data.txt --gtest_filter=Analyzer_Pyramid_DNN.profile --paddle_num_threads=1 --repeat=10000 --zero_copy --warmup --num_threads=1
  • two instances:
./paddle/fluid/inference/tests/api/test_analyzer_pyramid_dnn --infer_model=third_party/inference_demo/pyramid_dnn/model/ --infer_data=third_party/inference_demo/pyramid_dnn/data.txt --gtest_filter=Analyzer_Pyramid_DNN.profile --paddle_num_threads=1 --repeat=10000 --zero_copy --warmup --num_threads=2

Result (latency, ms):

            fc_op      mul+add (fc_fuse_pass removed)
1 thread    0.13118    0.141566
2 threads   0.291523   0.152074

With one thread, the fused fc_op is slightly faster, but with two threads it is nearly twice as slow as the unfused mul + elementwise_add version.

The change that removes fc_op (by deleting the fc_fuse_pass, so the graph keeps separate mul and elementwise_add ops) is:

--- a/paddle/fluid/inference/tests/api/analyzer_pyramid_dnn_tester.cc
+++ b/paddle/fluid/inference/tests/api/analyzer_pyramid_dnn_tester.cc
@@ -110,6 +110,7 @@ void SetConfig(AnalysisConfig *cfg) {
   if (FLAGS_zero_copy) {
     cfg->SwitchUseFeedFetchOps(false);
   }
+  cfg->pass_builder()->DeletePass("fc_fuse_pass");
 }
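For a standalone application (rather than the test binary above), the same workaround can be applied through `AnalysisConfig` before the predictor is created. A minimal sketch, assuming the Paddle inference headers and library are linked; the model path is a placeholder:

```cpp
#include "paddle/fluid/inference/api/paddle_inference_api.h"

int main() {
  paddle::AnalysisConfig config;
  // Placeholder model directory; substitute your own exported model.
  config.SetModel("path/to/pyramid_dnn/model");
  // Match the --zero_copy setting used in the benchmark commands.
  config.SwitchUseFeedFetchOps(false);
  // Workaround from the diff: drop the fc fuse pass so mul + add
  // are kept as separate ops instead of being fused into fc_op.
  config.pass_builder()->DeletePass("fc_fuse_pass");

  auto predictor = paddle::CreatePaddlePredictor(config);
  // ... set up zero-copy input tensors and run predictor->ZeroCopyRun() ...
  return 0;
}
```

This is a configuration sketch, not a full inference program; it only shows where the pass deletion fits relative to the other config calls seen in the tester.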
