@@ -781,7 +777,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](tut
| 93.84 |
- | ShuffleNetV1 1.0x |
+ ShuffleNetV1 |
Classification |
top-1 |
68.13 |
@@ -791,7 +787,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](tut
68.13 |
67.71 |
68.11 |
- $MMCLS_DIR/configs/shufflenet_v1/shufflenet_v1_1x_b64x16_linearlr_bn_nowd_imagenet.py |
+ $MMCLS_DIR/configs/shufflenet_v1/shufflenet-v1-1x_16xb64_in1k.py |
| top-5 |
@@ -804,7 +800,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](tut
87.80 |
- | ShuffleNetV2 1.0x |
+ ShuffleNetV2 |
Classification |
top-1 |
69.55 |
@@ -814,7 +810,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](tut
69.54 |
69.10 |
69.54 |
- $MMCLS_DIR/configs/shufflenet_v2/shufflenet_v2_1x_b64x16_linearlr_bn_nowd_imagenet.py |
+ $MMCLS_DIR/configs/shufflenet_v2/shufflenet-v2-1x_16xb64_in1k.py |
| top-5 |
@@ -837,7 +833,7 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](tut
71.87 |
70.91 |
71.84 |
- $MMEDIT_DIR/configs/restorers/real_esrgan/realesrnet_c64b23g32_12x4_lr2e-4_1000k_df2k_ost.py |
+ $MMCLS_DIR/configs/mobilenet_v2/mobilenet-v2_8xb32_in1k.py |
| top-5 |
@@ -1819,8 +1815,8 @@ Users can directly test the performance through [how_to_evaluate_a_model.md](tut
-
### Notes
+
- Because some datasets in codebases such as MMDet contain images of various resolutions, the speed benchmark is obtained with static configs in MMDeploy, while the performance benchmark is obtained with dynamic ones.
- Some int8 performance benchmarks of TensorRT require NVIDIA cards with Tensor Cores; without them, performance drops heavily.
diff --git a/docs/zh_cn/benchmark.md b/docs/zh_cn/benchmark.md
index 3225c44fd8..2a96884ac7 100644
--- a/docs/zh_cn/benchmark.md
+++ b/docs/zh_cn/benchmark.md
@@ -1,6 +1,7 @@
## Benchmark
### Backends
+
CPU: ncnn, ONNXRuntime, OpenVINO
GPU: ncnn, TensorRT, PPLNN
@@ -8,6 +9,7 @@ GPU: ncnn, TensorRT, PPLNN
### Latency benchmark
#### Platform
+
- Ubuntu 18.04
- ncnn 20211208
- Cuda 11.3
@@ -16,6 +18,7 @@ GPU: ncnn, TensorRT, PPLNN
- NVIDIA Tesla T4 GPU.
#### Other settings
+
- Static graph export
- Batch size 1
- Synchronize after each inference
@@ -23,12 +26,11 @@ GPU: ncnn, TensorRT, PPLNN
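- Warm-up. For the ncnn backend, we warm up for 30 iterations. For other backends, we warm up for 1010 iterations on classification tasks and 10 iterations on other tasks.
- Input resolution varies with the dataset of each codebase. All codebases except `mmediting` use real images as input.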
-
Users can obtain the speed test results they need directly through [how to measure latency](tutorials/how_to_measure_performance_of_models.md). Below are the test results from our environment:
+
MMCls
-
@@ -181,14 +183,12 @@ GPU: ncnn, TensorRT, PPLNN
-
MMDet
-
@@ -406,7 +406,6 @@ GPU: ncnn, TensorRT, PPLNN
MMEdit
-
@@ -476,7 +475,6 @@ GPU: ncnn, TensorRT, PPLNN
-
@@ -569,7 +567,6 @@ GPU: ncnn, TensorRT, PPLNN
MMSeg
-
@@ -674,7 +671,6 @@ GPU: ncnn, TensorRT, PPLNN
-
@@ -686,7 +682,6 @@ GPU: ncnn, TensorRT, PPLNN
MMCls
-
@@ -783,7 +778,7 @@ GPU: ncnn, TensorRT, PPLNN
| 93.84 |
- | ShuffleNetV1 1.0x |
+ ShuffleNetV1 |
Classification |
top-1 |
68.13 |
@@ -793,7 +788,7 @@ GPU: ncnn, TensorRT, PPLNN
68.13 |
67.71 |
68.11 |
- $MMCLS_DIR/configs/shufflenet_v1/shufflenet_v1_1x_b64x16_linearlr_bn_nowd_imagenet.py |
+ $MMCLS_DIR/configs/shufflenet_v1/shufflenet-v1-1x_16xb64_in1k.py |
| top-5 |
@@ -806,7 +801,7 @@ GPU: ncnn, TensorRT, PPLNN
87.80 |
- | ShuffleNetV2 1.0x |
+ ShuffleNetV2 |
Classification |
top-1 |
69.55 |
@@ -816,7 +811,7 @@ GPU: ncnn, TensorRT, PPLNN
69.54 |
69.10 |
69.54 |
- $MMCLS_DIR/configs/shufflenet_v2/shufflenet_v2_1x_b64x16_linearlr_bn_nowd_imagenet.py |
+ $MMCLS_DIR/configs/shufflenet_v2/shufflenet-v2-1x_16xb64_in1k.py |
| top-5 |
@@ -839,7 +834,7 @@ GPU: ncnn, TensorRT, PPLNN
71.87 |
70.91 |
71.84 |
- $MMEDIT_DIR/configs/restorers/real_esrgan/realesrnet_c64b23g32_12x4_lr2e-4_1000k_df2k_ost.py |
+ $MMCLS_DIR/configs/mobilenet_v2/mobilenet-v2_8xb32_in1k.py |
| top-5 |
@@ -1807,8 +1802,8 @@ GPU: ncnn, TensorRT, PPLNN
-
### Notes
+
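- Because some datasets in codebases such as MMDet contain images of various resolutions, the speed benchmark is obtained with static configs in MMDeploy, while the performance benchmark is obtained with dynamic ones.
- Some int8 performance benchmarks of TensorRT require NVIDIA cards with Tensor Cores; without them, performance drops heavily.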