ytfeng committed
Commit d2b2b68 · Parent: fefbdcf

Refactor benchmark (#148)


* use mean as the default benchmark metric; change result representation; add --all for benchmarking all configs at once

* fix comments

* add --model_exclude

* pretty print

* improve benchmark result table header: from brand-xpu to xpu-brand

* suppress print message

* update benchmark results on CPU-RPI

* add the new benchmark results on the new intel cpu

* fix backend and target setting in benchmark; pre-modify the names of int8 quantized models

* add results on jetson cpu

* add cuda results

* print target and backend when using --all

* add results on Khadas VIM3

* pretty print results

* true pretty print results

* update results in new format

* fix broken backend and target vars

* fix broken backend and target vars

* fix broken backend and target var

* update benchmark results on many devices

* add db results on Ascend-310

* update info on CPU-INTEL

* update usage of the new benchmark script

README.md CHANGED
@@ -21,43 +21,43 @@ Guidelines:
21
 
22
  ## Models & Benchmark Results
23
 
24
- | Model | Task | Input Size | INTEL-CPU (ms) | RPI-CPU (ms) | JETSON-GPU (ms) | KV3-NPU (ms) | Ascend-310 (ms) | D1-CPU (ms) |
25
- | ------------------------------------------------------- | ----------------------------- | ---------- | -------------- | ------------ | --------------- | ------------ | --------------- | ----------- |
26
- | [YuNet](./models/face_detection_yunet) | Face Detection | 160x120 | 1.45 | 6.22 | 12.18 | 4.04 | 1.73 | 86.69 |
27
- | [SFace](./models/face_recognition_sface) | Face Recognition | 112x112 | 8.65 | 99.20 | 24.88 | 46.25 | 23.17 | --- |
28
- | [FER](./models/facial_expression_recognition/) | Facial Expression Recognition | 112x112 | 4.43 | 49.86 | 31.07 | 29.80 | 10.12 | --- |
29
- | [LPD-YuNet](./models/license_plate_detection_yunet/) | License Plate Detection | 320x240 | --- | 168.03 | 56.12 | 29.53 | 8.70 | --- |
30
- | [YOLOX](./models/object_detection_yolox/) | Object Detection | 640x640 | 176.68 | 1496.70 | 388.95 | 420.98 | 29.10 | --- |
31
- | [NanoDet](./models/object_detection_nanodet/) | Object Detection | 416x416 | 157.91 | 220.36 | 64.94 | 116.64 | 35.97 | --- |
32
- | [DB-IC15](./models/text_detection_db) | Text Detection | 640x480 | 142.91 | 2835.91 | 208.41 | --- | 229.74 | --- |
33
- | [DB-TD500](./models/text_detection_db) | Text Detection | 640x480 | 142.91 | 2841.71 | 210.51 | --- | 247.29 | --- |
34
- | [CRNN-EN](./models/text_recognition_crnn) | Text Recognition | 100x32 | 50.21 | 234.32 | 196.15 | 125.30 | 101.03 | --- |
35
- | [CRNN-CN](./models/text_recognition_crnn) | Text Recognition | 100x32 | 73.52 | 322.16 | 239.76 | 166.79 | 136.41 | --- |
36
- | [PP-ResNet](./models/image_classification_ppresnet) | Image Classification | 224x224 | 56.05 | 602.58 | 98.64 | 75.45 | 6.99 | --- |
37
- | [MobileNet-V1](./models/image_classification_mobilenet) | Image Classification | 224x224 | 9.04 | 92.25 | 33.18 | 145.66\* | 5.25 | --- |
38
- | [MobileNet-V2](./models/image_classification_mobilenet) | Image Classification | 224x224 | 8.86 | 74.03 | 31.92 | 146.31\* | 5.82 | --- |
39
- | [PP-HumanSeg](./models/human_segmentation_pphumanseg) | Human Segmentation | 192x192 | 19.92 | 105.32 | 67.97 | 74.77 | 7.07 | --- |
40
- | [WeChatQRCode](./models/qrcode_wechatqrcode) | QR Code Detection and Parsing | 100x100 | 7.04 | 37.68 | --- | --- | --- | --- |
41
- | [DaSiamRPN](./models/object_tracking_dasiamrpn) | Object Tracking | 1280x720 | 36.15 | 705.48 | 76.82 | --- | --- | --- |
42
- | [YoutuReID](./models/person_reid_youtureid) | Person Re-Identification | 128x256 | 35.81 | 521.98 | 90.07 | 44.61 | 5.69 | --- |
43
- | [MP-PalmDet](./models/palm_detection_mediapipe) | Palm Detection | 192x192 | 11.09 | 63.79 | 83.20 | 33.81 | 21.59 | --- |
44
- | [MP-HandPose](./models/handpose_estimation_mediapipe) | Hand Pose Estimation | 224x224 | 4.28 | 36.19 | 40.10 | 19.47 | 6.02 | --- |
45
 
46
  \*: These models are quantized in per-channel mode, which runs slower than per-tensor quantization on NPU.
47
 
48
  Hardware Setup:
49
 
50
- - `INTEL-CPU`: [Intel Core i7-5930K](https://www.intel.com/content/www/us/en/products/sku/82931/intel-core-i75930k-processor-15m-cache-up-to-3-70-ghz/specifications.html) @ 3.50GHz, 6 cores, 12 threads.
51
- - `RPI-CPU`: [Raspberry Pi 4B](https://www.raspberrypi.com/products/raspberry-pi-4-model-b/specifications/), Broadcom BCM2711, Quad core Cortex-A72 (ARM v8) 64-bit SoC @ 1.5GHz.
52
- - `JETSON-GPU`: [NVIDIA Jetson Nano B01](https://developer.nvidia.com/embedded/jetson-nano-developer-kit), 128-core NVIDIA Maxwell GPU.
53
- - `KV3-NPU`: [Khadas VIM3](https://www.khadas.com/vim3), 5TOPS Performance. Benchmarks are done using **quantized** models. You will need to compile OpenCV with TIM-VX following [this guide](https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU) to run benchmarks. The test results use the `per-tensor` quantization model by default.
54
- - `Ascend-310`: [Ascend 310](https://e.huawei.com/uk/products/cloud-computing-dc/atlas/ascend-310), 22 TOPS@INT8. Benchmarks are done on [Atlas 200 DK AI Developer Kit](https://e.huawei.com/in/products/cloud-computing-dc/atlas/atlas-200). Get the latest OpenCV source code and build following [this guide](https://github.com/opencv/opencv/wiki/Huawei-CANN-Backend) to enable CANN backend.
55
- - `D1-CPU`: [Allwinner D1](https://d1.docs.aw-ol.com/en), [Xuantie C906 CPU](https://www.t-head.cn/product/C906?spm=a2ouz.12986968.0.0.7bfc1384auGNPZ) (RISC-V, RVV 0.7.1) @ 1.0GHz, 1 core. YuNet is supported for now. Visit [here](https://github.com/fengyuentau/opencv_zoo_cpp) for more details.
56
 
57
  ***Important Notes***:
58
 
59
  - Each hardware column in the table above reports the elapsed time of a single inference (preprocess, forward and postprocess).
60
- - The time data is the median of 10 runs after some warmup runs. Different metrics may be applied to some specific models.
61
  - Batch size is 1 for all benchmark results.
62
  - `---` means the model is not available to run on the device.
63
  - View [benchmark/config](./benchmark/config) for more details on benchmarking different models.
 
21
 
22
  ## Models & Benchmark Results
23
 
24
+ | Model | Task | Input Size | CPU-INTEL (ms) | CPU-RPI (ms) | GPU-JETSON (ms) | NPU-KV3 (ms) | NPU-Ascend310 (ms) | CPU-D1 (ms) |
25
+ | ------------------------------------------------------- | ----------------------------- | ---------- | -------------- | ------------ | --------------- | ------------ | ------------------ | ----------- |
26
+ | [YuNet](./models/face_detection_yunet) | Face Detection | 160x120 | 0.72 | 5.43 | 12.18 | 4.04 | 2.24 | 86.69 |
27
+ | [SFace](./models/face_recognition_sface) | Face Recognition | 112x112 | 6.04 | 78.83 | 24.88 | 46.25 | 2.66 | --- |
28
+ | [FER](./models/facial_expression_recognition/) | Facial Expression Recognition | 112x112 | 3.16 | 32.53 | 31.07 | 29.80 | 2.19 | --- |
29
+ | [LPD-YuNet](./models/license_plate_detection_yunet/) | License Plate Detection | 320x240 | 8.63 | 167.70 | 56.12 | 29.53 | 7.63 | --- |
30
+ | [YOLOX](./models/object_detection_yolox/) | Object Detection | 640x640 | 141.20 | 1805.87 | 388.95 | 420.98 | 28.59 | --- |
31
+ | [NanoDet](./models/object_detection_nanodet/) | Object Detection | 416x416 | 66.03 | 225.10 | 64.94 | 116.64 | 20.62 | --- |
32
+ | [DB-IC15](./models/text_detection_db) (EN) | Text Detection | 640x480 | 71.03 | 1862.75 | 208.41 | --- | 17.15 | --- |
33
+ | [DB-TD500](./models/text_detection_db) (EN&CN) | Text Detection | 640x480 | 72.31 | 1878.45 | 210.51 | --- | 17.95 | --- |
34
+ | [CRNN-EN](./models/text_recognition_crnn) | Text Recognition | 100x32 | 20.16 | 278.11 | 196.15 | 125.30 | --- | --- |
35
+ | [CRNN-CN](./models/text_recognition_crnn) | Text Recognition | 100x32 | 23.07 | 297.48 | 239.76 | 166.79 | --- | --- |
36
+ | [PP-ResNet](./models/image_classification_ppresnet) | Image Classification | 224x224 | 34.71 | 463.93 | 98.64 | 75.45 | 6.99 | --- |
37
+ | [MobileNet-V1](./models/image_classification_mobilenet) | Image Classification | 224x224 | 5.90 | 72.33 | 33.18 | 145.66\* | 5.15 | --- |
38
+ | [MobileNet-V2](./models/image_classification_mobilenet) | Image Classification | 224x224 | 5.97 | 66.56 | 31.92 | 146.31\* | 5.41 | --- |
39
+ | [PP-HumanSeg](./models/human_segmentation_pphumanseg) | Human Segmentation | 192x192 | 8.81 | 73.13 | 67.97 | 74.77 | 6.94 | --- |
40
+ | [WeChatQRCode](./models/qrcode_wechatqrcode) | QR Code Detection and Parsing | 100x100 | 1.29 | 5.71 | --- | --- | --- | --- |
41
+ | [DaSiamRPN](./models/object_tracking_dasiamrpn) | Object Tracking | 1280x720 | 29.05 | 712.94 | 76.82 | --- | --- | --- |
42
+ | [YoutuReID](./models/person_reid_youtureid) | Person Re-Identification | 128x256 | 30.39 | 625.56 | 90.07 | 44.61 | 5.58 | --- |
43
+ | [MP-PalmDet](./models/palm_detection_mediapipe) | Palm Detection | 192x192 | 6.29 | 86.83 | 83.20 | 33.81 | 5.17 | --- |
44
+ | [MP-HandPose](./models/handpose_estimation_mediapipe) | Hand Pose Estimation | 224x224 | 4.68 | 43.57 | 40.10 | 19.47 | 6.27 | --- |
45
 
46
  \*: These models are quantized in per-channel mode, which runs slower than per-tensor quantization on NPU.
47
 
48
  Hardware Setup:
49
 
50
+ - `CPU-INTEL`: [Intel Core i7-12700K](https://www.intel.com/content/www/us/en/products/sku/134594/intel-core-i712700k-processor-25m-cache-up-to-5-00-ghz/specifications.html), 8 Performance-cores (3.60 GHz, turbo up to 4.90 GHz), 4 Efficient-cores (2.70 GHz, turbo up to 3.80 GHz), 20 threads.
51
+ - `CPU-RPI`: [Raspberry Pi 4B](https://www.raspberrypi.com/products/raspberry-pi-4-model-b/specifications/), Broadcom BCM2711, Quad core Cortex-A72 (ARM v8) 64-bit SoC @ 1.5 GHz.
52
+ - `GPU-JETSON`: [NVIDIA Jetson Nano B01](https://developer.nvidia.com/embedded/jetson-nano-developer-kit), 128-core NVIDIA Maxwell GPU.
53
+ - `NPU-KV3`: [Khadas VIM3](https://www.khadas.com/vim3), 5 TOPS performance. Benchmarks are done using **quantized** models. You will need to compile OpenCV with TIM-VX following [this guide](https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU) to run benchmarks. The test results use the `per-tensor` quantized models by default.
54
+ - `NPU-Ascend310`: [Ascend 310](https://e.huawei.com/uk/products/cloud-computing-dc/atlas/atlas-200), 22 TOPS @ INT8. Benchmarks are done on [Atlas 200 DK AI Developer Kit](https://e.huawei.com/in/products/cloud-computing-dc/atlas/atlas-200). Get the latest OpenCV source code and build following [this guide](https://github.com/opencv/opencv/wiki/Huawei-CANN-Backend) to enable CANN backend.
55
+ - `CPU-D1`: [Allwinner D1](https://d1.docs.aw-ol.com/en), [Xuantie C906 CPU](https://www.t-head.cn/product/C906?spm=a2ouz.12986968.0.0.7bfc1384auGNPZ) (RISC-V, RVV 0.7.1) @ 1.0 GHz, 1 core. YuNet is supported for now. Visit [here](https://github.com/fengyuentau/opencv_zoo_cpp) for more details.
56
 
57
  ***Important Notes***:
58
 
59
  - Each hardware column in the table above reports the elapsed time of a single inference (preprocess, forward and postprocess).
60
+ - The time data is the mean of 10 runs after several warmup runs (see the timing sketch after this list). Different metrics may be applied to some specific models.
61
  - Batch size is 1 for all benchmark results.
62
  - `---` means the model is not available to run on the device.
63
  - View [benchmark/config](./benchmark/config) for more details on benchmarking different models.
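
For illustration, the timing loop implied by the notes above can be sketched as follows. This is a simplified, hypothetical snippet based on the `warmup`/`repeat` settings in [benchmark/config](./benchmark/config) (30 warmup runs, 10 timed runs by default), not the exact zoo implementation; `model.infer` stands in for the full preprocess-forward-postprocess pipeline that the table measures.

```python
import time

def measure_latency_ms(model, img, warmup=30, repeat=10):
    # Warmup runs are executed but not timed.
    for _ in range(warmup):
        model.infer(img)
    # Timed runs: record per-run latency in milliseconds.
    records = []
    for _ in range(repeat):
        start = time.perf_counter()
        model.infer(img)
        records.append((time.perf_counter() - start) * 1000.0)
    # The table reports the mean of these runs.
    return sum(records) / len(records)
```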
benchmark/README.md CHANGED
@@ -19,7 +19,25 @@ Data for benchmarking will be downloaded and loaded in [data](./data) based on g
19
 
20
  ```shell
21
  export PYTHONPATH=$PYTHONPATH:..
 
 
22
  python benchmark.py --cfg ./config/face_detection_yunet.yaml
23
  ```
24
 
25
  **Windows**:
@@ -34,9 +52,377 @@ python benchmark.py --cfg ./config/face_detection_yunet.yaml
34
  $env:PYTHONPATH=$env:PYTHONPATH+";.."
35
  python benchmark.py --cfg ./config/face_detection_yunet.yaml
36
  ```
37
- <!--
38
- Omit `--cfg` if you want to benchmark all included models:
39
- ```shell
40
- PYTHONPATH=.. python benchmark.py
41
  ```
42
- -->
 
19
 
20
  ```shell
21
  export PYTHONPATH=$PYTHONPATH:..
22
+
23
+ # Single config
24
  python benchmark.py --cfg ./config/face_detection_yunet.yaml
25
+
26
+ # All configs
27
+ python benchmark.py --all
28
+
29
+ # All configs but only fp32 models (--fp32, --fp16, --int8 are available for now)
30
+ python benchmark.py --all --fp32
31
+
32
+ # All configs but exclude some of them (fill with config name keywords, case-insensitive, separated with colons)
33
+ python benchmark.py --all --cfg_exclude wechat
34
+ python benchmark.py --all --cfg_exclude wechat:dasiamrpn
35
+
36
+ # All configs but exclude some of the models (fill with exact model names, case-sensitive, separated with colons)
37
+ python benchmark.py --all --model_exclude license_plate_detection_lpd_yunet_2023mar_int8.onnx:human_segmentation_pphumanseg_2023mar_int8.onnx
38
+
39
+ # All configs with overwritten backend and target (run with --help to get available combinations)
40
+ python benchmark.py --all --cfg_overwrite_backend_target 1
41
  ```
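
For reference, `--cfg_overwrite_backend_target` is an index into the backend/target pairs defined in `benchmark.py` (shown further down in this commit). The sketch below mirrors that mapping; index 1, as used in the last example, selects the CUDA backend and target. This is an illustrative snippet, not part of the script itself.

```python
import cv2 as cv

# Backend/target pairs as defined in benchmark.py; the CLI flag
# --cfg_overwrite_backend_target is an index into this list.
backend_target_pairs = [
    [cv.dnn.DNN_BACKEND_OPENCV, cv.dnn.DNN_TARGET_CPU],        # 0
    [cv.dnn.DNN_BACKEND_CUDA,   cv.dnn.DNN_TARGET_CUDA],       # 1
    [cv.dnn.DNN_BACKEND_CUDA,   cv.dnn.DNN_TARGET_CUDA_FP16],  # 2
    [cv.dnn.DNN_BACKEND_TIMVX,  cv.dnn.DNN_TARGET_NPU],        # 3
    [cv.dnn.DNN_BACKEND_CANN,   cv.dnn.DNN_TARGET_NPU],        # 4
]

def resolve_backend_target(index=0):
    # The benchmark applies the selected pair via Benchmark.setBackendAndTarget.
    backend_id, target_id = backend_target_pairs[index]
    return backend_id, target_id
```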
42
 
43
  **Windows**:
 
52
  $env:PYTHONPATH=$env:PYTHONPATH+";.."
53
  python benchmark.py --cfg ./config/face_detection_yunet.yaml
54
  ```
55
+
56
+ ## Detailed Results
57
+
58
+ Benchmarks are run with the latest `opencv-python==4.7.0.72` and `opencv-contrib-python==4.7.0.72` on the following platforms. Some models are excluded because of support issues.
59
+
60
+ ### Intel 12700K
61
+
62
+ Specs: [details](https://www.intel.com/content/www/us/en/products/sku/134594/intel-core-i712700k-processor-25m-cache-up-to-5-00-ghz/specifications.html)
63
+ - CPU: 8 Performance-cores, 4 Efficient-cores, 20 threads
64
+ - Performance-core: 3.60 GHz base freq, turbo up to 4.90 GHz
65
+ - Efficient-core: 2.70 GHz base freq, turbo up to 3.80 GHz
66
+
67
+ CPU:
68
+
69
+ ```
70
+ $ python benchmark.py --all --model_exclude license_plate_detection_lpd_yunet_2023mar_int8.onnx:human_segmentation_pphumanseg_2023mar_int8.onnx
71
+ Benchmarking ...
72
+ backend=cv.dnn.DNN_BACKEND_OPENCV
73
+ target=cv.dnn.DNN_TARGET_CPU
74
+ mean median min input size model
75
+ 0.58 0.67 0.48 [160, 120] YuNet with ['face_detection_yunet_2022mar.onnx']
76
+ 0.82 0.81 0.48 [160, 120] YuNet with ['face_detection_yunet_2022mar_int8.onnx']
77
+ 6.18 6.33 5.83 [150, 150] SFace with ['face_recognition_sface_2021dec.onnx']
78
+ 7.42 7.42 5.83 [150, 150] SFace with ['face_recognition_sface_2021dec_int8.onnx']
79
+ 3.32 3.46 2.76 [112, 112] FacialExpressionRecog with ['facial_expression_recognition_mobilefacenet_2022july.onnx']
80
+ 4.27 4.22 2.76 [112, 112] FacialExpressionRecog with ['facial_expression_recognition_mobilefacenet_2022july_int8.onnx']
81
+ 4.68 5.04 4.36 [224, 224] MPHandPose with ['handpose_estimation_mediapipe_2023feb.onnx']
82
+ 4.82 4.98 4.36 [224, 224] MPHandPose with ['handpose_estimation_mediapipe_2023feb_int8.onnx']
83
+ 8.20 9.33 6.66 [192, 192] PPHumanSeg with ['human_segmentation_pphumanseg_2023mar.onnx']
84
+ 6.25 7.02 5.49 [224, 224] MobileNet with ['image_classification_mobilenetv1_2022apr.onnx']
85
+ 6.00 6.31 5.49 [224, 224] MobileNet with ['image_classification_mobilenetv2_2022apr.onnx']
86
+ 6.23 5.64 5.49 [224, 224] MobileNet with ['image_classification_mobilenetv1_2022apr_int8.onnx']
87
+ 6.50 6.87 5.49 [224, 224] MobileNet with ['image_classification_mobilenetv2_2022apr_int8.onnx']
88
+ 35.40 36.58 33.63 [224, 224] PPResNet with ['image_classification_ppresnet50_2022jan.onnx']
89
+ 35.79 35.53 33.48 [224, 224] PPResNet with ['image_classification_ppresnet50_2022jan_int8.onnx']
90
+ 8.53 8.59 7.55 [320, 240] LPD_YuNet with ['license_plate_detection_lpd_yunet_2023mar.onnx']
91
+ 65.15 77.44 45.40 [416, 416] NanoDet with ['object_detection_nanodet_2022nov.onnx']
92
+ 58.82 69.99 45.26 [416, 416] NanoDet with ['object_detection_nanodet_2022nov_int8.onnx']
93
+ 137.53 136.70 119.95 [640, 640] YoloX with ['object_detection_yolox_2022nov.onnx']
94
+ 139.60 147.79 119.95 [640, 640] YoloX with ['object_detection_yolox_2022nov_int8.onnx']
95
+ 29.46 42.21 25.82 [1280, 720] DaSiamRPN with ['object_tracking_dasiamrpn_kernel_cls1_2021nov.onnx', 'object_tracking_dasiamrpn_kernel_r1_2021nov.onnx', 'object_tracking_dasiamrpn_model_2021nov.onnx']
96
+ 6.14 6.02 5.91 [192, 192] MPPalmDet with ['palm_detection_mediapipe_2023feb.onnx']
97
+ 8.51 9.89 5.91 [192, 192] MPPalmDet with ['palm_detection_mediapipe_2023feb_int8.onnx']
98
+ 30.87 30.69 29.85 [128, 256] YoutuReID with ['person_reid_youtu_2021nov.onnx']
99
+ 30.77 30.02 27.97 [128, 256] YoutuReID with ['person_reid_youtu_2021nov_int8.onnx']
100
+ 1.35 1.37 1.30 [100, 100] WeChatQRCode with ['detect_2021nov.prototxt', 'detect_2021nov.caffemodel', 'sr_2021nov.prototxt', 'sr_2021nov.caffemodel']
101
+ 75.82 75.37 69.18 [640, 480] DB with ['text_detection_DB_IC15_resnet18_2021sep.onnx']
102
+ 74.80 75.16 69.05 [640, 480] DB with ['text_detection_DB_TD500_resnet18_2021sep.onnx']
103
+ 21.37 24.50 16.04 [1280, 720] CRNN with ['text_recognition_CRNN_CH_2021sep.onnx']
104
+ 23.08 25.14 16.04 [1280, 720] CRNN with ['text_recognition_CRNN_CN_2021nov.onnx']
105
+ 20.43 31.14 11.74 [1280, 720] CRNN with ['text_recognition_CRNN_EN_2021sep.onnx']
106
+ [ WARN:[email protected]] global onnx_graph_simplifier.cpp:804 getMatFromTensor DNN: load FP16 model as FP32 model, and it takes twice the FP16 RAM requirement.
107
+ 20.71 17.95 11.74 [1280, 720] CRNN with ['text_recognition_CRNN_CH_2023feb_fp16.onnx']
108
+ 19.48 25.14 11.74 [1280, 720] CRNN with ['text_recognition_CRNN_EN_2023feb_fp16.onnx']
109
+ 19.38 18.85 11.74 [1280, 720] CRNN with ['text_recognition_CRNN_CH_2022oct_int8.onnx']
110
+ 19.52 25.97 11.74 [1280, 720] CRNN with ['text_recognition_CRNN_CN_2021nov_int8.onnx']
111
+ 18.55 15.29 10.35 [1280, 720] CRNN with ['text_recognition_CRNN_EN_2022oct_int8.onnx']
112
+ ```
113
+
114
+ ### Raspberry Pi 4B
115
+
116
+ Specs: [details](https://www.raspberrypi.com/products/raspberry-pi-4-model-b/specifications/)
117
+ - CPU: Broadcom BCM2711, Quad core Cortex-A72 (ARM v8) 64-bit SoC @ 1.5 GHz.
118
+
119
+ CPU:
120
+
121
+ ```
122
+ $ python benchmark.py --all --model_exclude license_plate_detection_lpd_yunet_2023mar_int8.onnx:human_segmentation_pphumanseg_2023mar_int8.onnx
123
+ Benchmarking ...
124
+ backend=cv.dnn.DNN_BACKEND_OPENCV
125
+ target=cv.dnn.DNN_TARGET_CPU
126
+ mean median min input size model
127
+ 5.45 5.44 5.39 [160, 120] YuNet with ['face_detection_yunet_2022mar.onnx']
128
+ 6.12 6.15 5.39 [160, 120] YuNet with ['face_detection_yunet_2022mar_int8.onnx']
129
+ 78.04 77.96 77.62 [150, 150] SFace with ['face_recognition_sface_2021dec.onnx']
130
+ 91.44 93.03 77.62 [150, 150] SFace with ['face_recognition_sface_2021dec_int8.onnx']
131
+ 32.21 31.86 31.85 [112, 112] FacialExpressionRecog with ['facial_expression_recognition_mobilefacenet_2022july.onnx']
132
+ 38.22 39.27 31.85 [112, 112] FacialExpressionRecog with ['facial_expression_recognition_mobilefacenet_2022july_int8.onnx']
133
+ 43.85 43.76 43.51 [224, 224] MPHandPose with ['handpose_estimation_mediapipe_2023feb.onnx']
134
+ 46.66 47.00 43.51 [224, 224] MPHandPose with ['handpose_estimation_mediapipe_2023feb_int8.onnx']
135
+ 73.29 73.70 72.86 [192, 192] PPHumanSeg with ['human_segmentation_pphumanseg_2023mar.onnx']
136
+ 74.51 87.71 73.83 [224, 224] MobileNet with ['image_classification_mobilenetv1_2022apr.onnx']
137
+ 67.29 68.22 61.55 [224, 224] MobileNet with ['image_classification_mobilenetv2_2022apr.onnx']
138
+ 68.53 61.77 61.55 [224, 224] MobileNet with ['image_classification_mobilenetv1_2022apr_int8.onnx']
139
+ 68.31 72.16 61.55 [224, 224] MobileNet with ['image_classification_mobilenetv2_2022apr_int8.onnx']
140
+ 547.70 547.68 494.91 [224, 224] PPResNet with ['image_classification_ppresnet50_2022jan.onnx']
141
+ 527.14 567.06 465.02 [224, 224] PPResNet with ['image_classification_ppresnet50_2022jan_int8.onnx']
142
+ 192.61 194.08 156.62 [320, 240] LPD_YuNet with ['license_plate_detection_lpd_yunet_2023mar.onnx']
143
+ 248.03 229.41 209.65 [416, 416] NanoDet with ['object_detection_nanodet_2022nov.onnx']
144
+ 246.41 247.64 207.91 [416, 416] NanoDet with ['object_detection_nanodet_2022nov_int8.onnx']
145
+ 1932.97 1941.47 1859.96 [640, 640] YoloX with ['object_detection_yolox_2022nov.onnx']
146
+ 1866.98 1866.50 1746.67 [640, 640] YoloX with ['object_detection_yolox_2022nov_int8.onnx']
147
+ 762.56 738.04 654.25 [1280, 720] DaSiamRPN with ['object_tracking_dasiamrpn_kernel_cls1_2021nov.onnx', 'object_tracking_dasiamrpn_kernel_r1_2021nov.onnx', 'object_tracking_dasiamrpn_model_2021nov.onnx']
148
+ 91.48 91.28 91.15 [192, 192] MPPalmDet with ['palm_detection_mediapipe_2023feb.onnx']
149
+ 115.58 135.17 91.15 [192, 192] MPPalmDet with ['palm_detection_mediapipe_2023feb_int8.onnx']
150
+ 676.15 655.20 636.06 [128, 256] YoutuReID with ['person_reid_youtu_2021nov.onnx']
151
+ 548.93 582.29 443.32 [128, 256] YoutuReID with ['person_reid_youtu_2021nov_int8.onnx']
152
+ 8.18 8.15 8.13 [100, 100] WeChatQRCode with ['detect_2021nov.prototxt', 'detect_2021nov.caffemodel', 'sr_2021nov.prototxt', 'sr_2021nov.caffemodel']
153
+ 2025.09 2046.92 1971.57 [640, 480] DB with ['text_detection_DB_IC15_resnet18_2021sep.onnx']
154
+ 2041.85 2048.24 1971.57 [640, 480] DB with ['text_detection_DB_TD500_resnet18_2021sep.onnx']
155
+ 272.81 285.66 259.93 [1280, 720] CRNN with ['text_recognition_CRNN_CH_2021sep.onnx']
156
+ 293.83 289.93 259.93 [1280, 720] CRNN with ['text_recognition_CRNN_CN_2021nov.onnx']
157
+ 271.57 317.17 223.36 [1280, 720] CRNN with ['text_recognition_CRNN_EN_2021sep.onnx']
158
+ [ WARN:[email protected]] global onnx_graph_simplifier.cpp:804 getMatFromTensor DNN: load FP16 model as FP32 model, and it takes twice the FP16 RAM requirement.
159
+ 266.67 269.64 223.36 [1280, 720] CRNN with ['text_recognition_CRNN_CH_2023feb_fp16.onnx']
160
+ 259.06 239.43 223.36 [1280, 720] CRNN with ['text_recognition_CRNN_EN_2023feb_fp16.onnx']
161
+ 251.39 257.43 221.20 [1280, 720] CRNN with ['text_recognition_CRNN_CH_2022oct_int8.onnx']
162
+ 248.27 253.01 221.20 [1280, 720] CRNN with ['text_recognition_CRNN_CN_2021nov_int8.onnx']
163
+ 239.42 238.72 190.04 [1280, 720] CRNN with ['text_recognition_CRNN_EN_2022oct_int8.onnx']
164
+ ```
165
+
166
+ ### Jetson Nano B01
167
+
168
+ Specs: [details](https://developer.nvidia.com/embedded/jetson-nano-developer-kit)
169
+ - CPU: Quad-core ARM A57 @ 1.43 GHz
170
+ - GPU: 128-core NVIDIA Maxwell
171
+
172
+ CPU:
173
+
174
+ ```
175
+ $ python3 benchmark.py --all --model_exclude license_plate_detection_lpd_yunet_2023mar_int8.onnx:human_segmentation_pphumanseg_2023mar_int8.onnx
176
+ Benchmarking ...
177
+ backend=cv.dnn.DNN_BACKEND_OPENCV
178
+ target=cv.dnn.DNN_TARGET_CPU
179
+ mean median min input size model
180
+ 5.37 5.44 5.27 [160, 120] YuNet with ['face_detection_yunet_2022mar.onnx']
181
+ 6.11 7.99 5.27 [160, 120] YuNet with ['face_detection_yunet_2022mar_int8.onnx']
182
+ 65.14 65.13 64.93 [150, 150] SFace with ['face_recognition_sface_2021dec.onnx']
183
+ 79.33 88.12 64.93 [150, 150] SFace with ['face_recognition_sface_2021dec_int8.onnx']
184
+ 28.19 28.17 28.05 [112, 112] FacialExpressionRecog with ['facial_expression_recognition_mobilefacenet_2022july.onnx']
185
+ 34.85 35.66 28.05 [112, 112] FacialExpressionRecog with ['facial_expression_recognition_mobilefacenet_2022july_int8.onnx']
186
+ 41.02 42.37 40.80 [224, 224] MPHandPose with ['handpose_estimation_mediapipe_2023feb.onnx']
187
+ 44.20 44.39 40.80 [224, 224] MPHandPose with ['handpose_estimation_mediapipe_2023feb_int8.onnx']
188
+ 65.91 65.93 65.68 [192, 192] PPHumanSeg with ['human_segmentation_pphumanseg_2023mar.onnx']
189
+ 68.94 68.95 68.77 [224, 224] MobileNet with ['image_classification_mobilenetv1_2022apr.onnx']
190
+ 62.12 62.24 55.29 [224, 224] MobileNet with ['image_classification_mobilenetv2_2022apr.onnx']
191
+ 66.04 55.58 55.29 [224, 224] MobileNet with ['image_classification_mobilenetv1_2022apr_int8.onnx']
192
+ 65.31 64.86 55.29 [224, 224] MobileNet with ['image_classification_mobilenetv2_2022apr_int8.onnx']
193
+ 376.88 368.22 367.11 [224, 224] PPResNet with ['image_classification_ppresnet50_2022jan.onnx']
194
+ 390.32 385.28 367.11 [224, 224] PPResNet with ['image_classification_ppresnet50_2022jan_int8.onnx']
195
+ 133.15 130.57 129.38 [320, 240] LPD_YuNet with ['license_plate_detection_lpd_yunet_2023mar.onnx']
196
+ 215.57 225.11 212.66 [416, 416] NanoDet with ['object_detection_nanodet_2022nov.onnx']
197
+ 217.37 214.85 212.66 [416, 416] NanoDet with ['object_detection_nanodet_2022nov_int8.onnx']
198
+ 1228.13 1233.90 1219.11 [640, 640] YoloX with ['object_detection_yolox_2022nov.onnx']
199
+ 1257.34 1256.26 1219.11 [640, 640] YoloX with ['object_detection_yolox_2022nov_int8.onnx']
200
+ 466.19 457.89 442.88 [1280, 720] DaSiamRPN with ['object_tracking_dasiamrpn_kernel_cls1_2021nov.onnx', 'object_tracking_dasiamrpn_kernel_r1_2021nov.onnx', 'object_tracking_dasiamrpn_model_2021nov.onnx']
201
+ 69.60 69.69 69.13 [192, 192] MPPalmDet with ['palm_detection_mediapipe_2023feb.onnx']
202
+ 81.65 82.20 69.13 [192, 192] MPPalmDet with ['palm_detection_mediapipe_2023feb_int8.onnx']
203
+ 411.49 417.53 402.57 [128, 256] YoutuReID with ['person_reid_youtu_2021nov.onnx']
204
+ 372.94 370.17 335.95 [128, 256] YoutuReID with ['person_reid_youtu_2021nov_int8.onnx']
205
+ 5.62 5.64 5.55 [100, 100] WeChatQRCode with ['detect_2021nov.prototxt', 'detect_2021nov.caffemodel', 'sr_2021nov.prototxt', 'sr_2021nov.caffemodel']
206
+ 1089.89 1091.85 1071.95 [640, 480] DB with ['text_detection_DB_IC15_resnet18_2021sep.onnx']
207
+ 1089.94 1095.07 1071.95 [640, 480] DB with ['text_detection_DB_TD500_resnet18_2021sep.onnx']
208
+ 274.45 286.03 270.52 [1280, 720] CRNN with ['text_recognition_CRNN_CH_2021sep.onnx']
209
+ 290.82 288.87 270.52 [1280, 720] CRNN with ['text_recognition_CRNN_CN_2021nov.onnx']
210
+ 269.52 311.59 228.47 [1280, 720] CRNN with ['text_recognition_CRNN_EN_2021sep.onnx']
211
+ [ WARN:[email protected]] global onnx_graph_simplifier.cpp:804 getMatFromTensor DNN: load FP16 model as FP32 model, and it takes twice the FP16 RAM requirement.
212
+ 269.66 267.98 228.47 [1280, 720] CRNN with ['text_recognition_CRNN_CH_2023feb_fp16.onnx']
213
+ 261.39 231.92 228.47 [1280, 720] CRNN with ['text_recognition_CRNN_EN_2023feb_fp16.onnx']
214
+ 259.68 249.43 228.47 [1280, 720] CRNN with ['text_recognition_CRNN_CH_2022oct_int8.onnx']
215
+ 260.89 283.44 228.47 [1280, 720] CRNN with ['text_recognition_CRNN_CN_2021nov_int8.onnx']
216
+ 255.61 249.41 222.38 [1280, 720] CRNN with ['text_recognition_CRNN_EN_2022oct_int8.onnx']
217
+ ```
218
+
219
+ GPU (CUDA-FP32):
220
+ ```
221
+ $ python3 benchmark.py --all --fp32 --cfg_exclude wechat --cfg_overwrite_backend_target 1
222
+ Benchmarking ...
223
+ backend=cv.dnn.DNN_BACKEND_CUDA
224
+ target=cv.dnn.DNN_TARGET_CUDA
225
+ mean median min input size model
226
+ 11.22 11.49 9.59 [160, 120] YuNet with ['face_detection_yunet_2022mar.onnx']
227
+ 24.60 25.91 24.16 [150, 150] SFace with ['face_recognition_sface_2021dec.onnx']
228
+ 20.64 24.00 18.88 [112, 112] FacialExpressionRecog with ['facial_expression_recognition_mobilefacenet_2022july.onnx']
229
+ 41.15 41.18 40.95 [224, 224] MPHandPose with ['handpose_estimation_mediapipe_2023feb.onnx']
230
+ 90.86 90.79 84.96 [192, 192] PPHumanSeg with ['human_segmentation_pphumanseg_2023mar.onnx']
231
+ 69.24 69.11 68.87 [224, 224] MobileNet with ['image_classification_mobilenetv1_2022apr.onnx']
232
+ 62.12 62.30 55.28 [224, 224] MobileNet with ['image_classification_mobilenetv2_2022apr.onnx']
233
+ 148.58 153.17 144.61 [224, 224] PPResNet with ['image_classification_ppresnet50_2022jan.onnx']
234
+ 53.50 54.29 51.48 [320, 240] LPD_YuNet with ['license_plate_detection_lpd_yunet_2023mar.onnx']
235
+ 214.99 218.04 212.94 [416, 416] NanoDet with ['object_detection_nanodet_2022nov.onnx']
236
+ 1238.91 1244.87 1227.30 [640, 640] YoloX with ['object_detection_yolox_2022nov.onnx']
237
+ 76.54 76.09 74.51 [1280, 720] DaSiamRPN with ['object_tracking_dasiamrpn_kernel_cls1_2021nov.onnx', 'object_tracking_dasiamrpn_kernel_r1_2021nov.onnx', 'object_tracking_dasiamrpn_model_2021nov.onnx']
238
+ 67.34 67.83 62.38 [192, 192] MPPalmDet with ['palm_detection_mediapipe_2023feb.onnx']
239
+ 126.65 126.63 124.96 [128, 256] YoutuReID with ['person_reid_youtu_2021nov.onnx']
240
+ 303.12 302.80 299.30 [640, 480] DB with ['text_detection_DB_IC15_resnet18_2021sep.onnx']
241
+ 302.58 299.78 297.83 [640, 480] DB with ['text_detection_DB_TD500_resnet18_2021sep.onnx']
242
+ 58.05 62.90 52.47 [1280, 720] CRNN with ['text_recognition_CRNN_CH_2021sep.onnx']
243
+ 59.39 56.82 52.47 [1280, 720] CRNN with ['text_recognition_CRNN_CN_2021nov.onnx']
244
+ 45.60 62.40 21.73 [1280, 720] CRNN with ['text_recognition_CRNN_EN_2021sep.onnx']
245
+ ```
246
+
247
+ GPU (CUDA-FP16):
248
+
249
+ ```
250
+ $ python3 benchmark.py --all --fp32 --cfg_exclude wechat --cfg_overwrite_backend_target 2
251
+ Benchmarking ...
252
+ backend=cv.dnn.DNN_BACKEND_CUDA
253
+ target=cv.dnn.DNN_TARGET_CUDA_FP16
254
+ mean median min input size model
255
+ 26.17 26.40 25.87 [160, 120] YuNet with ['face_detection_yunet_2022mar.onnx']
256
+ 116.07 115.93 112.39 [150, 150] SFace with ['face_recognition_sface_2021dec.onnx']
257
+ 119.85 121.62 114.63 [112, 112] FacialExpressionRecog with ['facial_expression_recognition_mobilefacenet_2022july.onnx']
258
+ 40.94 40.92 40.70 [224, 224] MPHandPose with ['handpose_estimation_mediapipe_2023feb.onnx']
259
+ 99.88 100.49 93.24 [192, 192] PPHumanSeg with ['human_segmentation_pphumanseg_2023mar.onnx']
260
+ 69.00 68.81 68.60 [224, 224] MobileNet with ['image_classification_mobilenetv1_2022apr.onnx']
261
+ 61.93 62.18 55.17 [224, 224] MobileNet with ['image_classification_mobilenetv2_2022apr.onnx']
262
+ 141.11 145.82 136.02 [224, 224] PPResNet with ['image_classification_ppresnet50_2022jan.onnx']
263
+ 364.70 363.48 360.28 [320, 240] LPD_YuNet with ['license_plate_detection_lpd_yunet_2023mar.onnx']
264
+ 215.23 213.49 213.06 [416, 416] NanoDet with ['object_detection_nanodet_2022nov.onnx']
265
+ 1223.32 1248.88 1213.25 [640, 640] YoloX with ['object_detection_yolox_2022nov.onnx']
266
+ 52.91 52.96 50.17 [1280, 720] DaSiamRPN with ['object_tracking_dasiamrpn_kernel_cls1_2021nov.onnx', 'object_tracking_dasiamrpn_kernel_r1_2021nov.onnx', 'object_tracking_dasiamrpn_model_2021nov.onnx']
267
+ 212.86 213.21 210.03 [192, 192] MPPalmDet with ['palm_detection_mediapipe_2023feb.onnx']
268
+ 96.68 94.21 89.24 [128, 256] YoutuReID with ['person_reid_youtu_2021nov.onnx']
269
+ 343.38 344.17 337.62 [640, 480] DB with ['text_detection_DB_IC15_resnet18_2021sep.onnx']
270
+ 344.29 345.07 337.62 [640, 480] DB with ['text_detection_DB_TD500_resnet18_2021sep.onnx']
271
+ 48.91 50.31 45.41 [1280, 720] CRNN with ['text_recognition_CRNN_CH_2021sep.onnx']
272
+ 50.20 49.66 45.41 [1280, 720] CRNN with ['text_recognition_CRNN_CN_2021nov.onnx']
273
+ 39.56 52.56 20.76 [1280, 720] CRNN with ['text_recognition_CRNN_EN_2021sep.onnx']
274
+ ```
275
+
276
+ ### Khadas VIM3
277
+
278
+ Specs: [details](https://www.khadas.com/vim3)
279
+ - (SoC) CPU: Amlogic A311D, quad-core ARM Cortex-A73 @ 2.2 GHz and dual-core Cortex-A53 @ 1.8 GHz
280
+ - NPU: 5 TOPS performance, INT8 inference, up to 1536 MACs. Supports all major deep learning frameworks including TensorFlow and Caffe.
281
+
282
+ CPU:
283
+
284
+ ```
285
+ $ python3 benchmark.py --all --model_exclude license_plate_detection_lpd_yunet_2023mar_int8.onnx:human_segmentation_pphumanseg_2023mar_int8.onnx
286
+ Benchmarking ...
287
+ backend=cv.dnn.DNN_BACKEND_OPENCV
288
+ target=cv.dnn.DNN_TARGET_CPU
289
+ mean median min input size model
290
+ 4.93 4.91 4.83 [160, 120] YuNet with ['face_detection_yunet_2022mar.onnx']
291
+ 5.30 5.31 4.83 [160, 120] YuNet with ['face_detection_yunet_2022mar_int8.onnx']
292
+ 60.02 61.00 57.85 [150, 150] SFace with ['face_recognition_sface_2021dec.onnx']
293
+ 70.27 74.77 57.85 [150, 150] SFace with ['face_recognition_sface_2021dec_int8.onnx']
294
+ 29.36 28.28 27.97 [112, 112] FacialExpressionRecog with ['facial_expression_recognition_mobilefacenet_2022july.onnx']
295
+ 34.66 34.12 27.97 [112, 112] FacialExpressionRecog with ['facial_expression_recognition_mobilefacenet_2022july_int8.onnx']
296
+ 38.60 37.72 36.79 [224, 224] MPHandPose with ['handpose_estimation_mediapipe_2023feb.onnx']
297
+ 41.57 41.91 36.79 [224, 224] MPHandPose with ['handpose_estimation_mediapipe_2023feb_int8.onnx']
298
+ 70.82 72.70 67.14 [192, 192] PPHumanSeg with ['human_segmentation_pphumanseg_2023mar.onnx']
299
+ 64.73 64.22 62.19 [224, 224] MobileNet with ['image_classification_mobilenetv1_2022apr.onnx']
300
+ 58.18 59.29 49.97 [224, 224] MobileNet with ['image_classification_mobilenetv2_2022apr.onnx']
301
+ 59.15 52.27 49.97 [224, 224] MobileNet with ['image_classification_mobilenetv1_2022apr_int8.onnx']
302
+ 57.38 55.13 49.97 [224, 224] MobileNet with ['image_classification_mobilenetv2_2022apr_int8.onnx']
303
+ 385.29 361.27 348.96 [224, 224] PPResNet with ['image_classification_ppresnet50_2022jan.onnx']
304
+ 352.90 395.79 328.06 [224, 224] PPResNet with ['image_classification_ppresnet50_2022jan_int8.onnx']
305
+ 122.17 123.58 119.43 [320, 240] LPD_YuNet with ['license_plate_detection_lpd_yunet_2023mar.onnx']
306
+ 208.25 217.96 195.76 [416, 416] NanoDet with ['object_detection_nanodet_2022nov.onnx']
307
+ 203.04 213.99 161.37 [416, 416] NanoDet with ['object_detection_nanodet_2022nov_int8.onnx']
308
+ 1189.83 1150.85 1138.93 [640, 640] YoloX with ['object_detection_yolox_2022nov.onnx']
309
+ 1137.18 1142.89 1080.23 [640, 640] YoloX with ['object_detection_yolox_2022nov_int8.onnx']
310
+ 428.66 524.98 391.33 [1280, 720] DaSiamRPN with ['object_tracking_dasiamrpn_kernel_cls1_2021nov.onnx', 'object_tracking_dasiamrpn_kernel_r1_2021nov.onnx', 'object_tracking_dasiamrpn_model_2021nov.onnx']
311
+ 66.91 67.09 64.90 [192, 192] MPPalmDet with ['palm_detection_mediapipe_2023feb.onnx']
312
+ 79.42 81.44 64.90 [192, 192] MPPalmDet with ['palm_detection_mediapipe_2023feb_int8.onnx']
313
+ 439.53 431.92 406.03 [128, 256] YoutuReID with ['person_reid_youtu_2021nov.onnx']
314
+ 358.63 379.93 296.32 [128, 256] YoutuReID with ['person_reid_youtu_2021nov_int8.onnx']
315
+ 5.29 5.30 5.21 [100, 100] WeChatQRCode with ['detect_2021nov.prototxt', 'detect_2021nov.caffemodel', 'sr_2021nov.prototxt', 'sr_2021nov.caffemodel']
316
+ 973.75 968.68 954.58 [640, 480] DB with ['text_detection_DB_IC15_resnet18_2021sep.onnx']
317
+ 961.44 959.29 935.29 [640, 480] DB with ['text_detection_DB_TD500_resnet18_2021sep.onnx']
318
+ 202.74 202.73 200.75 [1280, 720] CRNN with ['text_recognition_CRNN_CH_2021sep.onnx']
319
+ 217.07 217.26 200.75 [1280, 720] CRNN with ['text_recognition_CRNN_CN_2021nov.onnx']
320
+ 199.81 231.31 169.27 [1280, 720] CRNN with ['text_recognition_CRNN_EN_2021sep.onnx']
321
+ [ WARN:[email protected]] global onnx_graph_simplifier.cpp:804 getMatFromTensor DNN: load FP16 model as FP32 model, and it takes twice the FP16 RAM requirement.
322
+ 199.73 203.96 169.27 [1280, 720] CRNN with ['text_recognition_CRNN_CH_2023feb_fp16.onnx']
323
+ 192.97 175.68 169.27 [1280, 720] CRNN with ['text_recognition_CRNN_EN_2023feb_fp16.onnx']
324
+ 189.65 189.43 169.27 [1280, 720] CRNN with ['text_recognition_CRNN_CH_2022oct_int8.onnx']
325
+ 188.98 202.49 169.27 [1280, 720] CRNN with ['text_recognition_CRNN_CN_2021nov_int8.onnx']
326
+ 183.49 188.71 149.81 [1280, 720] CRNN with ['text_recognition_CRNN_EN_2022oct_int8.onnx']
327
+ ```
328
+
329
+ NPU (TIMVX):
330
+
331
+ ```
332
+ $ python3 benchmark.py --all --int8 --cfg_overwrite_backend_target 3 --model_exclude license_plate_detection_lpd_yunet_2023mar_int8.onnx:human_segmentation_pphumanseg_2023mar_int8.onnx
333
+ Benchmarking ...
334
+ backend=cv.dnn.DNN_BACKEND_TIMVX
335
+ target=cv.dnn.DNN_TARGET_NPU
336
+ mean median min input size model
337
+ 5.67 5.74 5.59 [160, 120] YuNet with ['face_detection_yunet_2022mar_int8.onnx']
338
+ 76.97 77.86 75.59 [150, 150] SFace with ['face_recognition_sface_2021dec_int8.onnx']
339
+ 40.38 39.41 38.12 [112, 112] FacialExpressionRecog with ['facial_expression_recognition_mobilefacenet_2022july_int8.onnx']
340
+ 44.36 45.77 42.06 [224, 224] MPHandPose with ['handpose_estimation_mediapipe_2023feb_int8.onnx']
341
+ 60.75 62.46 56.34 [224, 224] MobileNet with ['image_classification_mobilenetv1_2022apr_int8.onnx']
342
+ 57.40 58.10 52.11 [224, 224] MobileNet with ['image_classification_mobilenetv2_2022apr_int8.onnx']
343
+ 340.20 347.74 330.70 [224, 224] PPResNet with ['image_classification_ppresnet50_2022jan_int8.onnx']
344
+ 200.50 224.02 160.81 [416, 416] NanoDet with ['object_detection_nanodet_2022nov_int8.onnx']
345
+ 1103.24 1091.76 1059.77 [640, 640] YoloX with ['object_detection_yolox_2022nov_int8.onnx']
346
+ 95.92 102.80 92.77 [192, 192] MPPalmDet with ['palm_detection_mediapipe_2023feb_int8.onnx']
347
+ 307.90 310.52 302.46 [128, 256] YoutuReID with ['person_reid_youtu_2021nov_int8.onnx']
348
+ 178.71 178.87 177.84 [1280, 720] CRNN with ['text_recognition_CRNN_CH_2022oct_int8.onnx']
349
+ 183.51 183.72 177.84 [1280, 720] CRNN with ['text_recognition_CRNN_CN_2021nov_int8.onnx']
350
+ 172.06 189.19 149.19 [1280, 720] CRNN with ['text_recognition_CRNN_EN_2022oct_int8.onnx']
351
+ ```
352
+
353
+ ### Atlas 200 DK
354
+
355
+ Specs: [details_en](https://e.huawei.com/uk/products/cloud-computing-dc/atlas/atlas-200), [details_cn](https://www.hiascend.com/zh/hardware/developer-kit)
356
+ - (SoC) CPU: 8-core Cortex-A55 @ 1.6 GHz (max)
357
+ - NPU: Ascend 310, dual DaVinci AI cores, 22/16/8 TOPS INT8.
358
+
359
+ CPU:
360
+
361
+ ```
362
+ $ python3 benchmark.py --all --cfg_exclude wechat --model_exclude license_plate_detection_lpd_yunet_2023mar_int8.onnx:human_segmentation_pphumanseg_2023mar_int8.onnx
363
+ Benchmarking ...
364
+ backend=cv.dnn.DNN_BACKEND_OPENCV
365
+ target=cv.dnn.DNN_TARGET_CPU
366
+ mean median min input size model
367
+ 8.02 8.07 7.93 [160, 120] YuNet with ['face_detection_yunet_2022mar.onnx']
368
+ 9.44 9.34 7.93 [160, 120] YuNet with ['face_detection_yunet_2022mar_int8.onnx']
369
+ 104.51 112.90 102.07 [150, 150] SFace with ['face_recognition_sface_2021dec.onnx']
370
+ 131.49 147.17 102.07 [150, 150] SFace with ['face_recognition_sface_2021dec_int8.onnx']
371
+ 47.71 57.86 46.48 [112, 112] FacialExpressionRecog with ['facial_expression_recognition_mobilefacenet_2022july.onnx']
372
+ 59.26 59.07 46.48 [112, 112] FacialExpressionRecog with ['facial_expression_recognition_mobilefacenet_2022july_int8.onnx']
373
+ 57.95 58.02 57.30 [224, 224] MPHandPose with ['handpose_estimation_mediapipe_2023feb.onnx']
374
+ 65.52 70.76 57.30 [224, 224] MPHandPose with ['handpose_estimation_mediapipe_2023feb_int8.onnx']
375
+ 107.98 127.65 106.59 [192, 192] PPHumanSeg with ['human_segmentation_pphumanseg_2023mar.onnx']
376
+ 103.96 124.91 102.87 [224, 224] MobileNet with ['image_classification_mobilenetv1_2022apr.onnx']
377
+ 90.46 90.53 76.14 [224, 224] MobileNet with ['image_classification_mobilenetv2_2022apr.onnx']
378
+ 98.40 76.49 76.14 [224, 224] MobileNet with ['image_classification_mobilenetv1_2022apr_int8.onnx']
379
+ 98.06 95.36 76.14 [224, 224] MobileNet with ['image_classification_mobilenetv2_2022apr_int8.onnx']
380
+ 564.69 556.79 537.84 [224, 224] PPResNet with ['image_classification_ppresnet50_2022jan.onnx']
381
+ 621.54 661.56 537.84 [224, 224] PPResNet with ['image_classification_ppresnet50_2022jan_int8.onnx']
382
+ 226.08 216.89 216.07 [320, 240] LPD_YuNet with ['license_plate_detection_lpd_yunet_2023mar.onnx']
383
+ 343.08 346.39 315.99 [416, 416] NanoDet with ['object_detection_nanodet_2022nov.onnx']
384
+ 351.64 346.41 315.99 [416, 416] NanoDet with ['object_detection_nanodet_2022nov_int8.onnx']
385
+ 1995.97 1996.82 1967.76 [640, 640] YoloX with ['object_detection_yolox_2022nov.onnx']
386
+ 2060.87 2055.60 1967.76 [640, 640] YoloX with ['object_detection_yolox_2022nov_int8.onnx']
387
+ 701.08 708.52 685.49 [1280, 720] DaSiamRPN with ['object_tracking_dasiamrpn_kernel_cls1_2021nov.onnx', 'object_tracking_dasiamrpn_kernel_r1_2021nov.onnx', 'object_tracking_dasiamrpn_model_2021nov.onnx']
388
+ 105.23 105.14 105.00 [192, 192] MPPalmDet with ['palm_detection_mediapipe_2023feb.onnx']
389
+ 123.41 125.65 105.00 [192, 192] MPPalmDet with ['palm_detection_mediapipe_2023feb_int8.onnx']
390
+ 631.70 631.81 630.61 [128, 256] YoutuReID with ['person_reid_youtu_2021nov.onnx']
391
+ 595.32 599.48 565.32 [128, 256] YoutuReID with ['person_reid_youtu_2021nov_int8.onnx']
392
+ 1452.55 1453.75 1450.98 [640, 480] DB with ['text_detection_DB_IC15_resnet18_2021sep.onnx']
393
+ 1433.26 1432.08 1409.78 [640, 480] DB with ['text_detection_DB_TD500_resnet18_2021sep.onnx']
394
+ 299.36 299.92 298.75 [1280, 720] CRNN with ['text_recognition_CRNN_CH_2021sep.onnx']
395
+ 329.84 333.32 298.75 [1280, 720] CRNN with ['text_recognition_CRNN_CN_2021nov.onnx']
396
+ 303.65 367.68 262.48 [1280, 720] CRNN with ['text_recognition_CRNN_EN_2021sep.onnx']
397
+ [ WARN:[email protected]] global onnx_graph_simplifier.cpp:804 getMatFromTensor DNN: load FP16 model as FP32 model, and it takes twice the FP16 RAM requirement.
398
+ 299.60 315.91 262.48 [1280, 720] CRNN with ['text_recognition_CRNN_CH_2023feb_fp16.onnx']
399
+ 290.29 263.05 262.48 [1280, 720] CRNN with ['text_recognition_CRNN_EN_2023feb_fp16.onnx']
400
+ 290.41 279.30 262.48 [1280, 720] CRNN with ['text_recognition_CRNN_CH_2022oct_int8.onnx']
401
+ 294.61 295.36 262.48 [1280, 720] CRNN with ['text_recognition_CRNN_CN_2021nov_int8.onnx']
402
+ 289.53 279.60 262.48 [1280, 720] CRNN with ['text_recognition_CRNN_EN_2022oct_int8.onnx']
403
+ ```
404
+
405
+ NPU:
406
+
407
+ ```
408
+ $ python3 benchmark.py --all --fp32 --cfg_exclude wechat:dasiamrpn:crnn --cfg_overwrite_backend_target 4
409
+ Benchmarking ...
410
+ backend=cv.dnn.DNN_BACKEND_CANN
411
+ target=cv.dnn.DNN_TARGET_NPU
412
+ mean median min input size model
413
+ 2.24 2.21 2.19 [160, 120] YuNet with ['face_detection_yunet_2022mar.onnx']
414
+ 2.66 2.66 2.64 [150, 150] SFace with ['face_recognition_sface_2021dec.onnx']
415
+ 2.19 2.19 2.16 [112, 112] FacialExpressionRecog with ['facial_expression_recognition_mobilefacenet_2022july.onnx']
416
+ 6.27 6.22 6.17 [224, 224] MPHandPose with ['handpose_estimation_mediapipe_2023feb.onnx']
417
+ 6.94 6.94 6.85 [192, 192] PPHumanSeg with ['human_segmentation_pphumanseg_2023mar.onnx']
418
+ 5.15 5.13 5.10 [224, 224] MobileNet with ['image_classification_mobilenetv1_2022apr.onnx']
419
+ 5.41 5.42 5.10 [224, 224] MobileNet with ['image_classification_mobilenetv2_2022apr.onnx']
420
+ 6.99 6.99 6.95 [224, 224] PPResNet with ['image_classification_ppresnet50_2022jan.onnx']
421
+ 7.63 7.64 7.43 [320, 240] LPD_YuNet with ['license_plate_detection_lpd_yunet_2023mar.onnx']
422
+ 20.62 22.09 19.16 [416, 416] NanoDet with ['object_detection_nanodet_2022nov.onnx']
423
+ 28.59 28.60 27.91 [640, 640] YoloX with ['object_detection_yolox_2022nov.onnx']
424
+ 5.17 5.26 5.09 [192, 192] MPPalmDet with ['palm_detection_mediapipe_2023feb.onnx']
425
+ 5.58 5.57 5.54 [128, 256] YoutuReID with ['person_reid_youtu_2021nov.onnx']
426
+ 17.15 17.18 16.83 [640, 480] DB with ['text_detection_DB_IC15_resnet18_2021sep.onnx']
427
+ 17.95 18.61 16.83 [640, 480] DB with ['text_detection_DB_TD500_resnet18_2021sep.onnx']
428
  ```
 
benchmark/benchmark.py CHANGED
@@ -20,6 +20,13 @@ backend_target_pairs = [
20
  [cv.dnn.DNN_BACKEND_TIMVX, cv.dnn.DNN_TARGET_NPU],
21
  [cv.dnn.DNN_BACKEND_CANN, cv.dnn.DNN_TARGET_NPU]
22
  ]
23
 
24
  parser = argparse.ArgumentParser("Benchmarks for OpenCV Zoo.")
25
  parser.add_argument('--cfg', '-c', type=str,
@@ -33,9 +40,12 @@ parser.add_argument('--cfg_overwrite_backend_target', type=int, default=-1,
33
  {:d}: TIM-VX + NPU,
34
  {:d}: CANN + NPU
35
  '''.format(*[x for x in range(len(backend_target_pairs))]))
36
- parser.add_argument("--fp32", action="store_true", help="Runs models of float32 precision only.")
37
- parser.add_argument("--fp16", action="store_true", help="Runs models of float16 precision only.")
38
- parser.add_argument("--int8", action="store_true", help="Runs models of int8 precision only.")
39
  args = parser.parse_args()
40
 
41
  def build_from_cfg(cfg, registery, key=None, name=None):
@@ -100,6 +110,7 @@ class Benchmark:
100
  self._target = available_targets[target_id]
101
 
102
  self._benchmark_results = dict()
 
103
 
104
  def setBackendAndTarget(self, backend_id, target_id):
105
  self._backend = backend_id
@@ -110,56 +121,108 @@ class Benchmark:
110
 
111
  for idx, data in enumerate(self._dataloader):
112
  filename, input_data = data[:2]
113
- if filename not in self._benchmark_results:
114
- self._benchmark_results[filename] = dict()
115
  if isinstance(input_data, np.ndarray):
116
  size = [input_data.shape[1], input_data.shape[0]]
117
  else:
118
  size = input_data.getFrameSize()
119
- self._benchmark_results[filename][str(size)] = self._metric.forward(model, *data[1:])
120
 
121
- def printResults(self):
122
- for imgName, results in self._benchmark_results.items():
123
- print(' image: {}'.format(imgName))
124
- total_latency = 0
125
- for key, latency in results.items():
126
- total_latency += latency
127
- print(' {}, latency ({}): {:.4f} ms'.format(key, self._metric.getReduction(), latency))
128
 
129
  if __name__ == '__main__':
130
- assert args.cfg.endswith('yaml'), 'Currently support configs of yaml format only.'
131
- with open(args.cfg, 'r') as f:
132
- cfg = yaml.safe_load(f)
133
-
134
- # Instantiate benchmark
135
- benchmark = Benchmark(**cfg['Benchmark'])
136
-
137
- if args.cfg_overwrite_backend_target >= 0:
138
- backend_id = backend_target_pairs[args.backend_target][0]
139
- target_id = backend_target_pairs[args.backend_target][1]
140
- benchmark.setBackendAndTarget(backend_id, target_id)
141
-
142
- # Instantiate model
143
- model_config = cfg['Model']
144
- model_handler, model_paths = MODELS.get(model_config.pop('name'))
145
-
146
- _model_paths = []
147
- if args.fp32 or args.fp16 or args.int8:
148
- if args.fp32:
149
- _model_paths += model_paths['fp32']
150
- if args.fp16:
151
- _model_paths += model_paths['fp16']
152
- if args.int8:
153
- _model_paths += model_paths['int8']
154
  else:
155
- _model_paths = model_paths['fp32'] + model_paths['fp16'] + model_paths['int8']
156
-
157
- for model_path in _model_paths:
158
- model = model_handler(*model_path, **model_config)
159
- # Format model_path
160
- for i in range(len(model_path)):
161
- model_path[i] = model_path[i].split('/')[-1]
162
- print('Benchmarking {} with {}'.format(model.name, model_path))
163
- # Run benchmark
164
- benchmark.run(model)
165
- benchmark.printResults()
20
  [cv.dnn.DNN_BACKEND_TIMVX, cv.dnn.DNN_TARGET_NPU],
21
  [cv.dnn.DNN_BACKEND_CANN, cv.dnn.DNN_TARGET_NPU]
22
  ]
23
+ backend_target_str_pairs = [
24
+ ["cv.dnn.DNN_BACKEND_OPENCV", "cv.dnn.DNN_TARGET_CPU"],
25
+ ["cv.dnn.DNN_BACKEND_CUDA", "cv.dnn.DNN_TARGET_CUDA"],
26
+ ["cv.dnn.DNN_BACKEND_CUDA", "cv.dnn.DNN_TARGET_CUDA_FP16"],
27
+ ["cv.dnn.DNN_BACKEND_TIMVX", "cv.dnn.DNN_TARGET_NPU"],
28
+ ["cv.dnn.DNN_BACKEND_CANN", "cv.dnn.DNN_TARGET_NPU"]
29
+ ]
30
 
31
  parser = argparse.ArgumentParser("Benchmarks for OpenCV Zoo.")
32
  parser.add_argument('--cfg', '-c', type=str,
 
40
  {:d}: TIM-VX + NPU,
41
  {:d}: CANN + NPU
42
  '''.format(*[x for x in range(len(backend_target_pairs))]))
43
+ parser.add_argument("--cfg_exclude", type=str, help="Configs to be excluded when using --all. Split keywords with colons (:). Not sensitive to upper/lower case.")
44
+ parser.add_argument("--model_exclude", type=str, help="Models to be excluded. Split model names with colons (:). Sensitive to upper/lower case.")
45
+ parser.add_argument("--fp32", action="store_true", help="Benchmark models of float32 precision only.")
46
+ parser.add_argument("--fp16", action="store_true", help="Benchmark models of float16 precision only.")
47
+ parser.add_argument("--int8", action="store_true", help="Benchmark models of int8 precision only.")
48
+ parser.add_argument("--all", action="store_true", help="Benchmark all models")
49
  args = parser.parse_args()
50
 
51
  def build_from_cfg(cfg, registery, key=None, name=None):
 
110
  self._target = available_targets[target_id]
111
 
112
  self._benchmark_results = dict()
113
+ self._benchmark_results_brief = dict()
114
 
115
  def setBackendAndTarget(self, backend_id, target_id):
116
  self._backend = backend_id
 
121
 
122
  for idx, data in enumerate(self._dataloader):
123
  filename, input_data = data[:2]
124
+
 
125
  if isinstance(input_data, np.ndarray):
126
  size = [input_data.shape[1], input_data.shape[0]]
127
  else:
128
  size = input_data.getFrameSize()
 
129
 
130
+ if str(size) not in self._benchmark_results:
131
+ self._benchmark_results[str(size)] = dict()
132
+ self._benchmark_results[str(size)][filename] = self._metric.forward(model, *data[1:])
133
+
134
+ if str(size) not in self._benchmark_results_brief:
135
+ self._benchmark_results_brief[str(size)] = []
136
+ self._benchmark_results_brief[str(size)] += self._benchmark_results[str(size)][filename]
137
+
138
+ def printResults(self, model_name, model_path):
139
+ for imgSize, res in self._benchmark_results_brief.items():
140
+ mean, median, minimum = self._metric.getPerfStats(res)
141
+ print("{:<10.2f} {:<10.2f} {:<10.2f} {:<12} {} with {}".format(
142
+ mean, median, minimum, imgSize, model_name, model_path
143
+ ))
144
 
145
  if __name__ == '__main__':
146
+ cfgs = []
147
+ if args.cfg is not None:
148
+ assert args.cfg.endswith('yaml'), 'Currently support configs of yaml format only.'
149
+ with open(args.cfg, 'r') as f:
150
+ cfg = yaml.safe_load(f)
151
+ cfgs.append(cfg)
152
+ elif args.all:
153
+ excludes = []
154
+ if args.cfg_exclude is not None:
155
+ excludes = args.cfg_exclude.split(":")
156
+
157
+ for cfg_fname in sorted(os.listdir("config")):
158
+ skip_flag = False
159
+ for exc in excludes:
160
+ if exc.lower() in cfg_fname.lower():
161
+ skip_flag = True
162
+ if skip_flag:
163
+ # print("{} is skipped.".format(cfg_fname))
164
+ continue
165
+
166
+ assert cfg_fname.endswith("yaml"), "Currently support yaml configs only."
167
+ with open(os.path.join("config", cfg_fname), "r") as f:
168
+ cfg = yaml.safe_load(f)
169
+ cfgs.append(cfg)
170
  else:
171
+ raise NotImplementedError("Specify either one config or use flag --all for benchmark.")
172
+
173
+ print("Benchmarking ...")
174
+ if args.all:
175
+ backend_target_id = args.cfg_overwrite_backend_target if args.cfg_overwrite_backend_target >= 0 else 0
176
+ backend_str = backend_target_str_pairs[backend_target_id][0]
177
+ target_str = backend_target_str_pairs[backend_target_id][1]
178
+ print("backend={}".format(backend_str))
179
+ print("target={}".format(target_str))
180
+ print("{:<10} {:<10} {:<10} {:<12} {}".format("mean", "median", "min", "input size", "model"))
181
+ for cfg in cfgs:
182
+ # Instantiate benchmark
183
+ benchmark = Benchmark(**cfg['Benchmark'])
184
+
185
+ # Set backend and target
186
+ if args.cfg_overwrite_backend_target >= 0:
187
+ backend_id = backend_target_pairs[args.cfg_overwrite_backend_target][0]
188
+ target_id = backend_target_pairs[args.cfg_overwrite_backend_target][1]
189
+ benchmark.setBackendAndTarget(backend_id, target_id)
190
+
191
+ # Instantiate model
192
+ model_config = cfg['Model']
193
+ model_handler, model_paths = MODELS.get(model_config.pop('name'))
194
+
195
+ _model_paths = []
196
+ if args.fp32 or args.fp16 or args.int8:
197
+ if args.fp32:
198
+ _model_paths += model_paths['fp32']
199
+ if args.fp16:
200
+ _model_paths += model_paths['fp16']
201
+ if args.int8:
202
+ _model_paths += model_paths['int8']
203
+ else:
204
+ _model_paths = model_paths['fp32'] + model_paths['fp16'] + model_paths['int8']
205
+ # filter out excluded models
206
+ excludes = []
207
+ if args.model_exclude is not None:
208
+ excludes = args.model_exclude.split(":")
209
+ _model_paths_excluded = []
210
+ for model_path in _model_paths:
211
+ skip_flag = False
212
+ for mp in model_path:
213
+ for exc in excludes:
214
+ if exc in mp:
215
+ skip_flag = True
216
+ if skip_flag:
217
+ continue
218
+ _model_paths_excluded.append(model_path)
219
+ _model_paths = _model_paths_excluded
220
+
221
+ for model_path in _model_paths:
222
+ model = model_handler(*model_path, **model_config)
223
+ # Format model_path
224
+ for i in range(len(model_path)):
225
+ model_path[i] = model_path[i].split('/')[-1]
226
+ # Run benchmark
227
+ benchmark.run(model)
228
+ benchmark.printResults(model.name, model_path)
benchmark/config/face_detection_yunet.yaml CHANGED
@@ -6,11 +6,9 @@ Benchmark:
6
  files: ["group.jpg", "concerts.jpg", "dance.jpg"]
7
  sizes: # [[w1, h1], ...], Omit to run at original scale
8
  - [160, 120]
9
- - [640, 480]
10
  metric:
11
  warmup: 30
12
  repeat: 10
13
- reduction: "median"
14
  backend: "default"
15
  target: "cpu"
16
 
 
6
  files: ["group.jpg", "concerts.jpg", "dance.jpg"]
7
  sizes: # [[w1, h1], ...], Omit to run at original scale
8
  - [160, 120]
 
9
  metric:
10
  warmup: 30
11
  repeat: 10
 
12
  backend: "default"
13
  target: "cpu"
14
 
benchmark/config/face_recognition_sface.yaml CHANGED
@@ -7,7 +7,6 @@ Benchmark:
7
  metric: # 'sizes' is omitted since this model requires input of fixed size
8
  warmup: 30
9
  repeat: 10
10
- reduction: "median"
11
  backend: "default"
12
  target: "cpu"
13
 
 
7
  metric: # 'sizes' is omitted since this model requires input of fixed size
8
  warmup: 30
9
  repeat: 10
 
10
  backend: "default"
11
  target: "cpu"
12
 
benchmark/config/facial_expression_recognition.yaml CHANGED
@@ -7,7 +7,6 @@ Benchmark:
7
  metric: # 'sizes' is omitted since this model requires input of fixed size
8
  warmup: 30
9
  repeat: 10
10
- reduction: "median"
11
  backend: "default"
12
  target: "cpu"
13
 
 
7
  metric: # 'sizes' is omitted since this model requires input of fixed size
8
  warmup: 30
9
  repeat: 10
 
10
  backend: "default"
11
  target: "cpu"
12
 
benchmark/config/handpose_estimation_mediapipe.yaml CHANGED
@@ -9,7 +9,6 @@ Benchmark:
9
  metric:
10
  warmup: 30
11
  repeat: 10
12
- reduction: "median"
13
  backend: "default"
14
  target: "cpu"
15
 
 
9
  metric:
10
  warmup: 30
11
  repeat: 10
 
12
  backend: "default"
13
  target: "cpu"
14
 
benchmark/config/human_segmentation_pphumanseg.yaml CHANGED
@@ -9,7 +9,6 @@ Benchmark:
9
  metric:
10
  warmup: 30
11
  repeat: 10
12
- reduction: "median"
13
  backend: "default"
14
  target: "cpu"
15
 
 
9
  metric:
10
  warmup: 30
11
  repeat: 10
 
12
  backend: "default"
13
  target: "cpu"
14
 
benchmark/config/image_classification_mobilenet.yaml CHANGED
@@ -10,7 +10,6 @@ Benchmark:
10
  metric:
11
  warmup: 30
12
  repeat: 10
13
- reduction: "median"
14
  backend: "default"
15
  target: "cpu"
16
 
 
10
  metric:
11
  warmup: 30
12
  repeat: 10
 
13
  backend: "default"
14
  target: "cpu"
15
 
benchmark/config/image_classification_ppresnet.yaml CHANGED
@@ -10,7 +10,6 @@ Benchmark:
10
  metric:
11
  warmup: 30
12
  repeat: 10
13
- reduction: "median"
14
  backend: "default"
15
  target: "cpu"
16
 
 
10
  metric:
11
  warmup: 30
12
  repeat: 10
 
13
  backend: "default"
14
  target: "cpu"
15
 
benchmark/config/license_plate_detection_yunet.yaml CHANGED
@@ -9,7 +9,6 @@ Benchmark:
9
  metric:
10
  warmup: 30
11
  repeat: 10
12
- reduction: "median"
13
  backend: "default"
14
  target: "cpu"
15
 
 
9
  metric:
10
  warmup: 30
11
  repeat: 10
 
12
  backend: "default"
13
  target: "cpu"
14
 
benchmark/config/object_detection_nanodet.yaml CHANGED
@@ -9,7 +9,6 @@ Benchmark:
9
  metric:
10
  warmup: 30
11
  repeat: 10
12
- reduction: "median"
13
  backend: "default"
14
  target: "cpu"
15
 
 
benchmark/config/object_detection_yolox.yaml CHANGED
@@ -9,7 +9,6 @@ Benchmark:
   metric:
     warmup: 30
     repeat: 10
-    reduction: "median"
   backend: "default"
   target: "cpu"
 
benchmark/config/object_tracking_dasiamrpn.yaml CHANGED
@@ -7,7 +7,6 @@ Benchmark:
     files: ["throw_cup.mp4"]
   metric:
     type: "Tracking"
-    reduction: "gmean"
   backend: "default"
   target: "cpu"
 
benchmark/config/palm_detection_mediapipe.yaml CHANGED
@@ -9,7 +9,6 @@ Benchmark:
   metric:
     warmup: 30
     repeat: 10
-    reduction: "median"
   backend: "default"
   target: "cpu"
 
benchmark/config/person_reid_youtureid.yaml CHANGED
@@ -8,7 +8,6 @@ Benchmark:
   metric:
     warmup: 30
     repeat: 10
-    reduction: "median"
   backend: "default"
   target: "cpu"
 
benchmark/config/qrcode_wechatqrcode.yaml CHANGED
@@ -6,11 +6,9 @@ Benchmark:
     files: ["opencv.png", "opencv_zoo.png"]
     sizes:
       - [100, 100]
-      - [300, 300]
   metric:
     warmup: 30
     repeat: 10
-    reduction: "median"
   backend: "default"
   target: "cpu"
 
benchmark/config/text_detection_db.yaml CHANGED
@@ -9,7 +9,6 @@ Benchmark:
   metric:
     warmup: 30
     repeat: 10
-    reduction: "median"
   backend: "default"
   target: "cpu"
 
benchmark/config/text_recognition_crnn.yaml CHANGED
@@ -7,7 +7,6 @@ Benchmark:
   metric: # 'sizes' is omitted since this model requires input of fixed size
     warmup: 30
     repeat: 10
-    reduction: "median"
   backend: "default"
   target: "cpu"
 
benchmark/utils/metrics/base.py CHANGED
@@ -21,4 +21,4 @@ class Base(BaseMetric):
             model.infer(img)
             self._timer.stop()
 
-        return self._getResult()
+        return self._timer.getRecords()
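With this change the image metrics return the raw list of per-run timings instead of a single reduced value; aggregation happens later in `getPerfStats`. A small sketch of that contract, using a stand-in timer rather than the repository's `Timer` class:

```python
import time

# Stand-in timer with the same record-keeping idea as benchmark/utils (not the real Timer).
class _SketchTimer:
    def __init__(self):
        self._records = []
    def start(self):
        self._tick = time.perf_counter()
    def stop(self):
        self._records.append((time.perf_counter() - self._tick) * 1000)  # milliseconds
    def getRecords(self):
        return self._records

timer = _SketchTimer()
for _ in range(10):          # plays the role of the repeat loop in forward()
    timer.start()
    sum(range(100000))       # plays the role of model.infer(img)
    timer.stop()

records = timer.getRecords() # what forward() now returns to the caller
print(len(records), round(min(records), 3))
```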
benchmark/utils/metrics/base_metric.py CHANGED
@@ -6,7 +6,6 @@ class BaseMetric:
     def __init__(self, **kwargs):
         self._warmup = kwargs.pop('warmup', 3)
         self._repeat = kwargs.pop('repeat', 10)
-        self._reduction = kwargs.pop('reduction', 'median')
 
         self._timer = Timer()
 
@@ -20,8 +19,8 @@
         else:
             return records[mid]
 
-    def _calcGMean(self, records, drop_largest=3):
-        ''' Return the geometric mean of records after drop the first drop_largest
+    def _calcMean(self, records, drop_largest=1):
+        ''' Return the mean of records after dropping drop_largest
         '''
         l = len(records)
         if l <= drop_largest:
@@ -29,17 +28,14 @@
         records_sorted = sorted(records, reverse=True)
         return sum(records_sorted[drop_largest:]) / (l - drop_largest)
 
-    def _getResult(self):
-        records = self._timer.getRecords()
-        if self._reduction == 'median':
-            return self._calcMedian(records)
-        elif self._reduction == 'gmean':
-            return self._calcGMean(records)
-        else:
-            raise NotImplementedError('Reduction {} is not supported'.format(self._reduction))
-
-    def getReduction(self):
-        return self._reduction
+    def _calcMin(self, records):
+        return min(records)
+
+    def getPerfStats(self, records):
+        mean = self._calcMean(records, int(len(records) / 10))
+        median = self._calcMedian(records)
+        minimum = self._calcMin(records)
+        return [mean, median, minimum]
 
     def forward(self, model, *args, **kwargs):
-        raise NotImplementedError('Not implemented')
+        raise NotImplementedError('Not implemented')
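To make the new aggregation concrete, here is an illustrative re-implementation applied to a made-up set of latencies; the authoritative logic is `getPerfStats` above, which averages after dropping the slowest ~10% of runs and also reports the median and the fastest run.

```python
# Illustrative only; mirrors the getPerfStats/_calcMean/_calcMedian/_calcMin logic above.
def perf_stats(records):
    drop_largest = int(len(records) / 10)              # slowest ~10% excluded from the mean
    slow_first = sorted(records, reverse=True)
    mean = sum(slow_first[drop_largest:]) / (len(records) - drop_largest)

    fast_first = sorted(records)
    mid = len(fast_first) // 2
    if len(fast_first) % 2 == 0:
        median = (fast_first[mid] + fast_first[mid - 1]) / 2
    else:
        median = fast_first[mid]

    return [mean, median, min(records)]

latencies_ms = [12.0, 10.5, 10.7, 10.6, 55.3, 10.4, 10.8, 10.5, 10.6, 10.7]
print(perf_stats(latencies_ms))  # mean without the 55.3 ms outlier, then median, then fastest run
```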
benchmark/utils/metrics/detection.py CHANGED
@@ -26,4 +26,4 @@ class Detection(BaseMetric):
             model.infer(img)
             self._timer.stop()
 
-        return self._getResult()
+        return self._timer.getRecords()
benchmark/utils/metrics/recognition.py CHANGED
@@ -28,4 +28,4 @@ class Recognition(BaseMetric):
             model.infer(img, None)
             self._timer.stop()
 
-        return self._getResult()
+        return self._timer.getRecords()
benchmark/utils/metrics/tracking.py CHANGED
@@ -8,8 +8,8 @@ class Tracking(BaseMetric):
     def __init__(self, **kwargs):
         super().__init__(**kwargs)
 
-        if self._warmup or self._repeat:
-            print('warmup and repeat in metric for tracking do not function.')
+        # if self._warmup or self._repeat:
+        #     print('warmup and repeat in metric for tracking do not function.')
 
     def forward(self, model, *args, **kwargs):
         stream, first_frame, rois = args
@@ -23,4 +23,4 @@ class Tracking(BaseMetric):
             model.infer(frame)
             self._timer.stop()
 
-        return self._getResult()
+        return self._timer.getRecords()
models/handpose_estimation_mediapipe/mp_handpose.py CHANGED
@@ -28,8 +28,8 @@ class MPHandPose:
         return self.__class__.__name__
 
     def setBackendAndTarget(self, backendId, targetId):
-        self._backendId = backendId
-        self._targetId = targetId
+        self.backend_id = backendId
+        self.target_id = targetId
         self.model.setPreferableBackend(self.backend_id)
         self.model.setPreferableTarget(self.target_id)
 
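The same one-line fix reappears in the MobileNet, NanoDet and YoloX wrappers below: the old code wrote the new IDs into `self._backendId`/`self._targetId` but then read `self.backend_id`/`self.target_id` (or `self.backendId`/`self.targetId`), so switching backend or target from the benchmark never took effect. A stripped-down illustration with a stand-in class, not the actual MPHandPose:

```python
# Stand-in class; illustrates why the old assignment had no effect.
class _Wrapper:
    def __init__(self):
        self.backend_id = 0          # the attributes actually used later
        self.target_id = 0

    def set_old(self, backendId, targetId):
        self._backendId = backendId  # writes brand-new attributes that nothing reads
        self._targetId = targetId
        return self.backend_id, self.target_id

    def set_new(self, backendId, targetId):
        self.backend_id = backendId  # updates the attributes passed to setPreferableBackend/Target
        self.target_id = targetId
        return self.backend_id, self.target_id

w = _Wrapper()
print(w.set_old(5, 6))  # (0, 0) -- the bug: backend/target unchanged
print(w.set_new(5, 6))  # (5, 6) -- the fix
```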
 
models/image_classification_mobilenet/mobilenet.py CHANGED
@@ -34,8 +34,8 @@ class MobileNet:
         return self.__class__.__name__
 
     def setBackendAndTarget(self, backendId, targetId):
-        self._backendId = backendId
-        self._targetId = targetId
+        self.backend_id = backendId
+        self.target_id = targetId
         self.model.setPreferableBackend(self.backend_id)
         self.model.setPreferableTarget(self.target_id)
 
models/object_detection_nanodet/nanodet.py CHANGED
@@ -38,8 +38,8 @@ class NanoDet:
         return self.__class__.__name__
 
     def setBackendAndTarget(self, backendId, targetId):
-        self._backendId = backendId
-        self._targetId = targetId
+        self.backend_id = backendId
+        self.target_id = targetId
         self.net.setPreferableBackend(self.backend_id)
         self.net.setPreferableTarget(self.target_id)
 
models/object_detection_yolox/yolox.py CHANGED
@@ -24,8 +24,8 @@ class YoloX:
         return self.__class__.__name__
 
     def setBackendAndTarget(self, backendId, targetId):
-        self._backendId = backendId
-        self._targetId = targetId
+        self.backendId = backendId
+        self.targetId = targetId
        self.net.setPreferableBackend(self.backendId)
        self.net.setPreferableTarget(self.targetId)
 