Update benchmark and results for CANN backend (#137)

* update benchmark and results
* update lpd_yunet & nanodet benchmark results
* update results for PP-HumanSeg, MobileNet-V2, MP-HandPose
* add some notes

Files changed:

- README.md (+28 -21)
- benchmark/benchmark.py (+5 -0)

README.md
````diff
@@ -4,6 +4,12 @@ A zoo for models tuned for OpenCV DNN with benchmarks on different platforms.
 
 Guidelines:
 
+- Install latest `opencv-python`:
+  ```shell
+  python3 -m pip install opencv-python
+  # Or upgrade to latest version
+  python3 -m pip install --upgrade opencv-python
+  ```
 - Clone this repo to download all models and demo scripts:
   ```shell
   # Install git-lfs from https://git-lfs.github.com/
@@ -15,27 +21,27 @@ Guidelines:
 
 ## Models & Benchmark Results
 
-| Model | Task | Input Size | INTEL-CPU (ms) | RPI-CPU (ms) | JETSON-GPU (ms) | KV3-NPU (ms) | D1-CPU (ms) |
-| ----- | ---- | ---------- | -------------- | ------------ | --------------- | ------------ | ----------- |
-| [YuNet](./models/face_detection_yunet) | Face Detection | 160x120 | 1.45 | 6.22 | 12.18 | 4.04 | 86.69 |
-| [SFace](./models/face_recognition_sface) | Face Recognition | 112x112 | 8.65 | 99.20 | 24.88 | 46.25 | --- |
-| [FER](./models/facial_expression_recognition/) | Facial Expression Recognition | 112x112 | 4.43 | 49.86 | 31.07 | 29.80 | --- |
-| [LPD-YuNet](./models/license_plate_detection_yunet/) | License Plate Detection | 320x240 | --- | 168.03 | 56.12 | 29.53 | --- |
-| [YOLOX](./models/object_detection_yolox/) | Object Detection | 640x640 | 176.68 | 1496.70 | 388.95 | 420.98 | --- |
-| [NanoDet](./models/object_detection_nanodet/) | Object Detection | 416x416 | 157.91 | 220.36 | 64.94 | 116.64 | --- |
-| [DB-IC15](./models/text_detection_db) | Text Detection | 640x480 | 142.91 | 2835.91 | 208.41 | --- | --- |
-| [DB-TD500](./models/text_detection_db) | Text Detection | 640x480 | 142.91 | 2841.71 | 210.51 | --- | --- |
-| [CRNN-EN](./models/text_recognition_crnn) | Text Recognition | 100x32 | 50.21 | 234.32 | 196.15 | 125.30 | --- |
-| [CRNN-CN](./models/text_recognition_crnn) | Text Recognition | 100x32 | 73.52 | 322.16 | 239.76 | 166.79 | --- |
-| [PP-ResNet](./models/image_classification_ppresnet) | Image Classification | 224x224 | 56.05 | 602.58 | 98.64 | 75.45 | --- |
-| [MobileNet-V1](./models/image_classification_mobilenet) | Image Classification | 224x224 | 9.04 | 92.25 | 33.18 | 145.66\* | --- |
-| [MobileNet-V2](./models/image_classification_mobilenet) | Image Classification | 224x224 | 8.86 | 74.03 | 31.92 | 146.31\* | --- |
-| [PP-HumanSeg](./models/human_segmentation_pphumanseg) | Human Segmentation | 192x192 | 19.92 | 105.32 | 67.97 | 74.77 | --- |
-| [WeChatQRCode](./models/qrcode_wechatqrcode) | QR Code Detection and Parsing | 100x100 | 7.04 | 37.68 | --- | --- | --- |
-| [DaSiamRPN](./models/object_tracking_dasiamrpn) | Object Tracking | 1280x720 | 36.15 | 705.48 | 76.82 | --- | --- |
-| [YoutuReID](./models/person_reid_youtureid) | Person Re-Identification | 128x256 | 35.81 | 521.98 | 90.07 | 44.61 | --- |
-| [MP-PalmDet](./models/palm_detection_mediapipe) | Palm Detection | 192x192 | 11.09 | 63.79 | 83.20 | 33.81 | --- |
-| [MP-HandPose](./models/handpose_estimation_mediapipe) | Hand Pose Estimation | 224x224 | 4.28 | 36.19 | 40.10 | 19.47 | --- |
+| Model | Task | Input Size | INTEL-CPU (ms) | RPI-CPU (ms) | JETSON-GPU (ms) | KV3-NPU (ms) | Ascend-310 (ms) | D1-CPU (ms) |
+| ----- | ---- | ---------- | -------------- | ------------ | --------------- | ------------ | --------------- | ----------- |
+| [YuNet](./models/face_detection_yunet) | Face Detection | 160x120 | 1.45 | 6.22 | 12.18 | 4.04 | 1.73 | 86.69 |
+| [SFace](./models/face_recognition_sface) | Face Recognition | 112x112 | 8.65 | 99.20 | 24.88 | 46.25 | 23.17 | --- |
+| [FER](./models/facial_expression_recognition/) | Facial Expression Recognition | 112x112 | 4.43 | 49.86 | 31.07 | 29.80 | 10.12 | --- |
+| [LPD-YuNet](./models/license_plate_detection_yunet/) | License Plate Detection | 320x240 | --- | 168.03 | 56.12 | 29.53 | 8.70 | --- |
+| [YOLOX](./models/object_detection_yolox/) | Object Detection | 640x640 | 176.68 | 1496.70 | 388.95 | 420.98 | 29.10 | --- |
+| [NanoDet](./models/object_detection_nanodet/) | Object Detection | 416x416 | 157.91 | 220.36 | 64.94 | 116.64 | 35.97 | --- |
+| [DB-IC15](./models/text_detection_db) | Text Detection | 640x480 | 142.91 | 2835.91 | 208.41 | --- | 229.74 | --- |
+| [DB-TD500](./models/text_detection_db) | Text Detection | 640x480 | 142.91 | 2841.71 | 210.51 | --- | 247.29 | --- |
+| [CRNN-EN](./models/text_recognition_crnn) | Text Recognition | 100x32 | 50.21 | 234.32 | 196.15 | 125.30 | 101.03 | --- |
+| [CRNN-CN](./models/text_recognition_crnn) | Text Recognition | 100x32 | 73.52 | 322.16 | 239.76 | 166.79 | 136.41 | --- |
+| [PP-ResNet](./models/image_classification_ppresnet) | Image Classification | 224x224 | 56.05 | 602.58 | 98.64 | 75.45 | 6.99 | --- |
+| [MobileNet-V1](./models/image_classification_mobilenet) | Image Classification | 224x224 | 9.04 | 92.25 | 33.18 | 145.66\* | 5.25 | --- |
+| [MobileNet-V2](./models/image_classification_mobilenet) | Image Classification | 224x224 | 8.86 | 74.03 | 31.92 | 146.31\* | 5.82 | --- |
+| [PP-HumanSeg](./models/human_segmentation_pphumanseg) | Human Segmentation | 192x192 | 19.92 | 105.32 | 67.97 | 74.77 | 7.07 | --- |
+| [WeChatQRCode](./models/qrcode_wechatqrcode) | QR Code Detection and Parsing | 100x100 | 7.04 | 37.68 | --- | --- | --- | --- |
+| [DaSiamRPN](./models/object_tracking_dasiamrpn) | Object Tracking | 1280x720 | 36.15 | 705.48 | 76.82 | --- | --- | --- |
+| [YoutuReID](./models/person_reid_youtureid) | Person Re-Identification | 128x256 | 35.81 | 521.98 | 90.07 | 44.61 | 5.69 | --- |
+| [MP-PalmDet](./models/palm_detection_mediapipe) | Palm Detection | 192x192 | 11.09 | 63.79 | 83.20 | 33.81 | 21.59 | --- |
+| [MP-HandPose](./models/handpose_estimation_mediapipe) | Hand Pose Estimation | 224x224 | 4.28 | 36.19 | 40.10 | 19.47 | 6.02 | --- |
 
 \*: Models are quantized in per-channel mode, which run slower than per-tensor quantized models on NPU.
 
@@ -45,6 +51,7 @@ Hardware Setup:
 - `RPI-CPU`: [Raspberry Pi 4B](https://www.raspberrypi.com/products/raspberry-pi-4-model-b/specifications/), Broadcom BCM2711, Quad core Cortex-A72 (ARM v8) 64-bit SoC @ 1.5GHz.
 - `JETSON-GPU`: [NVIDIA Jetson Nano B01](https://developer.nvidia.com/embedded/jetson-nano-developer-kit), 128-core NVIDIA Maxwell GPU.
 - `KV3-NPU`: [Khadas VIM3](https://www.khadas.com/vim3), 5TOPS Performance. Benchmarks are done using **quantized** models. You will need to compile OpenCV with TIM-VX following [this guide](https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU) to run benchmarks. The test results use the `per-tensor` quantization model by default.
+- `Ascend-310`: [Ascend 310](https://e.huawei.com/uk/products/cloud-computing-dc/atlas/ascend-310), 22 TOPS@INT8. Benchmarks are done on [Atlas 200 DK AI Developer Kit](https://e.huawei.com/in/products/cloud-computing-dc/atlas/atlas-200). Get the latest OpenCV source code and build following [this guide](https://github.com/opencv/opencv/wiki/Huawei-CANN-Backend) to enable CANN backend.
 - `D1-CPU`: [Allwinner D1](https://d1.docs.aw-ol.com/en), [Xuantie C906 CPU](https://www.t-head.cn/product/C906?spm=a2ouz.12986968.0.0.7bfc1384auGNPZ) (RISC-V, RVV 0.7.1) @ 1.0GHz, 1 core. YuNet is supported for now. Visit [here](https://github.com/fengyuentau/opencv_zoo_cpp) for more details.
 
 ***Important Notes***:
````
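The `Ascend-310` entry points to the CANN build guide for the authoritative steps. As a rough sketch only: the `WITH_CANN` CMake switch and its availability in a recent OpenCV checkout are assumptions here, so follow the linked wiki page rather than treating this as a recipe.

```shell
# Sketch only -- see https://github.com/opencv/opencv/wiki/Huawei-CANN-Backend
# for the authoritative steps. Assumes the CANN toolkit is already installed
# and that this OpenCV checkout exposes the WITH_CANN CMake option.
git clone https://github.com/opencv/opencv.git
cd opencv && mkdir build && cd build
cmake -D WITH_CANN=ON -D CMAKE_BUILD_TYPE=Release ..
cmake --build . -j"$(nproc)"
```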
benchmark/benchmark.py

```diff
@@ -77,6 +77,11 @@ class Benchmark:
             available_targets['npu'] = cv.dnn.DNN_TARGET_NPU
         except:
             print('OpenCV is not compiled with TIM-VX backend enabled. See https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more details on how to enable TIM-VX backend.')
+        try:
+            available_backends['cann'] = cv.dnn.DNN_BACKEND_CANN
+            available_targets['npu'] = cv.dnn.DNN_TARGET_NPU
+        except:
+            print('OpenCV is not compiled with CANN backend enabled. See https://github.com/opencv/opencv/wiki/Huawei-CANN-Backend for more details on how to enable CANN backend.')
 
         self._backend = available_backends[backend_id]
         self._target = available_targets[target_id]
```
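The added block mirrors the existing TIM-VX probe: read the backend constant, register it on success, and print a pointer to the build guide otherwise (the bare `except` catches the `AttributeError` raised when the constant is missing from the build). A minimal, self-contained sketch of this guarded-registration pattern, with a stub standing in for `cv.dnn` so it runs without OpenCV (`register_backend` and `FakeDnn` are illustrative names, not part of the repo):

```python
def register_backend(dnn, const_name, key, registry):
    """Register a DNN backend constant under `key` if this build exposes it."""
    try:
        registry[key] = getattr(dnn, const_name)
        return True
    except AttributeError:
        print(f'{const_name} is not available in this build.')
        return False

# Stand-in for `cv.dnn` in a build compiled without the CANN backend;
# the attribute value is illustrative, not OpenCV's real constant.
class FakeDnn:
    DNN_BACKEND_OPENCV = 3

available_backends = {}
register_backend(FakeDnn, 'DNN_BACKEND_OPENCV', 'opencv', available_backends)  # registered
register_backend(FakeDnn, 'DNN_BACKEND_CANN', 'cann', available_backends)      # falls back with a message
```

With a real `cv.dnn`, a missing backend simply stays out of `available_backends`, so a later lookup by `backend_id` fails loudly instead of silently picking the wrong backend.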