ytfeng commited on
Commit
584bcfa
·
1 Parent(s): b57baee

Update benchmark and results for CANN backend (#137)

Browse files

* update benchmark and results

* update lpd_yunet & nanodet benchmark results

* update results for PP-HumanSeg, MobileNet-V2, MP-HandPose

* add some notes

Files changed (2) hide show
  1. README.md +28 -21
  2. benchmark/benchmark.py +5 -0
README.md CHANGED
@@ -4,6 +4,12 @@ A zoo for models tuned for OpenCV DNN with benchmarks on different platforms.
4
 
5
  Guidelines:
6
 
 
 
 
 
 
 
7
  - Clone this repo to download all models and demo scripts:
8
  ```shell
9
  # Install git-lfs from https://git-lfs.github.com/
@@ -15,27 +21,27 @@ Guidelines:
15
 
16
  ## Models & Benchmark Results
17
 
18
- | Model | Task | Input Size | INTEL-CPU (ms) | RPI-CPU (ms) | JETSON-GPU (ms) | KV3-NPU (ms) | D1-CPU (ms) |
19
- | ------------------------------------------------------- | ----------------------------- | ---------- | -------------- | ------------ | --------------- | ------------ | ----------- |
20
- | [YuNet](./models/face_detection_yunet) | Face Detection | 160x120 | 1.45 | 6.22 | 12.18 | 4.04 | 86.69 |
21
- | [SFace](./models/face_recognition_sface) | Face Recognition | 112x112 | 8.65 | 99.20 | 24.88 | 46.25 | --- |
22
- | [FER](./models/facial_expression_recognition/) | Facial Expression Recognition | 112x112 | 4.43 | 49.86 | 31.07 | 29.80 | --- |
23
- | [LPD-YuNet](./models/license_plate_detection_yunet/) | License Plate Detection | 320x240 | --- | 168.03 | 56.12 | 29.53 | --- |
24
- | [YOLOX](./models/object_detection_yolox/) | Object Detection | 640x640 | 176.68 | 1496.70 | 388.95 | 420.98 | --- |
25
- | [NanoDet](./models/object_detection_nanodet/) | Object Detection | 416x416 | 157.91 | 220.36 | 64.94 | 116.64 | --- |
26
- | [DB-IC15](./models/text_detection_db) | Text Detection | 640x480 | 142.91 | 2835.91 | 208.41 | --- | --- |
27
- | [DB-TD500](./models/text_detection_db) | Text Detection | 640x480 | 142.91 | 2841.71 | 210.51 | --- | --- |
28
- | [CRNN-EN](./models/text_recognition_crnn) | Text Recognition | 100x32 | 50.21 | 234.32 | 196.15 | 125.30 | --- |
29
- | [CRNN-CN](./models/text_recognition_crnn) | Text Recognition | 100x32 | 73.52 | 322.16 | 239.76 | 166.79 | --- |
30
- | [PP-ResNet](./models/image_classification_ppresnet) | Image Classification | 224x224 | 56.05 | 602.58 | 98.64 | 75.45 | --- |
31
- | [MobileNet-V1](./models/image_classification_mobilenet) | Image Classification | 224x224 | 9.04 | 92.25 | 33.18 | 145.66\* | --- |
32
- | [MobileNet-V2](./models/image_classification_mobilenet) | Image Classification | 224x224 | 8.86 | 74.03 | 31.92 | 146.31\* | --- |
33
- | [PP-HumanSeg](./models/human_segmentation_pphumanseg) | Human Segmentation | 192x192 | 19.92 | 105.32 | 67.97 | 74.77 | --- |
34
- | [WeChatQRCode](./models/qrcode_wechatqrcode) | QR Code Detection and Parsing | 100x100 | 7.04 | 37.68 | --- | --- | --- |
35
- | [DaSiamRPN](./models/object_tracking_dasiamrpn) | Object Tracking | 1280x720 | 36.15 | 705.48 | 76.82 | --- | --- |
36
- | [YoutuReID](./models/person_reid_youtureid) | Person Re-Identification | 128x256 | 35.81 | 521.98 | 90.07 | 44.61 | --- |
37
- | [MP-PalmDet](./models/palm_detection_mediapipe) | Palm Detection | 192x192 | 11.09 | 63.79 | 83.20 | 33.81 | --- |
38
- | [MP-HandPose](./models/handpose_estimation_mediapipe) | Hand Pose Estimation | 224x224 | 4.28 | 36.19 | 40.10 | 19.47 | --- |
39
 
40
  \*: Models are quantized in per-channel mode, which run slower than per-tensor quantized models on NPU.
41
 
@@ -45,6 +51,7 @@ Hardware Setup:
45
  - `RPI-CPU`: [Raspberry Pi 4B](https://www.raspberrypi.com/products/raspberry-pi-4-model-b/specifications/), Broadcom BCM2711, Quad core Cortex-A72 (ARM v8) 64-bit SoC @ 1.5GHz.
46
  - `JETSON-GPU`: [NVIDIA Jetson Nano B01](https://developer.nvidia.com/embedded/jetson-nano-developer-kit), 128-core NVIDIA Maxwell GPU.
47
  - `KV3-NPU`: [Khadas VIM3](https://www.khadas.com/vim3), 5TOPS Performance. Benchmarks are done using **quantized** models. You will need to compile OpenCV with TIM-VX following [this guide](https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU) to run benchmarks. The test results use the `per-tensor` quantization model by default.
 
48
  - `D1-CPU`: [Allwinner D1](https://d1.docs.aw-ol.com/en), [Xuantie C906 CPU](https://www.t-head.cn/product/C906?spm=a2ouz.12986968.0.0.7bfc1384auGNPZ) (RISC-V, RVV 0.7.1) @ 1.0GHz, 1 core. YuNet is supported for now. Visit [here](https://github.com/fengyuentau/opencv_zoo_cpp) for more details.
49
 
50
  ***Important Notes***:
 
4
 
5
  Guidelines:
6
 
7
+ - Install latest `opencv-python`:
8
+ ```shell
9
+ python3 -m pip install opencv-python
10
+ # Or upgrade to latest version
11
+ python3 -m pip install --upgrade opencv-python
12
+ ```
13
  - Clone this repo to download all models and demo scripts:
14
  ```shell
15
  # Install git-lfs from https://git-lfs.github.com/
 
21
 
22
  ## Models & Benchmark Results
23
 
24
+ | Model | Task | Input Size | INTEL-CPU (ms) | RPI-CPU (ms) | JETSON-GPU (ms) | KV3-NPU (ms) | Ascend-310 (ms) | D1-CPU (ms) |
25
+ | ------------------------------------------------------- | ----------------------------- | ---------- | -------------- | ------------ | --------------- | ------------ | --------------- | ----------- |
26
+ | [YuNet](./models/face_detection_yunet) | Face Detection | 160x120 | 1.45 | 6.22 | 12.18 | 4.04 | 1.73 | 86.69 |
27
+ | [SFace](./models/face_recognition_sface) | Face Recognition | 112x112 | 8.65 | 99.20 | 24.88 | 46.25 | 23.17 | --- |
28
+ | [FER](./models/facial_expression_recognition/) | Facial Expression Recognition | 112x112 | 4.43 | 49.86 | 31.07 | 29.80 | 10.12 | --- |
29
+ | [LPD-YuNet](./models/license_plate_detection_yunet/) | License Plate Detection | 320x240 | --- | 168.03 | 56.12 | 29.53 | 8.70 | --- |
30
+ | [YOLOX](./models/object_detection_yolox/) | Object Detection | 640x640 | 176.68 | 1496.70 | 388.95 | 420.98 | 29.10 | --- |
31
+ | [NanoDet](./models/object_detection_nanodet/) | Object Detection | 416x416 | 157.91 | 220.36 | 64.94 | 116.64 | 35.97 | --- |
32
+ | [DB-IC15](./models/text_detection_db) | Text Detection | 640x480 | 142.91 | 2835.91 | 208.41 | --- | 229.74 | --- |
33
+ | [DB-TD500](./models/text_detection_db) | Text Detection | 640x480 | 142.91 | 2841.71 | 210.51 | --- | 247.29 | --- |
34
+ | [CRNN-EN](./models/text_recognition_crnn) | Text Recognition | 100x32 | 50.21 | 234.32 | 196.15 | 125.30 | 101.03 | --- |
35
+ | [CRNN-CN](./models/text_recognition_crnn) | Text Recognition | 100x32 | 73.52 | 322.16 | 239.76 | 166.79 | 136.41 | --- |
36
+ | [PP-ResNet](./models/image_classification_ppresnet) | Image Classification | 224x224 | 56.05 | 602.58 | 98.64 | 75.45 | 6.99 | --- |
37
+ | [MobileNet-V1](./models/image_classification_mobilenet) | Image Classification | 224x224 | 9.04 | 92.25 | 33.18 | 145.66\* | 5.25 | --- |
38
+ | [MobileNet-V2](./models/image_classification_mobilenet) | Image Classification | 224x224 | 8.86 | 74.03 | 31.92 | 146.31\* | 5.82 | --- |
39
+ | [PP-HumanSeg](./models/human_segmentation_pphumanseg) | Human Segmentation | 192x192 | 19.92 | 105.32 | 67.97 | 74.77 | 7.07 | --- |
40
+ | [WeChatQRCode](./models/qrcode_wechatqrcode) | QR Code Detection and Parsing | 100x100 | 7.04 | 37.68 | --- | --- | --- | --- |
41
+ | [DaSiamRPN](./models/object_tracking_dasiamrpn) | Object Tracking | 1280x720 | 36.15 | 705.48 | 76.82 | --- | --- | --- |
42
+ | [YoutuReID](./models/person_reid_youtureid) | Person Re-Identification | 128x256 | 35.81 | 521.98 | 90.07 | 44.61 | 5.69 | --- |
43
+ | [MP-PalmDet](./models/palm_detection_mediapipe) | Palm Detection | 192x192 | 11.09 | 63.79 | 83.20 | 33.81 | 21.59 | --- |
44
+ | [MP-HandPose](./models/handpose_estimation_mediapipe) | Hand Pose Estimation | 224x224 | 4.28 | 36.19 | 40.10 | 19.47 | 6.02 | --- |
45
 
46
  \*: Models are quantized in per-channel mode, which run slower than per-tensor quantized models on NPU.
47
 
 
51
  - `RPI-CPU`: [Raspberry Pi 4B](https://www.raspberrypi.com/products/raspberry-pi-4-model-b/specifications/), Broadcom BCM2711, Quad core Cortex-A72 (ARM v8) 64-bit SoC @ 1.5GHz.
52
  - `JETSON-GPU`: [NVIDIA Jetson Nano B01](https://developer.nvidia.com/embedded/jetson-nano-developer-kit), 128-core NVIDIA Maxwell GPU.
53
  - `KV3-NPU`: [Khadas VIM3](https://www.khadas.com/vim3), 5TOPS Performance. Benchmarks are done using **quantized** models. You will need to compile OpenCV with TIM-VX following [this guide](https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU) to run benchmarks. The test results use the `per-tensor` quantization model by default.
54
+ - `Ascend-310`: [Ascend 310](https://e.huawei.com/uk/products/cloud-computing-dc/atlas/ascend-310), 22 TOPS@INT8. Benchmarks are done on [Atlas 200 DK AI Developer Kit](https://e.huawei.com/in/products/cloud-computing-dc/atlas/atlas-200). Get the latest OpenCV source code and build following [this guide](https://github.com/opencv/opencv/wiki/Huawei-CANN-Backend) to enable CANN backend.
55
  - `D1-CPU`: [Allwinner D1](https://d1.docs.aw-ol.com/en), [Xuantie C906 CPU](https://www.t-head.cn/product/C906?spm=a2ouz.12986968.0.0.7bfc1384auGNPZ) (RISC-V, RVV 0.7.1) @ 1.0GHz, 1 core. YuNet is supported for now. Visit [here](https://github.com/fengyuentau/opencv_zoo_cpp) for more details.
56
 
57
  ***Important Notes***:
benchmark/benchmark.py CHANGED
@@ -77,6 +77,11 @@ class Benchmark:
77
  available_targets['npu'] = cv.dnn.DNN_TARGET_NPU
78
  except:
79
  print('OpenCV is not compiled with TIM-VX backend enbaled. See https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more details on how to enable TIM-VX backend.')
 
 
 
 
 
80
 
81
  self._backend = available_backends[backend_id]
82
  self._target = available_targets[target_id]
 
77
  available_targets['npu'] = cv.dnn.DNN_TARGET_NPU
78
  except:
79
  print('OpenCV is not compiled with TIM-VX backend enbaled. See https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more details on how to enable TIM-VX backend.')
80
+ try:
81
+ available_backends['cann'] = cv.dnn.DNN_BACKEND_CANN
82
+ available_targets['npu'] = cv.dnn.DNN_TARGET_NPU
83
+ except:
84
+ print('OpenCV is not compiled with CANN backend enabled. See https://github.com/opencv/opencv/wiki/Huawei-CANN-Backend for more details on how to enable CANN backend.')
85
 
86
  self._backend = available_backends[backend_id]
87
  self._target = available_targets[target_id]