Satyam Goyal committed e7d94f5
Parent(s): 87cd14e

Merge pull request #95 from Satgoy152:adding-doc

Improved help messages for demo programs (#95)

- Added Demo Documentation
- Updated help messages
- Changed exception link
- README.md +12 -8
- models/face_detection_yunet/README.md +7 -2
- models/face_detection_yunet/demo.py +9 -9
- models/face_recognition_sface/README.md +8 -5
- models/face_recognition_sface/demo.py +8 -8
- models/handpose_estimation_mediapipe/demo.py +1 -1
- models/human_segmentation_pphumanseg/README.md +6 -2
- models/human_segmentation_pphumanseg/demo.py +5 -5
- models/image_classification_mobilenet/README.md +10 -7
- models/image_classification_mobilenet/demo.py +4 -4
- models/image_classification_ppresnet/README.md +8 -5
- models/image_classification_ppresnet/demo.py +4 -4
- models/license_plate_detection_yunet/README.md +5 -1
- models/license_plate_detection_yunet/demo.py +9 -9
- models/object_tracking_dasiamrpn/README.md +6 -1
- models/object_tracking_dasiamrpn/demo.py +6 -6
- models/palm_detection_mediapipe/README.md +6 -1
- models/palm_detection_mediapipe/demo.py +7 -7
- models/person_reid_youtureid/README.md +8 -3
- models/person_reid_youtureid/demo.py +1 -1
- models/qrcode_wechatqrcode/README.md +6 -1
- models/qrcode_wechatqrcode/demo.py +7 -7
- models/text_detection_db/README.md +7 -1
- models/text_detection_db/demo.py +11 -11
- models/text_recognition_crnn/README.md +17 -7
- models/text_recognition_crnn/demo.py +6 -6
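
Every demo script touched below follows the same convention after this change: each argparse argument gets a help string that starts with "Usage:", states its default, and notes when it is ignored (e.g. for camera input), so that `python demo.py --help` is self-explanatory. A minimal, self-contained sketch of that convention — two arguments borrowed from the YuNet demo, with everything else in the real script omitted:

```python
import argparse

# Minimal sketch of the help-message style introduced by this PR.
# Only argument parsing is shown; the actual demos also build and run the model.
parser = argparse.ArgumentParser(description='YuNet: A Fast and Accurate CNN-based Face Detector')
parser.add_argument('--input', '-i', type=str,
                    help='Usage: Set input to a certain image, omit if using camera.')
parser.add_argument('--conf_threshold', type=float, default=0.9,
                    help='Usage: Set the minimum needed confidence for the model to identify a face, '
                         'defaults to 0.9. Filter out faces of confidence < conf_threshold.')
args = parser.parse_args()  # `python demo.py --help` now prints these descriptions
```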
README.md
CHANGED
@@ -3,19 +3,20 @@
 A zoo for models tuned for OpenCV DNN with benchmarks on different platforms.

 Guidelines:
+
 - Clone this repo to download all models and demo scripts:
-
-
-
-
-
-
+```shell
+# Install git-lfs from https://git-lfs.github.com/
+git clone https://github.com/opencv/opencv_zoo && cd opencv_zoo
+git lfs install
+git lfs pull
+```
 - To run benchmarks on your hardware settings, please refer to [benchmark/README](./benchmark/README.md).

 ## Models & Benchmark Results

-| Model
-
+| Model | Task | Input Size | INTEL-CPU (ms) | RPI-CPU (ms) | JETSON-GPU (ms) | KV3-NPU (ms) | D1-CPU (ms) |
+| ---------------------------------------------------- | ----------------------------- | ---------- | -------------- | ------------ | --------------- | ------------ | ----------- |
 | [YuNet](./models/face_detection_yunet) | Face Detection | 160x120 | 1.45 | 6.22 | 12.18 | 4.04 | 86.69 |
 | [SFace](./models/face_recognition_sface) | Face Recognition | 112x112 | 8.65 | 99.20 | 24.88 | 46.25 | --- |
 | [LPD-YuNet](./models/license_plate_detection_yunet/) | License Plate Detection | 320x240 | --- | 168.03 | 56.12 | 154.20\* | |
@@ -36,6 +37,7 @@ Guidelines:
 \*: Models are quantized in per-channel mode, which run slower than per-tensor quantized models on NPU.

 Hardware Setup:
+
 - `INTEL-CPU`: [Intel Core i7-5930K](https://www.intel.com/content/www/us/en/products/sku/82931/intel-core-i75930k-processor-15m-cache-up-to-3-70-ghz/specifications.html) @ 3.50GHz, 6 cores, 12 threads.
 - `RPI-CPU`: [Raspberry Pi 4B](https://www.raspberrypi.com/products/raspberry-pi-4-model-b/specifications/), Broadcom BCM2711, Quad core Cortex-A72 (ARM v8) 64-bit SoC @ 1.5GHz.
 - `JETSON-GPU`: [NVIDIA Jetson Nano B01](https://developer.nvidia.com/embedded/jetson-nano-developer-kit), 128-core NVIDIA Maxwell GPU.
@@ -43,6 +45,7 @@ Hardware Setup:
 - `D1-CPU`: [Allwinner D1](https://d1.docs.aw-ol.com/en), [Xuantie C906 CPU](https://www.t-head.cn/product/C906?spm=a2ouz.12986968.0.0.7bfc1384auGNPZ) (RISC-V, RVV 0.7.1) @ 1.0GHz, 1 core. YuNet is supported for now. Visit [here](https://github.com/fengyuentau/opencv_zoo_cpp) for more details.

 ***Important Notes***:
+
 - The data under each column of hardware setups on the above table represents the elapsed time of an inference (preprocess, forward and postprocess).
 - The time data is the median of 10 runs after some warmup runs. Different metrics may be applied to some specific models.
 - Batch size is 1 for all benchmark results.
@@ -52,6 +55,7 @@ Hardware Setup:
 ## Some Examples

 Some examples are listed below. You can find more in the directory of each model!
+
 ### Face Detection with [YuNet](./models/face_detection_yunet/)

 
models/face_detection_yunet/README.md
CHANGED
@@ -3,14 +3,15 @@
 YuNet is a light-weight, fast and accurate face detection model, which achieves 0.834(AP_easy), 0.824(AP_medium), 0.708(AP_hard) on the WIDER Face validation set.

 Notes:
+
 - Model source: [here](https://github.com/ShiqiYu/libfacedetection.train/blob/a61a428929148171b488f024b5d6774f93cdbc13/tasks/task1/onnx/yunet.onnx).
 - For details on training this model, please visit https://github.com/ShiqiYu/libfacedetection.train.
 - This ONNX model has fixed input shape, but OpenCV DNN infers on the exact shape of input image. See https://github.com/opencv/opencv_zoo/issues/44 for more information.

 Results of accuracy evaluation with [tools/eval](../../tools/eval).

-| Models | Easy AP | Medium AP | Hard AP |
-
+| Models | Easy AP | Medium AP | Hard AP |
+| ----------- | ------- | --------- | ------- |
 | YuNet | 0.8498 | 0.8384 | 0.7357 |
 | YuNet quant | 0.7751 | 0.8145 | 0.7312 |

@@ -19,11 +20,15 @@ Results of accuracy evaluation with [tools/eval](../../tools/eval).
 ## Demo

 Run the following command to try the demo:
+
 ```shell
 # detect on camera input
 python demo.py
 # detect on an image
 python demo.py --input /path/to/image
+
+# get help regarding various parameters
+python demo.py --help
 ```

 ### Example outputs
models/face_detection_yunet/demo.py
CHANGED
@@ -22,25 +22,25 @@ def str2bool(v):
 backends = [cv.dnn.DNN_BACKEND_OPENCV, cv.dnn.DNN_BACKEND_CUDA]
 targets = [cv.dnn.DNN_TARGET_CPU, cv.dnn.DNN_TARGET_CUDA, cv.dnn.DNN_TARGET_CUDA_FP16]
 help_msg_backends = "Choose one of the computation backends: {:d}: OpenCV implementation (default); {:d}: CUDA"
-help_msg_targets = "
+help_msg_targets = "Choose one of the target computation devices: {:d}: CPU (default); {:d}: CUDA; {:d}: CUDA fp16"
 try:
     backends += [cv.dnn.DNN_BACKEND_TIMVX]
     targets += [cv.dnn.DNN_TARGET_NPU]
     help_msg_backends += "; {:d}: TIMVX"
     help_msg_targets += "; {:d}: NPU"
 except:
-    print('This version of OpenCV does not support TIM-VX and NPU. Visit https://
+    print('This version of OpenCV does not support TIM-VX and NPU. Visit https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more information.')

 parser = argparse.ArgumentParser(description='YuNet: A Fast and Accurate CNN-based Face Detector (https://github.com/ShiqiYu/libfacedetection).')
-parser.add_argument('--input', '-i', type=str, help='
-parser.add_argument('--model', '-m', type=str, default='face_detection_yunet_2022mar.onnx', help=
+parser.add_argument('--input', '-i', type=str, help='Usage: Set input to a certain image, omit if using camera.')
+parser.add_argument('--model', '-m', type=str, default='face_detection_yunet_2022mar.onnx', help="Usage: Set model type, defaults to 'face_detection_yunet_2022mar.onnx'.")
 parser.add_argument('--backend', '-b', type=int, default=backends[0], help=help_msg_backends.format(*backends))
 parser.add_argument('--target', '-t', type=int, default=targets[0], help=help_msg_targets.format(*targets))
-parser.add_argument('--conf_threshold', type=float, default=0.9, help='Filter out faces of confidence < conf_threshold.')
-parser.add_argument('--nms_threshold', type=float, default=0.3, help='Suppress bounding boxes of iou >= nms_threshold.')
-parser.add_argument('--top_k', type=int, default=5000, help='Keep top_k bounding boxes before NMS.')
-parser.add_argument('--save', '-s', type=str, default=False, help='Set
-parser.add_argument('--vis', '-v', type=str2bool, default=True, help='
+parser.add_argument('--conf_threshold', type=float, default=0.9, help='Usage: Set the minimum needed confidence for the model to identify a face, defauts to 0.9. Smaller values may result in faster detection, but will limit accuracy. Filter out faces of confidence < conf_threshold.')
+parser.add_argument('--nms_threshold', type=float, default=0.3, help='Usage: Suppress bounding boxes of iou >= nms_threshold. Default = 0.3.')
+parser.add_argument('--top_k', type=int, default=5000, help='Usage: Keep top_k bounding boxes before NMS.')
+parser.add_argument('--save', '-s', type=str, default=False, help='Usage: Set “True” to save file with results (i.e. bounding box, confidence level). Invalid in case of camera input. Default will be set to “False”.')
+parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Usage: Default will be set to “True” and will open a new window to show results. Set to “False” to stop visualizations from being shown. Invalid in case of camera input.')
 args = parser.parse_args()

 def visualize(image, results, box_color=(0, 255, 0), text_color=(0, 0, 255), fps=None):
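The hunk header above references a `str2bool` helper that the boolean-style flags (`--save`, `--vis`) are parsed with; its body lies outside the changed lines and is not shown in this diff. A plausible implementation, included here only as an assumption about what that helper does, not as the repository's exact code:

```python
import argparse

def str2bool(v):
    # Assumed behaviour: map common textual spellings of true/false to bool,
    # so flags like --vis can be passed as "True"/"False" on the command line.
    if v.lower() in ('on', 'yes', 'true', 'y', 't', '1'):
        return True
    elif v.lower() in ('off', 'no', 'false', 'n', 'f', '0'):
        return False
    else:
        raise argparse.ArgumentTypeError('Boolean value expected, got {!r}.'.format(v))
```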
models/face_recognition_sface/README.md
CHANGED
@@ -3,30 +3,33 @@
 SFace: Sigmoid-Constrained Hypersphere Loss for Robust Face Recognition

 Note:
+
 - SFace is contributed by [Yaoyao Zhong](https://github.com/zhongyy/SFace).
 - [face_recognition_sface_2021sep.onnx](./face_recognition_sface_2021sep.onnx) is converted from the model from https://github.com/zhongyy/SFace thanks to [Chengrui Wang](https://github.com/crywang).
 - Support 5-landmark warpping for now (2021sep)

 Results of accuracy evaluation with [tools/eval](../../tools/eval).

-| Models | Accuracy |
-
+| Models | Accuracy |
+| ----------- | -------- |
 | SFace | 0.9940 |
 | SFace quant | 0.9932 |

 \*: 'quant' stands for 'quantized'.

-
 ## Demo

 ***NOTE***: This demo uses [../face_detection_yunet](../face_detection_yunet) as face detector, which supports 5-landmark detection for now (2021sep).

 Run the following command to try the demo:
+
 ```shell
 # recognize on images
 python demo.py --input1 /path/to/image1 --input2 /path/to/image2
-```

+# get help regarding various parameters
+python demo.py --help
+```

 ## License

@@ -35,4 +38,4 @@ All files in this directory are licensed under [Apache 2.0 License](./LICENSE).
 ## Reference

 - https://ieeexplore.ieee.org/document/9318547
-- https://github.com/zhongyy/SFace
+- https://github.com/zhongyy/SFace
models/face_recognition_sface/demo.py
CHANGED
@@ -25,7 +25,7 @@ def str2bool(v):

 backends = [cv.dnn.DNN_BACKEND_OPENCV, cv.dnn.DNN_BACKEND_CUDA]
 targets = [cv.dnn.DNN_TARGET_CPU, cv.dnn.DNN_TARGET_CUDA, cv.dnn.DNN_TARGET_CUDA_FP16]
-help_msg_backends = "Choose one of the computation backends: {:d}: OpenCV implementation (default); {:d}: CUDA"
+help_msg_backends = "Choose one of the computation backends: {:d}: OpenCV implementation (default); {:d}: CUDA \n Usage: Set backend DNN model, defaults to cv.dnn.DNN_BACKEND_OPENCV (int = 0). Based on your OpenCV version, it may or may not support cv.dnn.DNN_BACKEND_TIMVX. More details: [https://gist.github.com/fengyuentau/5a7a5ba36328f2b763aea026c43fa45f]"
 help_msg_targets = "Chose one of the target computation devices: {:d}: CPU (default); {:d}: CUDA; {:d}: CUDA fp16"
 try:
     backends += [cv.dnn.DNN_BACKEND_TIMVX]
@@ -33,18 +33,18 @@ try:
     help_msg_backends += "; {:d}: TIMVX"
     help_msg_targets += "; {:d}: NPU"
 except:
-    print('This version of OpenCV does not support TIM-VX and NPU. Visit https://
+    print('This version of OpenCV does not support TIM-VX and NPU. Visit https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more information.')

 parser = argparse.ArgumentParser(
     description="SFace: Sigmoid-Constrained Hypersphere Loss for Robust Face Recognition (https://ieeexplore.ieee.org/document/9318547)")
-parser.add_argument('--input1', '-i1', type=str, help='
-parser.add_argument('--input2', '-i2', type=str, help='
-parser.add_argument('--model', '-m', type=str, default='face_recognition_sface_2021dec.onnx', help='
+parser.add_argument('--input1', '-i1', type=str, help='Usage: Set path to the input image 1 (original face).')
+parser.add_argument('--input2', '-i2', type=str, help='Usage: Set path to the input image 2 (comparison face).')
+parser.add_argument('--model', '-m', type=str, default='face_recognition_sface_2021dec.onnx', help='Usage: Set model path, defaults to face_recognition_sface_2021dec.onnx.')
 parser.add_argument('--backend', '-b', type=int, default=backends[0], help=help_msg_backends.format(*backends))
 parser.add_argument('--target', '-t', type=int, default=targets[0], help=help_msg_targets.format(*targets))
-parser.add_argument('--dis_type', type=int, choices=[0, 1], default=0, help='Distance type. \'0\': cosine, \'1\': norm_l1.')
-parser.add_argument('--save', '-s', type=str, default=False, help='Set
-parser.add_argument('--vis', '-v', type=str2bool, default=True, help='
+parser.add_argument('--dis_type', type=int, choices=[0, 1], default=0, help='Usage: Distance type. \'0\': cosine, \'1\': norm_l1. Defaults to \'0\'')
+parser.add_argument('--save', '-s', type=str, default=False, help='Usage: Set “True” to save file with results (i.e. bounding box, confidence level). Invalid in case of camera input. Default will be set to “False”.')
+parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Usage: Default will be set to “True” and will open a new window to show results. Set to “False” to stop visualizations from being shown. Invalid in case of camera input.')
 args = parser.parse_args()

 if __name__ == '__main__':
models/handpose_estimation_mediapipe/demo.py
CHANGED
@@ -27,7 +27,7 @@ try:
     help_msg_backends += "; {:d}: TIMVX"
     help_msg_targets += "; {:d}: NPU"
 except:
-    print('This version of OpenCV does not support TIM-VX and NPU. Visit https://
+    print('This version of OpenCV does not support TIM-VX and NPU. Visit https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more information.')

 parser = argparse.ArgumentParser(description='Hand Pose Estimation from MediaPipe')
 parser.add_argument('--input', '-i', type=str, help='Path to the input image. Omit for using default camera.')
models/human_segmentation_pphumanseg/README.md
CHANGED
@@ -5,14 +5,18 @@ This model is ported from [PaddleHub](https://github.com/PaddlePaddle/PaddleHub)
 ## Demo

 Run the following command to try the demo:
+
 ```shell
 # detect on camera input
 python demo.py
 # detect on an image
 python demo.py --input /path/to/image
+
+# get help regarding various parameters
+python demo.py --help
 ```

-
+### Example outputs

 

@@ -26,4 +30,4 @@ All files in this directory are licensed under [Apache 2.0 License](./LICENSE).

 - https://arxiv.org/abs/1512.03385
 - https://github.com/opencv/opencv/tree/master/samples/dnn/dnn_model_runner/dnn_conversion/paddlepaddle
-- https://github.com/PaddlePaddle/PaddleHub
+- https://github.com/PaddlePaddle/PaddleHub
models/human_segmentation_pphumanseg/demo.py
CHANGED
@@ -29,15 +29,15 @@ try:
     help_msg_backends += "; {:d}: TIMVX"
     help_msg_targets += "; {:d}: NPU"
 except:
-    print('This version of OpenCV does not support TIM-VX and NPU. Visit https://
+    print('This version of OpenCV does not support TIM-VX and NPU. Visit https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more information.')

 parser = argparse.ArgumentParser(description='PPHumanSeg (https://github.com/PaddlePaddle/PaddleSeg/tree/release/2.2/contrib/PP-HumanSeg)')
-parser.add_argument('--input', '-i', type=str, help='
-parser.add_argument('--model', '-m', type=str, default='human_segmentation_pphumanseg_2021oct.onnx', help='
+parser.add_argument('--input', '-i', type=str, help='Usage: Set input path to a certain image, omit if using camera.')
+parser.add_argument('--model', '-m', type=str, default='human_segmentation_pphumanseg_2021oct.onnx', help='Usage: Set model path, defaults to human_segmentation_pphumanseg_2021oct.onnx.')
 parser.add_argument('--backend', '-b', type=int, default=backends[0], help=help_msg_backends.format(*backends))
 parser.add_argument('--target', '-t', type=int, default=targets[0], help=help_msg_targets.format(*targets))
-parser.add_argument('--save', '-s', type=str, default=False, help='Set
-parser.add_argument('--vis', '-v', type=str2bool, default=True, help='
+parser.add_argument('--save', '-s', type=str, default=False, help='Usage: Set “True” to save a file with results. Invalid in case of camera input. Default will be set to “False”.')
+parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Usage: Default will be set to “True” and will open a new window to show results. Set to “False” to stop visualizations from being shown. Invalid in case of camera input.')
 args = parser.parse_args()

 def get_color_map_list(num_classes):
models/image_classification_mobilenet/README.md
CHANGED
@@ -6,23 +6,27 @@ MobileNetV2: Inverted Residuals and Linear Bottlenecks

 Results of accuracy evaluation with [tools/eval](../../tools/eval).

-| Models
-
-| MobileNet V1
-| MobileNet V1 quant | 55.53
-| MobileNet V2
-| MobileNet V2 quant | 68.37
+| Models | Top-1 Accuracy | Top-5 Accuracy |
+| ------------------ | -------------- | -------------- |
+| MobileNet V1 | 67.64 | 87.97 |
+| MobileNet V1 quant | 55.53 | 78.74 |
+| MobileNet V2 | 69.44 | 89.23 |
+| MobileNet V2 quant | 68.37 | 88.56 |

 \*: 'quant' stands for 'quantized'.

 ## Demo

 Run the following command to try the demo:
+
 ```shell
 # MobileNet V1
 python demo.py --input /path/to/image
 # MobileNet V2
 python demo.py --input /path/to/image --model v2
+
+# get help regarding various parameters
+python demo.py --help
 ```

 ## License
@@ -35,4 +39,3 @@ All files in this directory are licensed under [Apache 2.0 License](./LICENSE).
 - MobileNet V2: https://arxiv.org/abs/1801.04381
 - MobileNet V1 weight and scripts for training: https://github.com/wjc852456/pytorch-mobilenet-v1
 - MobileNet V2 weight: https://github.com/onnx/models/tree/main/vision/classification/mobilenet
-
models/image_classification_mobilenet/demo.py
CHANGED
@@ -24,14 +24,14 @@ try:
     help_msg_backends += "; {:d}: TIMVX"
     help_msg_targets += "; {:d}: NPU"
 except:
-    print('This version of OpenCV does not support TIM-VX and NPU. Visit https://
+    print('This version of OpenCV does not support TIM-VX and NPU. Visit https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more information.')

 parser = argparse.ArgumentParser(description='Demo for MobileNet V1 & V2.')
-parser.add_argument('--input', '-i', type=str, help='
-parser.add_argument('--model', '-m', type=str, choices=['v1', 'v2', 'v1-q', 'v2-q'], default='v1', help='
+parser.add_argument('--input', '-i', type=str, help='Usage: Set input path to a certain image, omit if using camera.')
+parser.add_argument('--model', '-m', type=str, choices=['v1', 'v2', 'v1-q', 'v2-q'], default='v1', help='Usage: Set model type, defaults to image_classification_mobilenetv1_2022apr.onnx (v1).')
 parser.add_argument('--backend', '-b', type=int, default=backends[0], help=help_msg_backends.format(*backends))
 parser.add_argument('--target', '-t', type=int, default=targets[0], help=help_msg_targets.format(*targets))
-parser.add_argument('--label', '-l', type=str, default='./imagenet_labels.txt', help='
+parser.add_argument('--label', '-l', type=str, default='./imagenet_labels.txt', help='Usage: Set path to the different labels that will be used during the detection. Default list found in imagenet_labels.txt')
 args = parser.parse_args()

 if __name__ == '__main__':
models/image_classification_ppresnet/README.md
CHANGED
@@ -6,18 +6,22 @@ This model is ported from [PaddleHub](https://github.com/PaddlePaddle/PaddleHub)

 Results of accuracy evaluation with [tools/eval](../../tools/eval).

-| Models
-
-| PP-ResNet
-| PP-ResNet quant | 0.22
+| Models | Top-1 Accuracy | Top-5 Accuracy |
+| --------------- | -------------- | -------------- |
+| PP-ResNet | 82.28 | 96.15 |
+| PP-ResNet quant | 0.22 | 0.96 |

 \*: 'quant' stands for 'quantized'.

 ## Demo

 Run the following command to try the demo:
+
 ```shell
 python demo.py --input /path/to/image
+
+# get help regarding various parameters
+python demo.py --help
 ```

 ## License
@@ -29,4 +33,3 @@ All files in this directory are licensed under [Apache 2.0 License](./LICENSE).
 - https://arxiv.org/abs/1512.03385
 - https://github.com/opencv/opencv/tree/master/samples/dnn/dnn_model_runner/dnn_conversion/paddlepaddle
 - https://github.com/PaddlePaddle/PaddleHub
-
models/image_classification_ppresnet/demo.py
CHANGED
@@ -29,14 +29,14 @@ try:
     help_msg_backends += "; {:d}: TIMVX"
     help_msg_targets += "; {:d}: NPU"
 except:
-    print('This version of OpenCV does not support TIM-VX and NPU. Visit https://
+    print('This version of OpenCV does not support TIM-VX and NPU. Visit https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more information.')

 parser = argparse.ArgumentParser(description='Deep Residual Learning for Image Recognition (https://arxiv.org/abs/1512.03385, https://github.com/PaddlePaddle/PaddleHub)')
-parser.add_argument('--input', '-i', type=str, help='
-parser.add_argument('--model', '-m', type=str, default='image_classification_ppresnet50_2022jan.onnx', help='
+parser.add_argument('--input', '-i', type=str, help='Usage: Set input path to a certain image, omit if using camera.')
+parser.add_argument('--model', '-m', type=str, default='image_classification_ppresnet50_2022jan.onnx', help='Usage: Set model path, defaults to image_classification_ppresnet50_2022jan.onnx.')
 parser.add_argument('--backend', '-b', type=int, default=backends[0], help=help_msg_backends.format(*backends))
 parser.add_argument('--target', '-t', type=int, default=targets[0], help=help_msg_targets.format(*targets))
-parser.add_argument('--label', '-l', type=str, default='./imagenet_labels.txt', help='
+parser.add_argument('--label', '-l', type=str, default='./imagenet_labels.txt', help='Usage: Set path to the different labels that will be used during the detection. Default list found in imagenet_labels.txt')
 args = parser.parse_args()

 if __name__ == '__main__':
models/license_plate_detection_yunet/README.md
CHANGED
@@ -7,11 +7,14 @@ Please note that the model is trained with Chinese license plates, so the detect
 ## Demo

 Run the following command to try the demo:
+
 ```shell
 # detect on camera input
 python demo.py
 # detect on an image
 python demo.py --input /path/to/image
+# get help regarding various parameters
+python demo.py --help
 ```

 ### Example outputs
@@ -19,8 +22,9 @@ python demo.py --input /path/to/image
 

 ## License
+
 All files in this directory are licensed under [Apache 2.0 License](./LICENSE)

 ## Reference

-
+- https://github.com/ShiqiYu/libfacedetection.train
models/license_plate_detection_yunet/demo.py
CHANGED
@@ -23,19 +23,19 @@ try:
     help_msg_backends += "; {:d}: TIMVX"
     help_msg_targets += "; {:d}: NPU"
 except:
-    print('This version of OpenCV does not support TIM-VX and NPU. Visit https://
+    print('This version of OpenCV does not support TIM-VX and NPU. Visit https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more information.')

 parser = argparse.ArgumentParser(description='LPD-YuNet for License Plate Detection')
-parser.add_argument('--input', '-i', type=str, help='
-parser.add_argument('--model', '-m', type=str, default='license_plate_detection_lpd_yunet_2022may.onnx', help='
+parser.add_argument('--input', '-i', type=str, help='Usage: Set path to the input image. Omit for using default camera.')
+parser.add_argument('--model', '-m', type=str, default='license_plate_detection_lpd_yunet_2022may.onnx', help='Usage: Set model path, defaults to license_plate_detection_lpd_yunet_2022may.onnx.')
 parser.add_argument('--backend', '-b', type=int, default=backends[0], help=help_msg_backends.format(*backends))
 parser.add_argument('--target', '-t', type=int, default=targets[0], help=help_msg_targets.format(*targets))
-parser.add_argument('--conf_threshold', type=float, default=0.9, help='Filter out faces of confidence < conf_threshold.')
-parser.add_argument('--nms_threshold', type=float, default=0.3, help='Suppress bounding boxes of iou >= nms_threshold.')
-parser.add_argument('--top_k', type=int, default=5000, help='Keep top_k bounding boxes before NMS.')
-parser.add_argument('--keep_top_k', type=int, default=750, help='Keep keep_top_k bounding boxes after NMS.')
-parser.add_argument('--save', '-s', type=str2bool, default=False, help='Set
-parser.add_argument('--vis', '-v', type=str2bool, default=True, help='
+parser.add_argument('--conf_threshold', type=float, default=0.9, help='Usage: Set the minimum needed confidence for the model to identify a license plate, defaults to 0.9. Smaller values may result in faster detection, but will limit accuracy. Filter out faces of confidence < conf_threshold.')
+parser.add_argument('--nms_threshold', type=float, default=0.3, help='Usage: Suppress bounding boxes of iou >= nms_threshold. Default = 0.3. Suppress bounding boxes of iou >= nms_threshold.')
+parser.add_argument('--top_k', type=int, default=5000, help='Usage: Keep top_k bounding boxes before NMS.')
+parser.add_argument('--keep_top_k', type=int, default=750, help='Usage: Keep keep_top_k bounding boxes after NMS.')
+parser.add_argument('--save', '-s', type=str2bool, default=False, help='Usage: Set “True” to save file with results (i.e. bounding box, confidence level). Invalid in case of camera input. Default will be set to “False”.')
+parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Usage: Default will be set to “True” and will open a new window to show results. Set to “False” to stop visualizations from being shown. Invalid in case of camera input.')
 args = parser.parse_args()

 def visualize(image, dets, line_color=(0, 255, 0), text_color=(0, 0, 255), fps=None):
models/object_tracking_dasiamrpn/README.md
CHANGED
@@ -3,17 +3,22 @@
 [Distractor-aware Siamese Networks for Visual Object Tracking](https://arxiv.org/abs/1808.06048)

 Note:
+
 - Model source: [opencv/samples/dnn/diasiamrpn_tracker.cpp](https://github.com/opencv/opencv/blob/ceb94d52a104c0c1287a43dfa6ba72705fb78ac1/samples/dnn/dasiamrpn_tracker.cpp#L5-L7)
 - Visit https://github.com/foolwood/DaSiamRPN for training details.

 ## Demo

 Run the following command to try the demo:
+
 ```shell
 # track on camera input
 python demo.py
 # track on video input
 python demo.py --input /path/to/video
+
+# get help regarding various parameters
+python demo.py --help
 ```

 ### Example outputs
@@ -29,4 +34,4 @@ All files in this directory are licensed under [Apache 2.0 License](./LICENSE).
 - DaSiamRPN Official Repository: https://github.com/foolwood/DaSiamRPN
 - Paper: https://arxiv.org/abs/1808.06048
 - OpenCV API `TrackerDaSiamRPN` Doc: https://docs.opencv.org/4.x/de/d93/classcv_1_1TrackerDaSiamRPN.html
-- OpenCV Sample: https://github.com/opencv/opencv/blob/4.x/samples/dnn/dasiamrpn_tracker.cpp
+- OpenCV Sample: https://github.com/opencv/opencv/blob/4.x/samples/dnn/dasiamrpn_tracker.cpp
models/object_tracking_dasiamrpn/demo.py
CHANGED
@@ -21,12 +21,12 @@ def str2bool(v):

 parser = argparse.ArgumentParser(
     description="Distractor-aware Siamese Networks for Visual Object Tracking (https://arxiv.org/abs/1808.06048)")
-parser.add_argument('--input', '-i', type=str, help='
-parser.add_argument('--model_path', type=str, default='object_tracking_dasiamrpn_model_2021nov.onnx', help='
-parser.add_argument('--kernel_cls1_path', type=str, default='object_tracking_dasiamrpn_kernel_cls1_2021nov.onnx', help='
-parser.add_argument('--kernel_r1_path', type=str, default='object_tracking_dasiamrpn_kernel_r1_2021nov.onnx', help='
-parser.add_argument('--save', '-s', type=str2bool, default=False, help='Set
-parser.add_argument('--vis', '-v', type=str2bool, default=True, help='
+parser.add_argument('--input', '-i', type=str, help='Usage: Set path to the input video. Omit for using default camera.')
+parser.add_argument('--model_path', type=str, default='object_tracking_dasiamrpn_model_2021nov.onnx', help='Usage: Set model path, defaults to object_tracking_dasiamrpn_model_2021nov.onnx.')
+parser.add_argument('--kernel_cls1_path', type=str, default='object_tracking_dasiamrpn_kernel_cls1_2021nov.onnx', help='Usage: Set path to dasiamrpn_kernel_cls1.onnx.')
+parser.add_argument('--kernel_r1_path', type=str, default='object_tracking_dasiamrpn_kernel_r1_2021nov.onnx', help='Usage: Set path to dasiamrpn_kernel_r1.onnx.')
+parser.add_argument('--save', '-s', type=str2bool, default=False, help='Usage: Set “True” to save a file with results. Invalid in case of camera input. Default will be set to “False”.')
+parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Usage: Default will be set to “True” and will open a new window to show results. Set to “False” to stop visualizations from being shown. Invalid in case of camera input.')
 args = parser.parse_args()

 def visualize(image, bbox, score, isLocated, fps=None, box_color=(0, 255, 0),text_color=(0, 255, 0), fontScale = 1, fontSize = 1):
models/palm_detection_mediapipe/README.md
CHANGED
@@ -1,20 +1,25 @@
 # Palm detector from MediaPipe Handpose

 This model detects palm bounding boxes and palm landmarks, and is converted from Tensorflow-JS to ONNX using following tools:
+
 - tfjs to tf_saved_model: https://github.com/patlevin/tfjs-to-tf/
 - tf_saved_model to ONNX: https://github.com/onnx/tensorflow-onnx
 - simplified by [onnx-simplifier](https://github.com/daquexian/onnx-simplifier)

-Also note that the model is quantized in per-channel mode with [Intel
+Also note that the model is quantized in per-channel mode with [Intel's neural compressor](https://github.com/intel/neural-compressor), which gives better accuracy but may lose some speed.

 ## Demo

 Run the following commands to try the demo:
+
 ```bash
 # detect on camera input
 python demo.py
 # detect on an image
 python demo.py -i /path/to/image
+
+# get help regarding various parameters
+python demo.py --help
 ```

 ### Example outputs
models/palm_detection_mediapipe/demo.py
CHANGED
@@ -23,17 +23,17 @@ try:
     help_msg_backends += "; {:d}: TIMVX"
     help_msg_targets += "; {:d}: NPU"
 except:
-    print('This version of OpenCV does not support TIM-VX and NPU. Visit https://
+    print('This version of OpenCV does not support TIM-VX and NPU. Visit https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more information.')

 parser = argparse.ArgumentParser(description='Hand Detector from MediaPipe')
-parser.add_argument('--input', '-i', type=str, help='
-parser.add_argument('--model', '-m', type=str, default='./palm_detection_mediapipe_2022may.onnx', help='
+parser.add_argument('--input', '-i', type=str, help='Usage: Set path to the input image. Omit for using default camera.')
+parser.add_argument('--model', '-m', type=str, default='./palm_detection_mediapipe_2022may.onnx', help='Usage: Set model path, defaults to palm_detection_mediapipe_2022may.onnx.')
 parser.add_argument('--backend', '-b', type=int, default=backends[0], help=help_msg_backends.format(*backends))
 parser.add_argument('--target', '-t', type=int, default=targets[0], help=help_msg_targets.format(*targets))
-parser.add_argument('--score_threshold', type=float, default=0.99, help='Filter out faces of confidence < conf_threshold. An empirical score threshold for the quantized model is 0.49.')
-parser.add_argument('--nms_threshold', type=float, default=0.3, help='Suppress bounding boxes of iou >= nms_threshold.')
-parser.add_argument('--save', '-s', type=str, default=False, help='Set
-parser.add_argument('--vis', '-v', type=str2bool, default=True, help='
+parser.add_argument('--score_threshold', type=float, default=0.99, help='Usage: Set the minimum needed confidence for the model to identify a palm, defaults to 0.99. Smaller values may result in faster detection, but will limit accuracy. Filter out faces of confidence < conf_threshold. An empirical score threshold for the quantized model is 0.49.')
+parser.add_argument('--nms_threshold', type=float, default=0.3, help='Usage: Suppress bounding boxes of iou >= nms_threshold. Default = 0.3.')
+parser.add_argument('--save', '-s', type=str, default=False, help='Usage: Set “True” to save file with results (i.e. bounding box, confidence level). Invalid in case of camera input. Default will be set to “False”.')
+parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Usage: Default will be set to “True” and will open a new window to show results. Set to “False” to stop visualizations from being shown. Invalid in case of camera input.')
 args = parser.parse_args()

 def visualize(image, results, print_results=False, fps=None):
models/person_reid_youtureid/README.md
CHANGED
@@ -3,20 +3,25 @@
 This model is provided by Tencent Youtu Lab [[Credits]](https://github.com/opencv/opencv/blob/394e640909d5d8edf9c1f578f8216d513373698c/samples/dnn/person_reid.py#L6-L11).

 Note:
+
 - Model source: https://github.com/ReID-Team/ReID_extra_testdata

 ## Demo

 Run the following command to try the demo:
+
 ```shell
-python demo.py --
+python demo.py --query_dir /path/to/query --gallery_dir /path/to/gallery
+
+# get help regarding various parameters
+python demo.py --help
 ```

-
+### License

 All files in this directory are licensed under [Apache 2.0 License](./LICENSE).

 ## Reference:

 - OpenCV DNN Sample: https://github.com/opencv/opencv/blob/4.x/samples/dnn/person_reid.py
-- Model source: https://github.com/ReID-Team/ReID_extra_testdata
+- Model source: https://github.com/ReID-Team/ReID_extra_testdata
models/person_reid_youtureid/demo.py
CHANGED
@@ -30,7 +30,7 @@ try:
     help_msg_backends += "; {:d}: TIMVX"
     help_msg_targets += "; {:d}: NPU"
 except:
-    print('This version of OpenCV does not support TIM-VX and NPU. Visit https://
+    print('This version of OpenCV does not support TIM-VX and NPU. Visit https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more information.')

 parser = argparse.ArgumentParser(
     description="ReID baseline models from Tencent Youtu Lab")
models/qrcode_wechatqrcode/README.md
CHANGED
@@ -3,17 +3,22 @@
 WeChatQRCode for detecting and parsing QR Code, contributed by [WeChat Computer Vision Team (WeChatCV)](https://github.com/WeChatCV). Visit [opencv/opencv_contrib/modules/wechat_qrcode](https://github.com/opencv/opencv_contrib/tree/master/modules/wechat_qrcode) for more details.

 Notes:
+
 - Model source: [opencv/opencv_3rdparty:wechat_qrcode_20210119](https://github.com/opencv/opencv_3rdparty/tree/wechat_qrcode_20210119)
 - The APIs `cv::wechat_qrcode::WeChatQRCode` (C++) & `cv.wechat_qrcode_WeChatQRCode` (Python) are both designed to run on default backend (OpenCV) and target (CPU) only. Therefore, benchmark results of this model are only available on CPU devices, until the APIs are updated with setting backends and targets.

 ## Demo

 Run the following command to try the demo:
+
 ```shell
 # detect on camera input
 python demo.py
 # detect on an image
 python demo.py --input /path/to/image
+
+# get help regarding various parameters
+python demo.py --help
 ```

 ### Example outputs
@@ -27,4 +32,4 @@ All files in this directory are licensed under [Apache 2.0 License](./LICENSE).
 ## Reference:

 - https://github.com/opencv/opencv_contrib/tree/master/modules/wechat_qrcode
-- https://github.com/opencv/opencv_3rdparty/tree/wechat_qrcode_20210119
+- https://github.com/opencv/opencv_3rdparty/tree/wechat_qrcode_20210119
models/qrcode_wechatqrcode/demo.py
CHANGED
@@ -21,13 +21,13 @@ def str2bool(v):
 
 parser = argparse.ArgumentParser(
     description="WeChat QR code detector for detecting and parsing QR code (https://github.com/opencv/opencv_contrib/tree/master/modules/wechat_qrcode)")
-parser.add_argument('--input', '-i', type=str, help='
-parser.add_argument('--detect_prototxt_path', type=str, default='detect_2021sep.prototxt', help='
-parser.add_argument('--detect_model_path', type=str, default='detect_2021sep.caffemodel', help='
-parser.add_argument('--sr_prototxt_path', type=str, default='sr_2021sep.prototxt', help='
-parser.add_argument('--sr_model_path', type=str, default='sr_2021sep.caffemodel', help='
-parser.add_argument('--save', '-s', type=str2bool, default=False, help='Set
-parser.add_argument('--vis', '-v', type=str2bool, default=True, help='
+parser.add_argument('--input', '-i', type=str, help='Usage: Set path to the input image. Omit for using default camera.')
+parser.add_argument('--detect_prototxt_path', type=str, default='detect_2021sep.prototxt', help='Usage: Set path to detect.prototxt.')
+parser.add_argument('--detect_model_path', type=str, default='detect_2021sep.caffemodel', help='Usage: Set path to detect.caffemodel.')
+parser.add_argument('--sr_prototxt_path', type=str, default='sr_2021sep.prototxt', help='Usage: Set path to sr.prototxt.')
+parser.add_argument('--sr_model_path', type=str, default='sr_2021sep.caffemodel', help='Usage: Set path to sr.caffemodel.')
+parser.add_argument('--save', '-s', type=str2bool, default=False, help='Usage: Set “True” to save file with results (i.e. bounding box, confidence level). Invalid in case of camera input. Default will be set to “False”.')
+parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Usage: Default will be set to “True” and will open a new window to show results. Set to “False” to stop visualizations from being shown. Invalid in case of camera input.')
 args = parser.parse_args()
 
 def visualize(image, res, points, points_color=(0, 255, 0), text_color=(0, 255, 0), fps=None):
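The `--save` and `--vis` flags above are parsed with the `str2bool` helper named in the hunk header (`def str2bool(v)`). The repository's exact implementation is not shown in this diff, but a converter of that kind typically looks like the following sketch:

```python
# Sketch of a str2bool argparse converter; the helper in demo.py may differ in detail.
import argparse

def str2bool(v):
    # Accept common spellings so "--save True" and "--save false" both parse cleanly.
    if isinstance(v, bool):
        return v
    if v.lower() in ("true", "t", "yes", "y", "1"):
        return True
    if v.lower() in ("false", "f", "no", "n", "0"):
        return False
    raise argparse.ArgumentTypeError("Boolean value expected.")

parser = argparse.ArgumentParser()
parser.add_argument("--save", "-s", type=str2bool, default=False)
print(parser.parse_args(["--save", "True"]).save)   # -> True
```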
models/text_detection_db/README.md
CHANGED
@@ -3,6 +3,7 @@
 Real-time Scene Text Detection with Differentiable Binarization
 
 Note:
+
 - Models source: [here](https://drive.google.com/drive/folders/1qzNCHfUJOS0NEUOIKn69eCtxdlNPpWbq).
 - `IC15` in the filename means the model is trained on [IC15 dataset](https://rrc.cvc.uab.es/?ch=4&com=introduction), which can detect English text instances only.
 - `TD500` in the filename means the model is trained on [TD500 dataset](http://www.iapr-tc11.org/mediawiki/index.php/MSRA_Text_Detection_500_Database_(MSRA-TD500)), which can detect both English & Chinese instances.
@@ -11,12 +12,17 @@ Note:
 ## Demo
 
 Run the following command to try the demo:
+
 ```shell
 # detect on camera input
 python demo.py
 # detect on an image
 python demo.py --input /path/to/image
+
+# get help regarding various parameters
+python demo.py --help
 ```
+
 ### Example outputs
 
 
@@ -31,4 +37,4 @@ All files in this directory are licensed under [Apache 2.0 License](./LICENSE).
 
 - https://arxiv.org/abs/1911.08947
 - https://github.com/MhLiao/DB
-- https://docs.opencv.org/master/d4/d43/tutorial_dnn_text_spotting.html
+- https://docs.opencv.org/master/d4/d43/tutorial_dnn_text_spotting.html
models/text_detection_db/demo.py
CHANGED
@@ -29,23 +29,23 @@ try:
     help_msg_backends += "; {:d}: TIMVX"
     help_msg_targets += "; {:d}: NPU"
 except:
-    print('This version of OpenCV does not support TIM-VX and NPU. Visit https://
+    print('This version of OpenCV does not support TIM-VX and NPU. Visit https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more information.')
 
 parser = argparse.ArgumentParser(description='Real-time Scene Text Detection with Differentiable Binarization (https://arxiv.org/abs/1911.08947).')
-parser.add_argument('--input', '-i', type=str, help='
-parser.add_argument('--model', '-m', type=str, default='text_detection_DB_TD500_resnet18_2021sep.onnx', help='
+parser.add_argument('--input', '-i', type=str, help='Usage: Set path to the input image. Omit for using default camera.')
+parser.add_argument('--model', '-m', type=str, default='text_detection_DB_TD500_resnet18_2021sep.onnx', help='Usage: Set model path, defaults to text_detection_DB_TD500_resnet18_2021sep.onnx.')
 parser.add_argument('--backend', '-b', type=int, default=backends[0], help=help_msg_backends.format(*backends))
 parser.add_argument('--target', '-t', type=int, default=targets[0], help=help_msg_targets.format(*targets))
 parser.add_argument('--width', type=int, default=736,
-                    help='
+                    help='Usage: Resize input image to certain width, default = 736. It should be multiple by 32.')
 parser.add_argument('--height', type=int, default=736,
-                    help='
+                    help='Usage: Resize input image to certain height, default = 736. It should be multiple by 32.')
-parser.add_argument('--binary_threshold', type=float, default=0.3, help='Threshold of the binary map.')
-parser.add_argument('--polygon_threshold', type=float, default=0.5, help='Threshold of polygons.')
-parser.add_argument('--max_candidates', type=int, default=200, help='
-parser.add_argument('--unclip_ratio', type=np.float64, default=2.0, help=' The unclip ratio of the detected text region, which determines the output size.')
-parser.add_argument('--save', '-s', type=str, default=False, help='Set
-parser.add_argument('--vis', '-v', type=str2bool, default=True, help='
+parser.add_argument('--binary_threshold', type=float, default=0.3, help='Usage: Threshold of the binary map, default = 0.3.')
+parser.add_argument('--polygon_threshold', type=float, default=0.5, help='Usage: Threshold of polygons, default = 0.5.')
+parser.add_argument('--max_candidates', type=int, default=200, help='Usage: Set maximum number of polygon candidates, default = 200.')
+parser.add_argument('--unclip_ratio', type=np.float64, default=2.0, help=' Usage: The unclip ratio of the detected text region, which determines the output size, default = 2.0.')
+parser.add_argument('--save', '-s', type=str, default=False, help='Usage: Set “True” to save file with results (i.e. bounding box, confidence level). Invalid in case of camera input. Default will be set to “False”.')
+parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Usage: Default will be set to “True” and will open a new window to show results. Set to “False” to stop visualizations from being shown. Invalid in case of camera input.')
 args = parser.parse_args()
 
 def visualize(image, results, box_color=(0, 255, 0), text_color=(0, 0, 255), isClosed=True, thickness=2, fps=None):
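The `--binary_threshold`, `--polygon_threshold`, `--max_candidates` and `--unclip_ratio` flags above are the standard DB post-processing parameters. The same parameters also exist on OpenCV's high-level `TextDetectionModel_DB` API; the sketch below shows how they map, assuming OpenCV ≥ 4.5, the ONNX file from this directory, and a hypothetical `text.jpg` input (the normalization values follow the OpenCV text-spotting tutorial, not this demo's code).

```python
# Sketch: mapping the demo's DB post-processing flags onto OpenCV's
# TextDetectionModel_DB setters (values shown are the demo's defaults).
import cv2 as cv

model = cv.dnn_TextDetectionModel_DB("text_detection_DB_TD500_resnet18_2021sep.onnx")
model.setBinaryThreshold(0.3)     # --binary_threshold
model.setPolygonThreshold(0.5)    # --polygon_threshold
model.setMaxCandidates(200)       # --max_candidates
model.setUnclipRatio(2.0)         # --unclip_ratio
# 736x736 input (a multiple of 32); normalization per the OpenCV DB tutorial.
model.setInputParams(scale=1.0 / 255,
                     size=(736, 736),
                     mean=(122.67891434, 116.66876762, 104.00698793))

img = cv.imread("text.jpg")       # hypothetical input image
polygons, confidences = model.detect(img)
print(len(polygons), "text regions found")
```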
models/text_recognition_crnn/README.md
CHANGED
@@ -5,7 +5,7 @@ An End-to-End Trainable Neural Network for Image-based Sequence Recognition and
 Results of accuracy evaluation with [tools/eval](../../tools/eval) at different text recognition datasets.
 
 | Model name | ICDAR03(%) | IIIT5k(%) | CUTE80(%) |
+| ------------ | ---------- | --------- | --------- |
 | CRNN_EN | 81.66 | 74.33 | 52.78 |
 | CRNN_EN_FP16 | 82.01 | 74.93 | 52.34 |
 | CRNN_EN_INT8 | 81.75 | 75.33 | 52.43 |
@@ -16,10 +16,11 @@ Results of accuracy evaluation with [tools/eval](../../tools/eval) at different
 \*: 'FP16' or 'INT8' stands for 'model quantized into FP16' or 'model quantized into int8'
 
 Note:
+
 - Model source:
+  - `text_recognition_CRNN_EN_2021sep.onnx`: https://docs.opencv.org/4.5.2/d9/d1e/tutorial_dnn_OCR.html (CRNN_VGG_BiLSTM_CTC.onnx)
+  - `text_recognition_CRNN_CH_2021sep.onnx`: https://docs.opencv.org/4.x/d4/d43/tutorial_dnn_text_spotting.html (crnn_cs.onnx)
+  - `text_recognition_CRNN_CN_2021nov.onnx`: https://docs.opencv.org/4.5.2/d4/d43/tutorial_dnn_text_spotting.html (crnn_cs_CN.onnx)
 - `text_recognition_CRNN_EN_2021sep.onnx` can detect digits (0\~9) and letters (return lowercase letters a\~z) (view `charset_36_EN.txt` for details).
 - `text_recognition_CRNN_CH_2021sep.onnx` can detect digits (0\~9), upper/lower-case letters (a\~z and A\~Z), and some special characters (view `charset_94_CH.txt` for details).
 - `text_recognition_CRNN_CN_2021nov.onnx` can detect digits (0\~9), upper/lower-case letters (a\~z and A\~Z), some Chinese characters and some special characters (view `charset_3944_CN.txt` for details).
@@ -28,26 +29,35 @@ Note:
 ## Demo
 
 ***NOTE***:
+
 - This demo uses [text_detection_db](../text_detection_db) as text detector.
 - The selected model must match the charset:
+  - Try `text_recognition_CRNN_EN_2021sep.onnx` with `charset_36_EN.txt`.
+  - Try `text_recognition_CRNN_CH_2021sep.onnx` with `charset_94_CH.txt`.
+  - Try `text_recognition_CRNN_CN_2021nov.onnx` with `charset_3944_CN.txt`.
 
 Run the demo detecting English:
+
 ```shell
 # detect on camera input
 python demo.py
 # detect on an image
 python demo.py --input /path/to/image
+
+# get help regarding various parameters
+python demo.py --help
 ```
 
 Run the demo detecting Chinese:
+
 ```shell
 # detect on camera input
 python demo.py --model text_recognition_CRNN_CN_2021nov.onnx --charset charset_3944_CN.txt
 # detect on an image
 python demo.py --input /path/to/image --model text_recognition_CRNN_CN_2021nov.onnx --charset charset_3944_CN.txt
+
+# get help regarding various parameters
+python demo.py --help
 ```
 
 ### Examples
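The model/charset pairing listed above also holds when these ONNX files are used through OpenCV's high-level `TextRecognitionModel` API. The following sketch assumes OpenCV ≥ 4.5, the English model and charset from this directory, and an already-cropped word image (`word_crop.jpg` is a placeholder); in the demo itself, the crops come from the DB text detector. The normalization values follow the OpenCV text-spotting tutorial rather than this demo's code.

```python
# Sketch: CRNN recognition with a matching charset via cv.dnn_TextRecognitionModel.
import cv2 as cv

recognizer = cv.dnn_TextRecognitionModel("text_recognition_CRNN_EN_2021sep.onnx")
recognizer.setDecodeType("CTC-greedy")
with open("charset_36_EN.txt", "r", encoding="utf-8") as f:
    recognizer.setVocabulary([line.rstrip("\n") for line in f])
# Normalization per the OpenCV text-spotting tutorial for CRNN models.
recognizer.setInputParams(scale=1.0 / 127.5, size=(100, 32), mean=(127.5, 127.5, 127.5))

word = cv.imread("word_crop.jpg")   # hypothetical cropped text region
print(recognizer.recognize(word))
```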
models/text_recognition_crnn/demo.py
CHANGED
@@ -33,17 +33,17 @@ try:
     help_msg_backends += "; {:d}: TIMVX"
     help_msg_targets += "; {:d}: NPU"
 except:
-    print('This version of OpenCV does not support TIM-VX and NPU. Visit https://
+    print('This version of OpenCV does not support TIM-VX and NPU. Visit https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more information.')
 
 parser = argparse.ArgumentParser(
     description="An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition (https://arxiv.org/abs/1507.05717)")
-parser.add_argument('--input', '-i', type=str, help='
-parser.add_argument('--model', '-m', type=str, default='text_recognition_CRNN_EN_2021sep.onnx', help='
+parser.add_argument('--input', '-i', type=str, help='Usage: Set path to the input image. Omit for using default camera.')
+parser.add_argument('--model', '-m', type=str, default='text_recognition_CRNN_EN_2021sep.onnx', help='Usage: Set model path, defaults to text_recognition_CRNN_EN_2021sep.onnx.')
 parser.add_argument('--backend', '-b', type=int, default=backends[0], help=help_msg_backends.format(*backends))
 parser.add_argument('--target', '-t', type=int, default=targets[0], help=help_msg_targets.format(*targets))
-parser.add_argument('--charset', '-c', type=str, default='charset_36_EN.txt', help='
-parser.add_argument('--save', '-s', type=str, default=False, help='Set
-parser.add_argument('--vis', '-v', type=str2bool, default=True, help='
+parser.add_argument('--charset', '-c', type=str, default='charset_36_EN.txt', help='Usage: Set the path to the charset file corresponding to the selected model.')
+parser.add_argument('--save', '-s', type=str, default=False, help='Usage: Set “True” to save a file with results. Invalid in case of camera input. Default will be set to “False”.')
+parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Usage: Default will be set to “True” and will open a new window to show results. Set to “False” to stop visualizations from being shown. Invalid in case of camera input.')
 parser.add_argument('--width', type=int, default=736,
                     help='Preprocess input image by resizing to a specific width. It should be multiple by 32.')
 parser.add_argument('--height', type=int, default=736,
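Both demo scripts guard the TIM-VX/NPU options with the try/except shown above, so the extra backend and target are only offered when the installed OpenCV build exposes them. The pattern is roughly the following sketch; the real scripts build their lists and help strings somewhat differently.

```python
# Sketch of the optional-backend pattern: TIM-VX/NPU constants are appended only
# if the installed OpenCV build defines them; otherwise an informational note is printed.
import cv2 as cv

backends = [cv.dnn.DNN_BACKEND_OPENCV, cv.dnn.DNN_BACKEND_CUDA]
targets = [cv.dnn.DNN_TARGET_CPU, cv.dnn.DNN_TARGET_CUDA]
try:
    backends.append(cv.dnn.DNN_BACKEND_TIMVX)
    targets.append(cv.dnn.DNN_TARGET_NPU)
except AttributeError:
    print("This version of OpenCV does not support TIM-VX and NPU. "
          "Visit https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more information.")

print("available backends:", backends)
print("available targets:", targets)
```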