Satyam Goyal committed on
Commit
e7d94f5
·
1 Parent(s): 87cd14e

Merge pull request #95 from Satgoy152:adding-doc


Improved help messages for demo programs (#95)
- Added Demo Documentation
- Updated help messages
- Changed exception link
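As a rough illustration of the new help-message style (the exact strings are in the per-file diffs below), each argparse option now carries a "Usage: ..." sentence that states the default value and notes when the flag is ignored. The snippet below is only a sketch of that convention, reusing option names that appear in the demos, not code from any one file:

```python
import argparse

# Illustrative only: mirrors the "Usage: ..." help-string convention adopted in this PR.
parser = argparse.ArgumentParser(description='Demo help-message style')
parser.add_argument('--input', '-i', type=str,
                    help='Usage: Set input to a certain image, omit if using camera.')
parser.add_argument('--save', '-s', type=str, default=False,
                    help='Usage: Set "True" to save file with results. Invalid in case of camera input. Default will be set to "False".')
parser.print_help()
```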

README.md CHANGED
@@ -3,19 +3,20 @@
3
  A zoo for models tuned for OpenCV DNN with benchmarks on different platforms.
4
 
5
  Guidelines:
 
6
  - Clone this repo to download all models and demo scripts:
7
- ```shell
8
- # Install git-lfs from https://git-lfs.github.com/
9
- git clone https://github.com/opencv/opencv_zoo && cd opencv_zoo
10
- git lfs install
11
- git lfs pull
12
- ```
13
  - To run benchmarks on your hardware settings, please refer to [benchmark/README](./benchmark/README.md).
14
 
15
  ## Models & Benchmark Results
16
 
17
- | Model | Task | Input Size | INTEL-CPU (ms) | RPI-CPU (ms) | JETSON-GPU (ms) | KV3-NPU (ms) | D1-CPU (ms) |
18
- |---------------------------------------------------------|-------------------------------|------------|----------------|--------------|-----------------|--------------|-------------|
19
  | [YuNet](./models/face_detection_yunet) | Face Detection | 160x120 | 1.45 | 6.22 | 12.18 | 4.04 | 86.69 |
20
  | [SFace](./models/face_recognition_sface) | Face Recognition | 112x112 | 8.65 | 99.20 | 24.88 | 46.25 | --- |
21
  | [LPD-YuNet](./models/license_plate_detection_yunet/) | License Plate Detection | 320x240 | --- | 168.03 | 56.12 | 154.20\* | |
@@ -36,6 +37,7 @@ Guidelines:
36
  \*: Models are quantized in per-channel mode, which run slower than per-tensor quantized models on NPU.
37
 
38
  Hardware Setup:
 
39
  - `INTEL-CPU`: [Intel Core i7-5930K](https://www.intel.com/content/www/us/en/products/sku/82931/intel-core-i75930k-processor-15m-cache-up-to-3-70-ghz/specifications.html) @ 3.50GHz, 6 cores, 12 threads.
40
  - `RPI-CPU`: [Raspberry Pi 4B](https://www.raspberrypi.com/products/raspberry-pi-4-model-b/specifications/), Broadcom BCM2711, Quad core Cortex-A72 (ARM v8) 64-bit SoC @ 1.5GHz.
41
  - `JETSON-GPU`: [NVIDIA Jetson Nano B01](https://developer.nvidia.com/embedded/jetson-nano-developer-kit), 128-core NVIDIA Maxwell GPU.
@@ -43,6 +45,7 @@ Hardware Setup:
43
  - `D1-CPU`: [Allwinner D1](https://d1.docs.aw-ol.com/en), [Xuantie C906 CPU](https://www.t-head.cn/product/C906?spm=a2ouz.12986968.0.0.7bfc1384auGNPZ) (RISC-V, RVV 0.7.1) @ 1.0GHz, 1 core. YuNet is supported for now. Visit [here](https://github.com/fengyuentau/opencv_zoo_cpp) for more details.
44
 
45
  ***Important Notes***:
 
46
  - The data under each column of hardware setups on the above table represents the elapsed time of an inference (preprocess, forward and postprocess).
47
  - The time data is the median of 10 runs after some warmup runs. Different metrics may be applied to some specific models.
48
  - Batch size is 1 for all benchmark results.
@@ -52,6 +55,7 @@ Hardware Setup:
52
  ## Some Examples
53
 
54
  Some examples are listed below. You can find more in the directory of each model!
 
55
  ### Face Detection with [YuNet](./models/face_detection_yunet/)
56
 
57
  ![largest selfie](./models/face_detection_yunet/examples/largest_selfie.jpg)
 
3
  A zoo for models tuned for OpenCV DNN with benchmarks on different platforms.
4
 
5
  Guidelines:
6
+
7
  - Clone this repo to download all models and demo scripts:
8
+ ```shell
9
+ # Install git-lfs from https://git-lfs.github.com/
10
+ git clone https://github.com/opencv/opencv_zoo && cd opencv_zoo
11
+ git lfs install
12
+ git lfs pull
13
+ ```
14
  - To run benchmarks on your hardware settings, please refer to [benchmark/README](./benchmark/README.md).
15
 
16
  ## Models & Benchmark Results
17
 
18
+ | Model | Task | Input Size | INTEL-CPU (ms) | RPI-CPU (ms) | JETSON-GPU (ms) | KV3-NPU (ms) | D1-CPU (ms) |
19
+ | ---------------------------------------------------- | ----------------------------- | ---------- | -------------- | ------------ | --------------- | ------------ | ----------- |
20
  | [YuNet](./models/face_detection_yunet) | Face Detection | 160x120 | 1.45 | 6.22 | 12.18 | 4.04 | 86.69 |
21
  | [SFace](./models/face_recognition_sface) | Face Recognition | 112x112 | 8.65 | 99.20 | 24.88 | 46.25 | --- |
22
  | [LPD-YuNet](./models/license_plate_detection_yunet/) | License Plate Detection | 320x240 | --- | 168.03 | 56.12 | 154.20\* | |
 
37
  \*: Models are quantized in per-channel mode, which run slower than per-tensor quantized models on NPU.
38
 
39
  Hardware Setup:
40
+
41
  - `INTEL-CPU`: [Intel Core i7-5930K](https://www.intel.com/content/www/us/en/products/sku/82931/intel-core-i75930k-processor-15m-cache-up-to-3-70-ghz/specifications.html) @ 3.50GHz, 6 cores, 12 threads.
42
  - `RPI-CPU`: [Raspberry Pi 4B](https://www.raspberrypi.com/products/raspberry-pi-4-model-b/specifications/), Broadcom BCM2711, Quad core Cortex-A72 (ARM v8) 64-bit SoC @ 1.5GHz.
43
  - `JETSON-GPU`: [NVIDIA Jetson Nano B01](https://developer.nvidia.com/embedded/jetson-nano-developer-kit), 128-core NVIDIA Maxwell GPU.
 
45
  - `D1-CPU`: [Allwinner D1](https://d1.docs.aw-ol.com/en), [Xuantie C906 CPU](https://www.t-head.cn/product/C906?spm=a2ouz.12986968.0.0.7bfc1384auGNPZ) (RISC-V, RVV 0.7.1) @ 1.0GHz, 1 core. YuNet is supported for now. Visit [here](https://github.com/fengyuentau/opencv_zoo_cpp) for more details.
46
 
47
  ***Important Notes***:
48
+
49
  - The data under each column of hardware setups on the above table represents the elapsed time of an inference (preprocess, forward and postprocess).
50
  - The time data is the median of 10 runs after some warmup runs. Different metrics may be applied to some specific models.
51
  - Batch size is 1 for all benchmark results.
 
55
  ## Some Examples
56
 
57
  Some examples are listed below. You can find more in the directory of each model!
58
+
59
  ### Face Detection with [YuNet](./models/face_detection_yunet/)
60
 
61
  ![largest selfie](./models/face_detection_yunet/examples/largest_selfie.jpg)
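For readers who want to reproduce numbers like those in the table above, the notes describe the methodology: batch size 1, median of 10 runs after some warmup runs, timing preprocess, forward and postprocess together. A minimal sketch of that measurement loop (an illustration of the stated methodology, not the actual harness in benchmark/):

```python
import time
from statistics import median

def benchmark(run_inference, image, warmup=3, repeat=10):
    """Median elapsed time (ms) of one full inference: preprocess + forward + postprocess."""
    for _ in range(warmup):
        run_inference(image)                     # warmup runs are discarded
    times_ms = []
    for _ in range(repeat):
        start = time.perf_counter()
        run_inference(image)                     # batch size is 1
        times_ms.append((time.perf_counter() - start) * 1000.0)
    return median(times_ms)
```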
models/face_detection_yunet/README.md CHANGED
@@ -3,14 +3,15 @@
3
  YuNet is a light-weight, fast and accurate face detection model, which achieves 0.834(AP_easy), 0.824(AP_medium), 0.708(AP_hard) on the WIDER Face validation set.
4
 
5
  Notes:
 
6
  - Model source: [here](https://github.com/ShiqiYu/libfacedetection.train/blob/a61a428929148171b488f024b5d6774f93cdbc13/tasks/task1/onnx/yunet.onnx).
7
  - For details on training this model, please visit https://github.com/ShiqiYu/libfacedetection.train.
8
  - This ONNX model has fixed input shape, but OpenCV DNN infers on the exact shape of input image. See https://github.com/opencv/opencv_zoo/issues/44 for more information.
9
 
10
  Results of accuracy evaluation with [tools/eval](../../tools/eval).
11
 
12
- | Models | Easy AP | Medium AP | Hard AP |
13
- |-------------|---------|-----------|---------|
14
  | YuNet | 0.8498 | 0.8384 | 0.7357 |
15
  | YuNet quant | 0.7751 | 0.8145 | 0.7312 |
16
 
@@ -19,11 +20,15 @@ Results of accuracy evaluation with [tools/eval](../../tools/eval).
19
  ## Demo
20
 
21
  Run the following command to try the demo:
 
22
  ```shell
23
  # detect on camera input
24
  python demo.py
25
  # detect on an image
26
  python demo.py --input /path/to/image
 
 
 
27
  ```
28
 
29
  ### Example outputs
 
3
  YuNet is a light-weight, fast and accurate face detection model, which achieves 0.834(AP_easy), 0.824(AP_medium), 0.708(AP_hard) on the WIDER Face validation set.
4
 
5
  Notes:
6
+
7
  - Model source: [here](https://github.com/ShiqiYu/libfacedetection.train/blob/a61a428929148171b488f024b5d6774f93cdbc13/tasks/task1/onnx/yunet.onnx).
8
  - For details on training this model, please visit https://github.com/ShiqiYu/libfacedetection.train.
9
  - This ONNX model has fixed input shape, but OpenCV DNN infers on the exact shape of input image. See https://github.com/opencv/opencv_zoo/issues/44 for more information.
10
 
11
  Results of accuracy evaluation with [tools/eval](../../tools/eval).
12
 
13
+ | Models | Easy AP | Medium AP | Hard AP |
14
+ | ----------- | ------- | --------- | ------- |
15
  | YuNet | 0.8498 | 0.8384 | 0.7357 |
16
  | YuNet quant | 0.7751 | 0.8145 | 0.7312 |
17
 
 
20
  ## Demo
21
 
22
  Run the following command to try the demo:
23
+
24
  ```shell
25
  # detect on camera input
26
  python demo.py
27
  # detect on an image
28
  python demo.py --input /path/to/image
29
+
30
+ # get help regarding various parameters
31
+ python demo.py --help
32
  ```
33
 
34
  ### Example outputs
models/face_detection_yunet/demo.py CHANGED
@@ -22,25 +22,25 @@ def str2bool(v):
22
  backends = [cv.dnn.DNN_BACKEND_OPENCV, cv.dnn.DNN_BACKEND_CUDA]
23
  targets = [cv.dnn.DNN_TARGET_CPU, cv.dnn.DNN_TARGET_CUDA, cv.dnn.DNN_TARGET_CUDA_FP16]
24
  help_msg_backends = "Choose one of the computation backends: {:d}: OpenCV implementation (default); {:d}: CUDA"
25
- help_msg_targets = "Chose one of the target computation devices: {:d}: CPU (default); {:d}: CUDA; {:d}: CUDA fp16"
26
  try:
27
  backends += [cv.dnn.DNN_BACKEND_TIMVX]
28
  targets += [cv.dnn.DNN_TARGET_NPU]
29
  help_msg_backends += "; {:d}: TIMVX"
30
  help_msg_targets += "; {:d}: NPU"
31
  except:
32
- print('This version of OpenCV does not support TIM-VX and NPU. Visit https://gist.github.com/fengyuentau/5a7a5ba36328f2b763aea026c43fa45f for more information.')
33
 
34
  parser = argparse.ArgumentParser(description='YuNet: A Fast and Accurate CNN-based Face Detector (https://github.com/ShiqiYu/libfacedetection).')
35
- parser.add_argument('--input', '-i', type=str, help='Path to the input image. Omit for using default camera.')
36
- parser.add_argument('--model', '-m', type=str, default='face_detection_yunet_2022mar.onnx', help='Path to the model.')
37
  parser.add_argument('--backend', '-b', type=int, default=backends[0], help=help_msg_backends.format(*backends))
38
  parser.add_argument('--target', '-t', type=int, default=targets[0], help=help_msg_targets.format(*targets))
39
- parser.add_argument('--conf_threshold', type=float, default=0.9, help='Filter out faces of confidence < conf_threshold.')
40
- parser.add_argument('--nms_threshold', type=float, default=0.3, help='Suppress bounding boxes of iou >= nms_threshold.')
41
- parser.add_argument('--top_k', type=int, default=5000, help='Keep top_k bounding boxes before NMS.')
42
- parser.add_argument('--save', '-s', type=str, default=False, help='Set true to save results. This flag is invalid when using camera.')
43
- parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Set true to open a window for result visualization. This flag is invalid when using camera.')
44
  args = parser.parse_args()
45
 
46
  def visualize(image, results, box_color=(0, 255, 0), text_color=(0, 0, 255), fps=None):
 
22
  backends = [cv.dnn.DNN_BACKEND_OPENCV, cv.dnn.DNN_BACKEND_CUDA]
23
  targets = [cv.dnn.DNN_TARGET_CPU, cv.dnn.DNN_TARGET_CUDA, cv.dnn.DNN_TARGET_CUDA_FP16]
24
  help_msg_backends = "Choose one of the computation backends: {:d}: OpenCV implementation (default); {:d}: CUDA"
25
+ help_msg_targets = "Choose one of the target computation devices: {:d}: CPU (default); {:d}: CUDA; {:d}: CUDA fp16"
26
  try:
27
  backends += [cv.dnn.DNN_BACKEND_TIMVX]
28
  targets += [cv.dnn.DNN_TARGET_NPU]
29
  help_msg_backends += "; {:d}: TIMVX"
30
  help_msg_targets += "; {:d}: NPU"
31
  except:
32
+ print('This version of OpenCV does not support TIM-VX and NPU. Visit https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more information.')
33
 
34
  parser = argparse.ArgumentParser(description='YuNet: A Fast and Accurate CNN-based Face Detector (https://github.com/ShiqiYu/libfacedetection).')
35
+ parser.add_argument('--input', '-i', type=str, help='Usage: Set input to a certain image, omit if using camera.')
36
+ parser.add_argument('--model', '-m', type=str, default='face_detection_yunet_2022mar.onnx', help="Usage: Set model type, defaults to 'face_detection_yunet_2022mar.onnx'.")
37
  parser.add_argument('--backend', '-b', type=int, default=backends[0], help=help_msg_backends.format(*backends))
38
  parser.add_argument('--target', '-t', type=int, default=targets[0], help=help_msg_targets.format(*targets))
39
+ parser.add_argument('--conf_threshold', type=float, default=0.9, help='Usage: Set the minimum needed confidence for the model to identify a face, defaults to 0.9. Smaller values may result in faster detection, but will limit accuracy. Filter out faces of confidence < conf_threshold.')
40
+ parser.add_argument('--nms_threshold', type=float, default=0.3, help='Usage: Suppress bounding boxes of iou >= nms_threshold. Default = 0.3.')
41
+ parser.add_argument('--top_k', type=int, default=5000, help='Usage: Keep top_k bounding boxes before NMS.')
42
+ parser.add_argument('--save', '-s', type=str, default=False, help='Usage: Set “True” to save file with results (i.e. bounding box, confidence level). Invalid in case of camera input. Default will be set to “False”.')
43
+ parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Usage: Default will be set to “True” and will open a new window to show results. Set to “False” to stop visualizations from being shown. Invalid in case of camera input.')
44
  args = parser.parse_args()
45
 
46
  def visualize(image, results, box_color=(0, 255, 0), text_color=(0, 0, 255), fps=None):
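The `--vis` option above is parsed with the `str2bool` helper named in the hunk header (`def str2bool(v):`); its body is not shown in this diff, so the following is only a typical sketch of such a converter, not the exact code in demo.py:

```python
import argparse

def str2bool(v):
    # Accept common spellings so e.g. `--vis False` works from the shell.
    if isinstance(v, bool):
        return v
    if v.lower() in ('yes', 'true', 't', 'y', '1'):
        return True
    if v.lower() in ('no', 'false', 'f', 'n', '0'):
        return False
    raise argparse.ArgumentTypeError('Boolean value expected.')
```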
models/face_recognition_sface/README.md CHANGED
@@ -3,30 +3,33 @@
3
  SFace: Sigmoid-Constrained Hypersphere Loss for Robust Face Recognition
4
 
5
  Note:
 
6
  - SFace is contributed by [Yaoyao Zhong](https://github.com/zhongyy/SFace).
7
  - [face_recognition_sface_2021sep.onnx](./face_recognition_sface_2021sep.onnx) is converted from the model from https://github.com/zhongyy/SFace thanks to [Chengrui Wang](https://github.com/crywang).
8
  - Support 5-landmark warpping for now (2021sep)
9
 
10
  Results of accuracy evaluation with [tools/eval](../../tools/eval).
11
 
12
- | Models | Accuracy |
13
- |-------------|----------|
14
  | SFace | 0.9940 |
15
  | SFace quant | 0.9932 |
16
 
17
  \*: 'quant' stands for 'quantized'.
18
 
19
-
20
  ## Demo
21
 
22
  ***NOTE***: This demo uses [../face_detection_yunet](../face_detection_yunet) as face detector, which supports 5-landmark detection for now (2021sep).
23
 
24
  Run the following command to try the demo:
 
25
  ```shell
26
  # recognize on images
27
  python demo.py --input1 /path/to/image1 --input2 /path/to/image2
28
- ```
29
 
 
 
 
30
 
31
  ## License
32
 
@@ -35,4 +38,4 @@ All files in this directory are licensed under [Apache 2.0 License](./LICENSE).
35
  ## Reference
36
 
37
  - https://ieeexplore.ieee.org/document/9318547
38
- - https://github.com/zhongyy/SFace
 
3
  SFace: Sigmoid-Constrained Hypersphere Loss for Robust Face Recognition
4
 
5
  Note:
6
+
7
  - SFace is contributed by [Yaoyao Zhong](https://github.com/zhongyy/SFace).
8
  - [face_recognition_sface_2021sep.onnx](./face_recognition_sface_2021sep.onnx) is converted from the model from https://github.com/zhongyy/SFace thanks to [Chengrui Wang](https://github.com/crywang).
9
  - Support 5-landmark warpping for now (2021sep)
10
 
11
  Results of accuracy evaluation with [tools/eval](../../tools/eval).
12
 
13
+ | Models | Accuracy |
14
+ | ----------- | -------- |
15
  | SFace | 0.9940 |
16
  | SFace quant | 0.9932 |
17
 
18
  \*: 'quant' stands for 'quantized'.
19
 
 
20
  ## Demo
21
 
22
  ***NOTE***: This demo uses [../face_detection_yunet](../face_detection_yunet) as face detector, which supports 5-landmark detection for now (2021sep).
23
 
24
  Run the following command to try the demo:
25
+
26
  ```shell
27
  # recognize on images
28
  python demo.py --input1 /path/to/image1 --input2 /path/to/image2
 
29
 
30
+ # get help regarding various parameters
31
+ python demo.py --help
32
+ ```
33
 
34
  ## License
35
 
 
38
  ## Reference
39
 
40
  - https://ieeexplore.ieee.org/document/9318547
41
+ - https://github.com/zhongyy/SFace
models/face_recognition_sface/demo.py CHANGED
@@ -25,7 +25,7 @@ def str2bool(v):
25
 
26
  backends = [cv.dnn.DNN_BACKEND_OPENCV, cv.dnn.DNN_BACKEND_CUDA]
27
  targets = [cv.dnn.DNN_TARGET_CPU, cv.dnn.DNN_TARGET_CUDA, cv.dnn.DNN_TARGET_CUDA_FP16]
28
- help_msg_backends = "Choose one of the computation backends: {:d}: OpenCV implementation (default); {:d}: CUDA"
29
  help_msg_targets = "Chose one of the target computation devices: {:d}: CPU (default); {:d}: CUDA; {:d}: CUDA fp16"
30
  try:
31
  backends += [cv.dnn.DNN_BACKEND_TIMVX]
@@ -33,18 +33,18 @@ try:
33
  help_msg_backends += "; {:d}: TIMVX"
34
  help_msg_targets += "; {:d}: NPU"
35
  except:
36
- print('This version of OpenCV does not support TIM-VX and NPU. Visit https://gist.github.com/fengyuentau/5a7a5ba36328f2b763aea026c43fa45f for more information.')
37
 
38
  parser = argparse.ArgumentParser(
39
  description="SFace: Sigmoid-Constrained Hypersphere Loss for Robust Face Recognition (https://ieeexplore.ieee.org/document/9318547)")
40
- parser.add_argument('--input1', '-i1', type=str, help='Path to the input image 1.')
41
- parser.add_argument('--input2', '-i2', type=str, help='Path to the input image 2.')
42
- parser.add_argument('--model', '-m', type=str, default='face_recognition_sface_2021dec.onnx', help='Path to the model.')
43
  parser.add_argument('--backend', '-b', type=int, default=backends[0], help=help_msg_backends.format(*backends))
44
  parser.add_argument('--target', '-t', type=int, default=targets[0], help=help_msg_targets.format(*targets))
45
- parser.add_argument('--dis_type', type=int, choices=[0, 1], default=0, help='Distance type. \'0\': cosine, \'1\': norm_l1.')
46
- parser.add_argument('--save', '-s', type=str, default=False, help='Set true to save results. This flag is invalid when using camera.')
47
- parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Set true to open a window for result visualization. This flag is invalid when using camera.')
48
  args = parser.parse_args()
49
 
50
  if __name__ == '__main__':
 
25
 
26
  backends = [cv.dnn.DNN_BACKEND_OPENCV, cv.dnn.DNN_BACKEND_CUDA]
27
  targets = [cv.dnn.DNN_TARGET_CPU, cv.dnn.DNN_TARGET_CUDA, cv.dnn.DNN_TARGET_CUDA_FP16]
28
+ help_msg_backends = "Choose one of the computation backends: {:d}: OpenCV implementation (default); {:d}: CUDA \n Usage: Set backend DNN model, defaults to cv.dnn.DNN_BACKEND_OPENCV (int = 0). Based on your OpenCV version, it may or may not support cv.dnn.DNN_BACKEND_TIMVX. More details: [https://gist.github.com/fengyuentau/5a7a5ba36328f2b763aea026c43fa45f]"
29
  help_msg_targets = "Chose one of the target computation devices: {:d}: CPU (default); {:d}: CUDA; {:d}: CUDA fp16"
30
  try:
31
  backends += [cv.dnn.DNN_BACKEND_TIMVX]
 
33
  help_msg_backends += "; {:d}: TIMVX"
34
  help_msg_targets += "; {:d}: NPU"
35
  except:
36
+ print('This version of OpenCV does not support TIM-VX and NPU. Visit https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more information.')
37
 
38
  parser = argparse.ArgumentParser(
39
  description="SFace: Sigmoid-Constrained Hypersphere Loss for Robust Face Recognition (https://ieeexplore.ieee.org/document/9318547)")
40
+ parser.add_argument('--input1', '-i1', type=str, help='Usage: Set path to the input image 1 (original face).')
41
+ parser.add_argument('--input2', '-i2', type=str, help='Usage: Set path to the input image 2 (comparison face).')
42
+ parser.add_argument('--model', '-m', type=str, default='face_recognition_sface_2021dec.onnx', help='Usage: Set model path, defaults to face_recognition_sface_2021dec.onnx.')
43
  parser.add_argument('--backend', '-b', type=int, default=backends[0], help=help_msg_backends.format(*backends))
44
  parser.add_argument('--target', '-t', type=int, default=targets[0], help=help_msg_targets.format(*targets))
45
+ parser.add_argument('--dis_type', type=int, choices=[0, 1], default=0, help='Usage: Distance type. \'0\': cosine, \'1\': norm_l1. Defaults to \'0\'')
46
+ parser.add_argument('--save', '-s', type=str, default=False, help='Usage: Set “True” to save file with results (i.e. bounding box, confidence level). Invalid in case of camera input. Default will be set to “False”.')
47
+ parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Usage: Default will be set to “True” and will open a new window to show results. Set to “False” to stop visualizations from being shown. Invalid in case of camera input.')
48
  args = parser.parse_args()
49
 
50
  if __name__ == '__main__':
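The `--dis_type` option above selects how the two face embeddings are compared: '0' for cosine similarity and '1' for L1 norm distance. A small self-contained sketch of the two measures, assuming NumPy feature vectors (the demo itself delegates this step to the model wrapper):

```python
import numpy as np

def compare(feat1, feat2, dis_type=0):
    """dis_type 0: cosine similarity (higher = more similar);
    dis_type 1: L1 norm distance (lower = more similar)."""
    f1 = np.asarray(feat1, dtype=np.float32).ravel()
    f2 = np.asarray(feat2, dtype=np.float32).ravel()
    if dis_type == 0:
        return float(np.dot(f1, f2) / (np.linalg.norm(f1) * np.linalg.norm(f2) + 1e-12))
    return float(np.sum(np.abs(f1 - f2)))
```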
models/handpose_estimation_mediapipe/demo.py CHANGED
@@ -27,7 +27,7 @@ try:
27
  help_msg_backends += "; {:d}: TIMVX"
28
  help_msg_targets += "; {:d}: NPU"
29
  except:
30
- print('This version of OpenCV does not support TIM-VX and NPU. Visit https://gist.github.com/fengyuentau/5a7a5ba36328f2b763aea026c43fa45f for more information.')
31
 
32
  parser = argparse.ArgumentParser(description='Hand Pose Estimation from MediaPipe')
33
  parser.add_argument('--input', '-i', type=str, help='Path to the input image. Omit for using default camera.')
 
27
  help_msg_backends += "; {:d}: TIMVX"
28
  help_msg_targets += "; {:d}: NPU"
29
  except:
30
+ print('This version of OpenCV does not support TIM-VX and NPU. Visit https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more information.')
31
 
32
  parser = argparse.ArgumentParser(description='Hand Pose Estimation from MediaPipe')
33
  parser.add_argument('--input', '-i', type=str, help='Path to the input image. Omit for using default camera.')
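The try/except block visible in these hunks is the same backend-probing pattern used across all the demos: the TIM-VX backend and NPU target are appended only when the installed OpenCV build exposes those constants, otherwise the new wiki link is printed. Pulled out on its own (assembled from the context lines above, with the bare except narrowed to AttributeError for clarity):

```python
import cv2 as cv

backends = [cv.dnn.DNN_BACKEND_OPENCV, cv.dnn.DNN_BACKEND_CUDA]
targets = [cv.dnn.DNN_TARGET_CPU, cv.dnn.DNN_TARGET_CUDA, cv.dnn.DNN_TARGET_CUDA_FP16]
try:
    # These attributes exist only in OpenCV builds compiled with TIM-VX support.
    backends += [cv.dnn.DNN_BACKEND_TIMVX]
    targets += [cv.dnn.DNN_TARGET_NPU]
except AttributeError:
    print('This version of OpenCV does not support TIM-VX and NPU. '
          'Visit https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more information.')
```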
models/human_segmentation_pphumanseg/README.md CHANGED
@@ -5,14 +5,18 @@ This model is ported from [PaddleHub](https://github.com/PaddlePaddle/PaddleHub)
5
  ## Demo
6
 
7
  Run the following command to try the demo:
 
8
  ```shell
9
  # detect on camera input
10
  python demo.py
11
  # detect on an image
12
  python demo.py --input /path/to/image
 
 
 
13
  ```
14
 
15
- ## Example outputs
16
 
17
  ![webcam demo](./examples/pphumanseg_demo.gif)
18
 
@@ -26,4 +30,4 @@ All files in this directory are licensed under [Apache 2.0 License](./LICENSE).
26
 
27
  - https://arxiv.org/abs/1512.03385
28
  - https://github.com/opencv/opencv/tree/master/samples/dnn/dnn_model_runner/dnn_conversion/paddlepaddle
29
- - https://github.com/PaddlePaddle/PaddleHub
 
5
  ## Demo
6
 
7
  Run the following command to try the demo:
8
+
9
  ```shell
10
  # detect on camera input
11
  python demo.py
12
  # detect on an image
13
  python demo.py --input /path/to/image
14
+
15
+ # get help regarding various parameters
16
+ python demo.py --help
17
  ```
18
 
19
+ ### Example outputs
20
 
21
  ![webcam demo](./examples/pphumanseg_demo.gif)
22
 
 
30
 
31
  - https://arxiv.org/abs/1512.03385
32
  - https://github.com/opencv/opencv/tree/master/samples/dnn/dnn_model_runner/dnn_conversion/paddlepaddle
33
+ - https://github.com/PaddlePaddle/PaddleHub
models/human_segmentation_pphumanseg/demo.py CHANGED
@@ -29,15 +29,15 @@ try:
29
  help_msg_backends += "; {:d}: TIMVX"
30
  help_msg_targets += "; {:d}: NPU"
31
  except:
32
- print('This version of OpenCV does not support TIM-VX and NPU. Visit https://gist.github.com/fengyuentau/5a7a5ba36328f2b763aea026c43fa45f for more information.')
33
 
34
  parser = argparse.ArgumentParser(description='PPHumanSeg (https://github.com/PaddlePaddle/PaddleSeg/tree/release/2.2/contrib/PP-HumanSeg)')
35
- parser.add_argument('--input', '-i', type=str, help='Path to the input image. Omit for using default camera.')
36
- parser.add_argument('--model', '-m', type=str, default='human_segmentation_pphumanseg_2021oct.onnx', help='Path to the model.')
37
  parser.add_argument('--backend', '-b', type=int, default=backends[0], help=help_msg_backends.format(*backends))
38
  parser.add_argument('--target', '-t', type=int, default=targets[0], help=help_msg_targets.format(*targets))
39
- parser.add_argument('--save', '-s', type=str, default=False, help='Set true to save results. This flag is invalid when using camera.')
40
- parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Set true to open a window for result visualization. This flag is invalid when using camera.')
41
  args = parser.parse_args()
42
 
43
  def get_color_map_list(num_classes):
 
29
  help_msg_backends += "; {:d}: TIMVX"
30
  help_msg_targets += "; {:d}: NPU"
31
  except:
32
+ print('This version of OpenCV does not support TIM-VX and NPU. Visit https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more information.')
33
 
34
  parser = argparse.ArgumentParser(description='PPHumanSeg (https://github.com/PaddlePaddle/PaddleSeg/tree/release/2.2/contrib/PP-HumanSeg)')
35
+ parser.add_argument('--input', '-i', type=str, help='Usage: Set input path to a certain image, omit if using camera.')
36
+ parser.add_argument('--model', '-m', type=str, default='human_segmentation_pphumanseg_2021oct.onnx', help='Usage: Set model path, defaults to human_segmentation_pphumanseg_2021oct.onnx.')
37
  parser.add_argument('--backend', '-b', type=int, default=backends[0], help=help_msg_backends.format(*backends))
38
  parser.add_argument('--target', '-t', type=int, default=targets[0], help=help_msg_targets.format(*targets))
39
+ parser.add_argument('--save', '-s', type=str, default=False, help='Usage: Set “True” to save a file with results. Invalid in case of camera input. Default will be set to “False”.')
40
+ parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Usage: Default will be set to “True” and will open a new window to show results. Set to “False” to stop visualizations from being shown. Invalid in case of camera input.')
41
  args = parser.parse_args()
42
 
43
  def get_color_map_list(num_classes):
models/image_classification_mobilenet/README.md CHANGED
@@ -6,23 +6,27 @@ MobileNetV2: Inverted Residuals and Linear Bottlenecks
6
 
7
  Results of accuracy evaluation with [tools/eval](../../tools/eval).
8
 
9
- | Models | Top-1 Accuracy | Top-5 Accuracy |
10
- | ------ | -------------- | -------------- |
11
- | MobileNet V1 | 67.64 | 87.97 |
12
- | MobileNet V1 quant | 55.53 | 78.74 |
13
- | MobileNet V2 | 69.44 | 89.23 |
14
- | MobileNet V2 quant | 68.37 | 88.56 |
15
 
16
  \*: 'quant' stands for 'quantized'.
17
 
18
  ## Demo
19
 
20
  Run the following command to try the demo:
 
21
  ```shell
22
  # MobileNet V1
23
  python demo.py --input /path/to/image
24
  # MobileNet V2
25
  python demo.py --input /path/to/image --model v2
 
 
 
26
  ```
27
 
28
  ## License
@@ -35,4 +39,3 @@ All files in this directory are licensed under [Apache 2.0 License](./LICENSE).
35
  - MobileNet V2: https://arxiv.org/abs/1801.04381
36
  - MobileNet V1 weight and scripts for training: https://github.com/wjc852456/pytorch-mobilenet-v1
37
  - MobileNet V2 weight: https://github.com/onnx/models/tree/main/vision/classification/mobilenet
38
-
 
6
 
7
  Results of accuracy evaluation with [tools/eval](../../tools/eval).
8
 
9
+ | Models | Top-1 Accuracy | Top-5 Accuracy |
10
+ | ------------------ | -------------- | -------------- |
11
+ | MobileNet V1 | 67.64 | 87.97 |
12
+ | MobileNet V1 quant | 55.53 | 78.74 |
13
+ | MobileNet V2 | 69.44 | 89.23 |
14
+ | MobileNet V2 quant | 68.37 | 88.56 |
15
 
16
  \*: 'quant' stands for 'quantized'.
17
 
18
  ## Demo
19
 
20
  Run the following command to try the demo:
21
+
22
  ```shell
23
  # MobileNet V1
24
  python demo.py --input /path/to/image
25
  # MobileNet V2
26
  python demo.py --input /path/to/image --model v2
27
+
28
+ # get help regarding various parameters
29
+ python demo.py --help
30
  ```
31
 
32
  ## License
 
39
  - MobileNet V2: https://arxiv.org/abs/1801.04381
40
  - MobileNet V1 weight and scripts for training: https://github.com/wjc852456/pytorch-mobilenet-v1
41
  - MobileNet V2 weight: https://github.com/onnx/models/tree/main/vision/classification/mobilenet
 
models/image_classification_mobilenet/demo.py CHANGED
@@ -24,14 +24,14 @@ try:
24
  help_msg_backends += "; {:d}: TIMVX"
25
  help_msg_targets += "; {:d}: NPU"
26
  except:
27
- print('This version of OpenCV does not support TIM-VX and NPU. Visit https://gist.github.com/fengyuentau/5a7a5ba36328f2b763aea026c43fa45f for more information.')
28
 
29
  parser = argparse.ArgumentParser(description='Demo for MobileNet V1 & V2.')
30
- parser.add_argument('--input', '-i', type=str, help='Path to the input image.')
31
- parser.add_argument('--model', '-m', type=str, choices=['v1', 'v2', 'v1-q', 'v2-q'], default='v1', help='Which model to use, either v1 or v2.')
32
  parser.add_argument('--backend', '-b', type=int, default=backends[0], help=help_msg_backends.format(*backends))
33
  parser.add_argument('--target', '-t', type=int, default=targets[0], help=help_msg_targets.format(*targets))
34
- parser.add_argument('--label', '-l', type=str, default='./imagenet_labels.txt', help='Path to the dataset labels.')
35
  args = parser.parse_args()
36
 
37
  if __name__ == '__main__':
 
24
  help_msg_backends += "; {:d}: TIMVX"
25
  help_msg_targets += "; {:d}: NPU"
26
  except:
27
+ print('This version of OpenCV does not support TIM-VX and NPU. Visit https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more information.')
28
 
29
  parser = argparse.ArgumentParser(description='Demo for MobileNet V1 & V2.')
30
+ parser.add_argument('--input', '-i', type=str, help='Usage: Set input path to a certain image, omit if using camera.')
31
+ parser.add_argument('--model', '-m', type=str, choices=['v1', 'v2', 'v1-q', 'v2-q'], default='v1', help='Usage: Set model type, defaults to image_classification_mobilenetv1_2022apr.onnx (v1).')
32
  parser.add_argument('--backend', '-b', type=int, default=backends[0], help=help_msg_backends.format(*backends))
33
  parser.add_argument('--target', '-t', type=int, default=targets[0], help=help_msg_targets.format(*targets))
34
+ parser.add_argument('--label', '-l', type=str, default='./imagenet_labels.txt', help='Usage: Set path to the different labels that will be used during the detection. Default list found in imagenet_labels.txt')
35
  args = parser.parse_args()
36
 
37
  if __name__ == '__main__':
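The `--label` option above points at imagenet_labels.txt; conceptually, the demo maps the index of the highest class score back into that list. A rough sketch of that lookup, assuming one label per line and a 1-D score vector from the classifier (the helper name here is illustrative):

```python
import numpy as np

def top1_label(scores, label_path='./imagenet_labels.txt'):
    # scores: 1-D array of class confidences produced by the classifier.
    with open(label_path) as f:
        labels = [line.strip() for line in f]
    idx = int(np.argmax(scores))
    return labels[idx], float(scores[idx])
```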
models/image_classification_ppresnet/README.md CHANGED
@@ -6,18 +6,22 @@ This model is ported from [PaddleHub](https://github.com/PaddlePaddle/PaddleHub)
6
 
7
  Results of accuracy evaluation with [tools/eval](../../tools/eval).
8
 
9
- | Models | Top-1 Accuracy | Top-5 Accuracy |
10
- | ------ | -------------- | -------------- |
11
- | PP-ResNet | 82.28 | 96.15 |
12
- | PP-ResNet quant | 0.22 | 0.96 |
13
 
14
  \*: 'quant' stands for 'quantized'.
15
 
16
  ## Demo
17
 
18
  Run the following command to try the demo:
 
19
  ```shell
20
  python demo.py --input /path/to/image
 
 
 
21
  ```
22
 
23
  ## License
@@ -29,4 +33,3 @@ All files in this directory are licensed under [Apache 2.0 License](./LICENSE).
29
  - https://arxiv.org/abs/1512.03385
30
  - https://github.com/opencv/opencv/tree/master/samples/dnn/dnn_model_runner/dnn_conversion/paddlepaddle
31
  - https://github.com/PaddlePaddle/PaddleHub
32
-
 
6
 
7
  Results of accuracy evaluation with [tools/eval](../../tools/eval).
8
 
9
+ | Models | Top-1 Accuracy | Top-5 Accuracy |
10
+ | --------------- | -------------- | -------------- |
11
+ | PP-ResNet | 82.28 | 96.15 |
12
+ | PP-ResNet quant | 0.22 | 0.96 |
13
 
14
  \*: 'quant' stands for 'quantized'.
15
 
16
  ## Demo
17
 
18
  Run the following command to try the demo:
19
+
20
  ```shell
21
  python demo.py --input /path/to/image
22
+
23
+ # get help regarding various parameters
24
+ python demo.py --help
25
  ```
26
 
27
  ## License
 
33
  - https://arxiv.org/abs/1512.03385
34
  - https://github.com/opencv/opencv/tree/master/samples/dnn/dnn_model_runner/dnn_conversion/paddlepaddle
35
  - https://github.com/PaddlePaddle/PaddleHub
 
models/image_classification_ppresnet/demo.py CHANGED
@@ -29,14 +29,14 @@ try:
29
  help_msg_backends += "; {:d}: TIMVX"
30
  help_msg_targets += "; {:d}: NPU"
31
  except:
32
- print('This version of OpenCV does not support TIM-VX and NPU. Visit https://gist.github.com/fengyuentau/5a7a5ba36328f2b763aea026c43fa45f for more information.')
33
 
34
  parser = argparse.ArgumentParser(description='Deep Residual Learning for Image Recognition (https://arxiv.org/abs/1512.03385, https://github.com/PaddlePaddle/PaddleHub)')
35
- parser.add_argument('--input', '-i', type=str, help='Path to the input image.')
36
- parser.add_argument('--model', '-m', type=str, default='image_classification_ppresnet50_2022jan.onnx', help='Path to the model.')
37
  parser.add_argument('--backend', '-b', type=int, default=backends[0], help=help_msg_backends.format(*backends))
38
  parser.add_argument('--target', '-t', type=int, default=targets[0], help=help_msg_targets.format(*targets))
39
- parser.add_argument('--label', '-l', type=str, default='./imagenet_labels.txt', help='Path to the dataset labels.')
40
  args = parser.parse_args()
41
 
42
  if __name__ == '__main__':
 
29
  help_msg_backends += "; {:d}: TIMVX"
30
  help_msg_targets += "; {:d}: NPU"
31
  except:
32
+ print('This version of OpenCV does not support TIM-VX and NPU. Visit https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more information.')
33
 
34
  parser = argparse.ArgumentParser(description='Deep Residual Learning for Image Recognition (https://arxiv.org/abs/1512.03385, https://github.com/PaddlePaddle/PaddleHub)')
35
+ parser.add_argument('--input', '-i', type=str, help='Usage: Set input path to a certain image, omit if using camera.')
36
+ parser.add_argument('--model', '-m', type=str, default='image_classification_ppresnet50_2022jan.onnx', help='Usage: Set model path, defaults to image_classification_ppresnet50_2022jan.onnx.')
37
  parser.add_argument('--backend', '-b', type=int, default=backends[0], help=help_msg_backends.format(*backends))
38
  parser.add_argument('--target', '-t', type=int, default=targets[0], help=help_msg_targets.format(*targets))
39
+ parser.add_argument('--label', '-l', type=str, default='./imagenet_labels.txt', help='Usage: Set path to the different labels that will be used during the detection. Default list found in imagenet_labels.txt')
40
  args = parser.parse_args()
41
 
42
  if __name__ == '__main__':
models/license_plate_detection_yunet/README.md CHANGED
@@ -7,11 +7,14 @@ Please note that the model is trained with Chinese license plates, so the detect
7
  ## Demo
8
 
9
  Run the following command to try the demo:
 
10
  ```shell
11
  # detect on camera input
12
  python demo.py
13
  # detect on an image
14
  python demo.py --input /path/to/image
 
 
15
  ```
16
 
17
  ### Example outputs
@@ -19,8 +22,9 @@ python demo.py --input /path/to/image
19
  ![lpd](./examples/lpd_yunet_demo.gif)
20
 
21
  ## License
 
22
  All files in this directory are licensed under [Apache 2.0 License](./LICENSE)
23
 
24
  ## Reference
25
 
26
- - https://github.com/ShiqiYu/libfacedetection.train
 
7
  ## Demo
8
 
9
  Run the following command to try the demo:
10
+
11
  ```shell
12
  # detect on camera input
13
  python demo.py
14
  # detect on an image
15
  python demo.py --input /path/to/image
16
+ # get help regarding various parameters
17
+ python demo.py --help
18
  ```
19
 
20
  ### Example outputs
 
22
  ![lpd](./examples/lpd_yunet_demo.gif)
23
 
24
  ## License
25
+
26
  All files in this directory are licensed under [Apache 2.0 License](./LICENSE)
27
 
28
  ## Reference
29
 
30
+ - https://github.com/ShiqiYu/libfacedetection.train
models/license_plate_detection_yunet/demo.py CHANGED
@@ -23,19 +23,19 @@ try:
23
  help_msg_backends += "; {:d}: TIMVX"
24
  help_msg_targets += "; {:d}: NPU"
25
  except:
26
- print('This version of OpenCV does not support TIM-VX and NPU. Visit https://gist.github.com/fengyuentau/5a7a5ba36328f2b763aea026c43fa45f for more information.')
27
 
28
  parser = argparse.ArgumentParser(description='LPD-YuNet for License Plate Detection')
29
- parser.add_argument('--input', '-i', type=str, help='Path to the input image. Omit for using default camera.')
30
- parser.add_argument('--model', '-m', type=str, default='license_plate_detection_lpd_yunet_2022may.onnx', help='Path to the model.')
31
  parser.add_argument('--backend', '-b', type=int, default=backends[0], help=help_msg_backends.format(*backends))
32
  parser.add_argument('--target', '-t', type=int, default=targets[0], help=help_msg_targets.format(*targets))
33
- parser.add_argument('--conf_threshold', type=float, default=0.9, help='Filter out faces of confidence < conf_threshold.')
34
- parser.add_argument('--nms_threshold', type=float, default=0.3, help='Suppress bounding boxes of iou >= nms_threshold.')
35
- parser.add_argument('--top_k', type=int, default=5000, help='Keep top_k bounding boxes before NMS.')
36
- parser.add_argument('--keep_top_k', type=int, default=750, help='Keep keep_top_k bounding boxes after NMS.')
37
- parser.add_argument('--save', '-s', type=str2bool, default=False, help='Set true to save results. This flag is invalid when using camera.')
38
- parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Set true to open a window for result visualization. This flag is invalid when using camera.')
39
  args = parser.parse_args()
40
 
41
  def visualize(image, dets, line_color=(0, 255, 0), text_color=(0, 0, 255), fps=None):
 
23
  help_msg_backends += "; {:d}: TIMVX"
24
  help_msg_targets += "; {:d}: NPU"
25
  except:
26
+ print('This version of OpenCV does not support TIM-VX and NPU. Visit https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more information.')
27
 
28
  parser = argparse.ArgumentParser(description='LPD-YuNet for License Plate Detection')
29
+ parser.add_argument('--input', '-i', type=str, help='Usage: Set path to the input image. Omit for using default camera.')
30
+ parser.add_argument('--model', '-m', type=str, default='license_plate_detection_lpd_yunet_2022may.onnx', help='Usage: Set model path, defaults to license_plate_detection_lpd_yunet_2022may.onnx.')
31
  parser.add_argument('--backend', '-b', type=int, default=backends[0], help=help_msg_backends.format(*backends))
32
  parser.add_argument('--target', '-t', type=int, default=targets[0], help=help_msg_targets.format(*targets))
33
+ parser.add_argument('--conf_threshold', type=float, default=0.9, help='Usage: Set the minimum needed confidence for the model to identify a license plate, defaults to 0.9. Smaller values may result in faster detection, but will limit accuracy. Filter out license plates of confidence < conf_threshold.')
34
+ parser.add_argument('--nms_threshold', type=float, default=0.3, help='Usage: Suppress bounding boxes of iou >= nms_threshold. Default = 0.3.')
35
+ parser.add_argument('--top_k', type=int, default=5000, help='Usage: Keep top_k bounding boxes before NMS.')
36
+ parser.add_argument('--keep_top_k', type=int, default=750, help='Usage: Keep keep_top_k bounding boxes after NMS.')
37
+ parser.add_argument('--save', '-s', type=str2bool, default=False, help='Usage: Set “True” to save file with results (i.e. bounding box, confidence level). Invalid in case of camera input. Default will be set to “False”.')
38
+ parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Usage: Default will be set to “True” and will open a new window to show results. Set to “False” to stop visualizations from being shown. Invalid in case of camera input.')
39
  args = parser.parse_args()
40
 
41
  def visualize(image, dets, line_color=(0, 255, 0), text_color=(0, 0, 255), fps=None):
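The four thresholds above follow the usual detection post-processing order: keep at most `top_k` candidates before NMS, drop everything below `conf_threshold`, run NMS at `nms_threshold`, then keep at most `keep_top_k` boxes. A sketch of that pipeline using OpenCV's NMSBoxes (illustrative only; the LPD-YuNet wrapper applies these internally):

```python
import numpy as np
import cv2 as cv

def postprocess(boxes, scores, conf_threshold=0.9, nms_threshold=0.3,
                top_k=5000, keep_top_k=750):
    # boxes: list of [x, y, w, h]; scores: per-box confidence in [0, 1].
    order = np.argsort(scores)[::-1][:top_k]              # keep top_k boxes before NMS
    boxes = [boxes[i] for i in order]
    scores = [float(scores[i]) for i in order]
    kept = cv.dnn.NMSBoxes(boxes, scores, conf_threshold, nms_threshold)
    kept = list(np.array(kept).flatten().astype(int))[:keep_top_k]   # keep_top_k after NMS
    return [(boxes[i], scores[i]) for i in kept]
```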
models/object_tracking_dasiamrpn/README.md CHANGED
@@ -3,17 +3,22 @@
3
  [Distractor-aware Siamese Networks for Visual Object Tracking](https://arxiv.org/abs/1808.06048)
4
 
5
  Note:
 
6
  - Model source: [opencv/samples/dnn/diasiamrpn_tracker.cpp](https://github.com/opencv/opencv/blob/ceb94d52a104c0c1287a43dfa6ba72705fb78ac1/samples/dnn/dasiamrpn_tracker.cpp#L5-L7)
7
  - Visit https://github.com/foolwood/DaSiamRPN for training details.
8
 
9
  ## Demo
10
 
11
  Run the following command to try the demo:
 
12
  ```shell
13
  # track on camera input
14
  python demo.py
15
  # track on video input
16
  python demo.py --input /path/to/video
 
 
 
17
  ```
18
 
19
  ### Example outputs
@@ -29,4 +34,4 @@ All files in this directory are licensed under [Apache 2.0 License](./LICENSE).
29
  - DaSiamRPN Official Repository: https://github.com/foolwood/DaSiamRPN
30
  - Paper: https://arxiv.org/abs/1808.06048
31
  - OpenCV API `TrackerDaSiamRPN` Doc: https://docs.opencv.org/4.x/de/d93/classcv_1_1TrackerDaSiamRPN.html
32
- - OpenCV Sample: https://github.com/opencv/opencv/blob/4.x/samples/dnn/dasiamrpn_tracker.cpp
 
3
  [Distractor-aware Siamese Networks for Visual Object Tracking](https://arxiv.org/abs/1808.06048)
4
 
5
  Note:
6
+
7
  - Model source: [opencv/samples/dnn/diasiamrpn_tracker.cpp](https://github.com/opencv/opencv/blob/ceb94d52a104c0c1287a43dfa6ba72705fb78ac1/samples/dnn/dasiamrpn_tracker.cpp#L5-L7)
8
  - Visit https://github.com/foolwood/DaSiamRPN for training details.
9
 
10
  ## Demo
11
 
12
  Run the following command to try the demo:
13
+
14
  ```shell
15
  # track on camera input
16
  python demo.py
17
  # track on video input
18
  python demo.py --input /path/to/video
19
+
20
+ # get help regarding various parameters
21
+ python demo.py --help
22
  ```
23
 
24
  ### Example outputs
 
34
  - DaSiamRPN Official Repository: https://github.com/foolwood/DaSiamRPN
35
  - Paper: https://arxiv.org/abs/1808.06048
36
  - OpenCV API `TrackerDaSiamRPN` Doc: https://docs.opencv.org/4.x/de/d93/classcv_1_1TrackerDaSiamRPN.html
37
+ - OpenCV Sample: https://github.com/opencv/opencv/blob/4.x/samples/dnn/dasiamrpn_tracker.cpp
models/object_tracking_dasiamrpn/demo.py CHANGED
@@ -21,12 +21,12 @@ def str2bool(v):
21
 
22
  parser = argparse.ArgumentParser(
23
  description="Distractor-aware Siamese Networks for Visual Object Tracking (https://arxiv.org/abs/1808.06048)")
24
- parser.add_argument('--input', '-i', type=str, help='Path to the input video. Omit for using default camera.')
25
- parser.add_argument('--model_path', type=str, default='object_tracking_dasiamrpn_model_2021nov.onnx', help='Path to dasiamrpn_model.onnx.')
26
- parser.add_argument('--kernel_cls1_path', type=str, default='object_tracking_dasiamrpn_kernel_cls1_2021nov.onnx', help='Path to dasiamrpn_kernel_cls1.onnx.')
27
- parser.add_argument('--kernel_r1_path', type=str, default='object_tracking_dasiamrpn_kernel_r1_2021nov.onnx', help='Path to dasiamrpn_kernel_r1.onnx.')
28
- parser.add_argument('--save', '-s', type=str2bool, default=False, help='Set true to save results. This flag is invalid when using camera.')
29
- parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Set true to open a window for result visualization. This flag is invalid when using camera.')
30
  args = parser.parse_args()
31
 
32
  def visualize(image, bbox, score, isLocated, fps=None, box_color=(0, 255, 0),text_color=(0, 255, 0), fontScale = 1, fontSize = 1):
 
21
 
22
  parser = argparse.ArgumentParser(
23
  description="Distractor-aware Siamese Networks for Visual Object Tracking (https://arxiv.org/abs/1808.06048)")
24
+ parser.add_argument('--input', '-i', type=str, help='Usage: Set path to the input video. Omit for using default camera.')
25
+ parser.add_argument('--model_path', type=str, default='object_tracking_dasiamrpn_model_2021nov.onnx', help='Usage: Set model path, defaults to object_tracking_dasiamrpn_model_2021nov.onnx.')
26
+ parser.add_argument('--kernel_cls1_path', type=str, default='object_tracking_dasiamrpn_kernel_cls1_2021nov.onnx', help='Usage: Set path to dasiamrpn_kernel_cls1.onnx.')
27
+ parser.add_argument('--kernel_r1_path', type=str, default='object_tracking_dasiamrpn_kernel_r1_2021nov.onnx', help='Usage: Set path to dasiamrpn_kernel_r1.onnx.')
28
+ parser.add_argument('--save', '-s', type=str2bool, default=False, help='Usage: Set “True” to save a file with results. Invalid in case of camera input. Default will be set to “False”.')
29
+ parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Usage: Default will be set to “True” and will open a new window to show results. Set to “False” to stop visualizations from being shown. Invalid in case of camera input.')
30
  args = parser.parse_args()
31
 
32
  def visualize(image, bbox, score, isLocated, fps=None, box_color=(0, 255, 0),text_color=(0, 255, 0), fontScale = 1, fontSize = 1):
models/palm_detection_mediapipe/README.md CHANGED
@@ -1,20 +1,25 @@
1
  # Palm detector from MediaPipe Handpose
2
 
3
  This model detects palm bounding boxes and palm landmarks, and is converted from Tensorflow-JS to ONNX using following tools:
 
4
  - tfjs to tf_saved_model: https://github.com/patlevin/tfjs-to-tf/
5
  - tf_saved_model to ONNX: https://github.com/onnx/tensorflow-onnx
6
  - simplified by [onnx-simplifier](https://github.com/daquexian/onnx-simplifier)
7
 
8
- Also note that the model is quantized in per-channel mode with [Intel's neural compressor](https://github.com/intel/neural-compressor), which gives better accuracy but may lose some speed.
9
 
10
  ## Demo
11
 
12
  Run the following commands to try the demo:
 
13
  ```bash
14
  # detect on camera input
15
  python demo.py
16
  # detect on an image
17
  python demo.py -i /path/to/image
 
 
 
18
  ```
19
 
20
  ### Example outputs
 
1
  # Palm detector from MediaPipe Handpose
2
 
3
  This model detects palm bounding boxes and palm landmarks, and is converted from Tensorflow-JS to ONNX using following tools:
4
+
5
  - tfjs to tf_saved_model: https://github.com/patlevin/tfjs-to-tf/
6
  - tf_saved_model to ONNX: https://github.com/onnx/tensorflow-onnx
7
  - simplified by [onnx-simplifier](https://github.com/daquexian/onnx-simplifier)
8
 
9
+ Also note that the model is quantized in per-channel mode with [Intel's neural compressor](https://github.com/intel/neural-compressor), which gives better accuracy but may lose some speed.
10
 
11
  ## Demo
12
 
13
  Run the following commands to try the demo:
14
+
15
  ```bash
16
  # detect on camera input
17
  python demo.py
18
  # detect on an image
19
  python demo.py -i /path/to/image
20
+
21
+ # get help regarding various parameters
22
+ python demo.py --help
23
  ```
24
 
25
  ### Example outputs
models/palm_detection_mediapipe/demo.py CHANGED
@@ -23,17 +23,17 @@ try:
23
  help_msg_backends += "; {:d}: TIMVX"
24
  help_msg_targets += "; {:d}: NPU"
25
  except:
26
- print('This version of OpenCV does not support TIM-VX and NPU. Visit https://gist.github.com/fengyuentau/5a7a5ba36328f2b763aea026c43fa45f for more information.')
27
 
28
  parser = argparse.ArgumentParser(description='Hand Detector from MediaPipe')
29
- parser.add_argument('--input', '-i', type=str, help='Path to the input image. Omit for using default camera.')
30
- parser.add_argument('--model', '-m', type=str, default='./palm_detection_mediapipe_2022may.onnx', help='Path to the model.')
31
  parser.add_argument('--backend', '-b', type=int, default=backends[0], help=help_msg_backends.format(*backends))
32
  parser.add_argument('--target', '-t', type=int, default=targets[0], help=help_msg_targets.format(*targets))
33
- parser.add_argument('--score_threshold', type=float, default=0.99, help='Filter out faces of confidence < conf_threshold. An empirical score threshold for the quantized model is 0.49.')
34
- parser.add_argument('--nms_threshold', type=float, default=0.3, help='Suppress bounding boxes of iou >= nms_threshold.')
35
- parser.add_argument('--save', '-s', type=str, default=False, help='Set true to save results. This flag is invalid when using camera.')
36
- parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Set true to open a window for result visualization. This flag is invalid when using camera.')
37
  args = parser.parse_args()
38
 
39
  def visualize(image, results, print_results=False, fps=None):
 
23
  help_msg_backends += "; {:d}: TIMVX"
24
  help_msg_targets += "; {:d}: NPU"
25
  except:
26
+ print('This version of OpenCV does not support TIM-VX and NPU. Visit https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more information.')
27
 
28
  parser = argparse.ArgumentParser(description='Hand Detector from MediaPipe')
29
+ parser.add_argument('--input', '-i', type=str, help='Usage: Set path to the input image. Omit for using default camera.')
30
+ parser.add_argument('--model', '-m', type=str, default='./palm_detection_mediapipe_2022may.onnx', help='Usage: Set model path, defaults to palm_detection_mediapipe_2022may.onnx.')
31
  parser.add_argument('--backend', '-b', type=int, default=backends[0], help=help_msg_backends.format(*backends))
32
  parser.add_argument('--target', '-t', type=int, default=targets[0], help=help_msg_targets.format(*targets))
33
+ parser.add_argument('--score_threshold', type=float, default=0.99, help='Usage: Set the minimum needed confidence for the model to identify a palm, defaults to 0.99. Smaller values may result in faster detection, but will limit accuracy. Filter out palms of confidence < score_threshold. An empirical score threshold for the quantized model is 0.49.')
34
+ parser.add_argument('--nms_threshold', type=float, default=0.3, help='Usage: Suppress bounding boxes of iou >= nms_threshold. Default = 0.3.')
35
+ parser.add_argument('--save', '-s', type=str, default=False, help='Usage: Set “True” to save file with results (i.e. bounding box, confidence level). Invalid in case of camera input. Default will be set to “False”.')
36
+ parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Usage: Default will be set to “True” and will open a new window to show results. Set to “False” to stop visualizations from being shown. Invalid in case of camera input.')
37
  args = parser.parse_args()
38
 
39
  def visualize(image, results, print_results=False, fps=None):
models/person_reid_youtureid/README.md CHANGED
@@ -3,20 +3,25 @@
3
  This model is provided by Tencent Youtu Lab [[Credits]](https://github.com/opencv/opencv/blob/394e640909d5d8edf9c1f578f8216d513373698c/samples/dnn/person_reid.py#L6-L11).
4
 
5
  Note:
 
6
  - Model source: https://github.com/ReID-Team/ReID_extra_testdata
7
 
8
  ## Demo
9
 
10
  Run the following command to try the demo:
 
11
  ```shell
12
- python demo.py --input1 /path/to/person1 --input2 /path/to/person2
 
 
 
13
  ```
14
 
15
- ## License
16
 
17
  All files in this directory are licensed under [Apache 2.0 License](./LICENSE).
18
 
19
  ## Reference:
20
 
21
  - OpenCV DNN Sample: https://github.com/opencv/opencv/blob/4.x/samples/dnn/person_reid.py
22
- - Model source: https://github.com/ReID-Team/ReID_extra_testdata
 
3
  This model is provided by Tencent Youtu Lab [[Credits]](https://github.com/opencv/opencv/blob/394e640909d5d8edf9c1f578f8216d513373698c/samples/dnn/person_reid.py#L6-L11).
4
 
5
  Note:
6
+
7
  - Model source: https://github.com/ReID-Team/ReID_extra_testdata
8
 
9
  ## Demo
10
 
11
  Run the following command to try the demo:
12
+
13
  ```shell
14
+ python demo.py --query_dir /path/to/query --gallery_dir /path/to/gallery
15
+
16
+ # get help regarding various parameters
17
+ python demo.py --help
18
  ```
19
 
20
+ ### License
21
 
22
  All files in this directory are licensed under [Apache 2.0 License](./LICENSE).
23
 
24
  ## Reference:
25
 
26
  - OpenCV DNN Sample: https://github.com/opencv/opencv/blob/4.x/samples/dnn/person_reid.py
27
+ - Model source: https://github.com/ReID-Team/ReID_extra_testdata
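The updated usage above compares a directory of query images against a gallery directory; the core ReID step is ranking gallery features by similarity to each query feature. A toy sketch of that ranking, assuming the features have already been extracted into NumPy matrices (this is not the demo's actual code):

```python
import numpy as np

def rank_gallery(query_feats, gallery_feats):
    """Return, for each query row, gallery indices sorted from most to least similar."""
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    sim = q @ g.T                      # cosine similarity, shape (num_query, num_gallery)
    return np.argsort(-sim, axis=1)    # best matches first
```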
models/person_reid_youtureid/demo.py CHANGED
@@ -30,7 +30,7 @@ try:
30
  help_msg_backends += "; {:d}: TIMVX"
31
  help_msg_targets += "; {:d}: NPU"
32
  except:
33
- print('This version of OpenCV does not support TIM-VX and NPU. Visit https://gist.github.com/fengyuentau/5a7a5ba36328f2b763aea026c43fa45f for more information.')
34
 
35
  parser = argparse.ArgumentParser(
36
  description="ReID baseline models from Tencent Youtu Lab")
 
30
  help_msg_backends += "; {:d}: TIMVX"
31
  help_msg_targets += "; {:d}: NPU"
32
  except:
33
+ print('This version of OpenCV does not support TIM-VX and NPU. Visit https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more information.')
34
 
35
  parser = argparse.ArgumentParser(
36
  description="ReID baseline models from Tencent Youtu Lab")
models/qrcode_wechatqrcode/README.md CHANGED
@@ -3,17 +3,22 @@
3
  WeChatQRCode for detecting and parsing QR Code, contributed by [WeChat Computer Vision Team (WeChatCV)](https://github.com/WeChatCV). Visit [opencv/opencv_contrib/modules/wechat_qrcode](https://github.com/opencv/opencv_contrib/tree/master/modules/wechat_qrcode) for more details.
4
 
5
  Notes:
 
6
  - Model source: [opencv/opencv_3rdparty:wechat_qrcode_20210119](https://github.com/opencv/opencv_3rdparty/tree/wechat_qrcode_20210119)
7
  - The APIs `cv::wechat_qrcode::WeChatQRCode` (C++) & `cv.wechat_qrcode_WeChatQRCode` (Python) are both designed to run on default backend (OpenCV) and target (CPU) only. Therefore, benchmark results of this model are only available on CPU devices, until the APIs are updated with setting backends and targets.
8
 
9
  ## Demo
10
 
11
  Run the following command to try the demo:
 
12
  ```shell
13
  # detect on camera input
14
  python demo.py
15
  # detect on an image
16
  python demo.py --input /path/to/image
 
 
 
17
  ```
18
 
19
  ### Example outputs
@@ -27,4 +32,4 @@ All files in this directory are licensed under [Apache 2.0 License](./LICENSE).
27
  ## Reference:
28
 
29
  - https://github.com/opencv/opencv_contrib/tree/master/modules/wechat_qrcode
30
- - https://github.com/opencv/opencv_3rdparty/tree/wechat_qrcode_20210119
 
3
  WeChatQRCode for detecting and parsing QR Code, contributed by [WeChat Computer Vision Team (WeChatCV)](https://github.com/WeChatCV). Visit [opencv/opencv_contrib/modules/wechat_qrcode](https://github.com/opencv/opencv_contrib/tree/master/modules/wechat_qrcode) for more details.
4
 
5
  Notes:
6
+
7
  - Model source: [opencv/opencv_3rdparty:wechat_qrcode_20210119](https://github.com/opencv/opencv_3rdparty/tree/wechat_qrcode_20210119)
8
  - The APIs `cv::wechat_qrcode::WeChatQRCode` (C++) & `cv.wechat_qrcode_WeChatQRCode` (Python) are both designed to run on default backend (OpenCV) and target (CPU) only. Therefore, benchmark results of this model are only available on CPU devices, until the APIs are updated with setting backends and targets.
9
 
10
  ## Demo
11
 
12
  Run the following command to try the demo:
13
+
14
  ```shell
15
  # detect on camera input
16
  python demo.py
17
  # detect on an image
18
  python demo.py --input /path/to/image
19
+
20
+ # get help regarding various parameters
21
+ python demo.py --help
22
  ```
23
 
24
  ### Example outputs
 
32
  ## Reference:
33
 
34
  - https://github.com/opencv/opencv_contrib/tree/master/modules/wechat_qrcode
35
+ - https://github.com/opencv/opencv_3rdparty/tree/wechat_qrcode_20210119
models/qrcode_wechatqrcode/demo.py CHANGED
@@ -21,13 +21,13 @@ def str2bool(v):
21
 
22
  parser = argparse.ArgumentParser(
23
  description="WeChat QR code detector for detecting and parsing QR code (https://github.com/opencv/opencv_contrib/tree/master/modules/wechat_qrcode)")
24
- parser.add_argument('--input', '-i', type=str, help='Path to the input image. Omit for using default camera.')
25
- parser.add_argument('--detect_prototxt_path', type=str, default='detect_2021sep.prototxt', help='Path to detect.prototxt.')
26
- parser.add_argument('--detect_model_path', type=str, default='detect_2021sep.caffemodel', help='Path to detect.caffemodel.')
27
- parser.add_argument('--sr_prototxt_path', type=str, default='sr_2021sep.prototxt', help='Path to sr.prototxt.')
28
- parser.add_argument('--sr_model_path', type=str, default='sr_2021sep.caffemodel', help='Path to sr.caffemodel.')
29
- parser.add_argument('--save', '-s', type=str2bool, default=False, help='Set true to save results. This flag is invalid when using camera.')
30
- parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Set true to open a window for result visualization. This flag is invalid when using camera.')
31
  args = parser.parse_args()
32
 
33
  def visualize(image, res, points, points_color=(0, 255, 0), text_color=(0, 255, 0), fps=None):
 
21
 
22
  parser = argparse.ArgumentParser(
23
  description="WeChat QR code detector for detecting and parsing QR code (https://github.com/opencv/opencv_contrib/tree/master/modules/wechat_qrcode)")
24
+ parser.add_argument('--input', '-i', type=str, help='Usage: Set path to the input image. Omit for using default camera.')
25
+ parser.add_argument('--detect_prototxt_path', type=str, default='detect_2021sep.prototxt', help='Usage: Set path to detect.prototxt.')
26
+ parser.add_argument('--detect_model_path', type=str, default='detect_2021sep.caffemodel', help='Usage: Set path to detect.caffemodel.')
27
+ parser.add_argument('--sr_prototxt_path', type=str, default='sr_2021sep.prototxt', help='Usage: Set path to sr.prototxt.')
28
+ parser.add_argument('--sr_model_path', type=str, default='sr_2021sep.caffemodel', help='Usage: Set path to sr.caffemodel.')
29
+ parser.add_argument('--save', '-s', type=str2bool, default=False, help='Usage: Set “True” to save a file with the results (i.e. bounding box, confidence level). Invalid when using camera input. Defaults to “False”.')
30
+ parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Usage: Defaults to “True”, which opens a new window to show the results. Set to “False” to disable visualization. Invalid when using camera input.')
31
  args = parser.parse_args()
32
 
33
  def visualize(image, res, points, points_color=(0, 255, 0), text_color=(0, 255, 0), fps=None):
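The boolean-style `--save` and `--vis` flags above go through the file's `str2bool` converter (visible in the hunk header). The repository's exact helper is not reproduced here; a typical implementation that accepts the usual spellings would look roughly like this:

```python
import argparse

def str2bool(v):
    # Map common textual spellings onto a bool so argparse can use type=str2bool.
    if isinstance(v, bool):
        return v
    if v.lower() in ("yes", "true", "t", "y", "1"):
        return True
    if v.lower() in ("no", "false", "f", "n", "0"):
        return False
    raise argparse.ArgumentTypeError("Boolean value expected, got %r." % v)
```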
models/text_detection_db/README.md CHANGED
@@ -3,6 +3,7 @@
3
  Real-time Scene Text Detection with Differentiable Binarization
4
 
5
  Note:
 
6
  - Models source: [here](https://drive.google.com/drive/folders/1qzNCHfUJOS0NEUOIKn69eCtxdlNPpWbq).
7
  - `IC15` in the filename means the model is trained on [IC15 dataset](https://rrc.cvc.uab.es/?ch=4&com=introduction), which can detect English text instances only.
8
  - `TD500` in the filename means the model is trained on [TD500 dataset](http://www.iapr-tc11.org/mediawiki/index.php/MSRA_Text_Detection_500_Database_(MSRA-TD500)), which can detect both English & Chinese instances.
@@ -11,12 +12,17 @@ Note:
11
  ## Demo
12
 
13
  Run the following command to try the demo:
 
14
  ```shell
15
  # detect on camera input
16
  python demo.py
17
  # detect on an image
18
  python demo.py --input /path/to/image
19
  ```
 
20
  ### Example outputs
21
 
22
  ![mask](./examples/mask.jpg)
@@ -31,4 +37,4 @@ All files in this directory are licensed under [Apache 2.0 License](./LICENSE).
31
 
32
  - https://arxiv.org/abs/1911.08947
33
  - https://github.com/MhLiao/DB
34
- - https://docs.opencv.org/master/d4/d43/tutorial_dnn_text_spotting.html
 
3
  Real-time Scene Text Detection with Differentiable Binarization
4
 
5
  Note:
6
+
7
  - Models source: [here](https://drive.google.com/drive/folders/1qzNCHfUJOS0NEUOIKn69eCtxdlNPpWbq).
8
  - `IC15` in the filename means the model is trained on [IC15 dataset](https://rrc.cvc.uab.es/?ch=4&com=introduction), which can detect English text instances only.
9
  - `TD500` in the filename means the model is trained on [TD500 dataset](http://www.iapr-tc11.org/mediawiki/index.php/MSRA_Text_Detection_500_Database_(MSRA-TD500)), which can detect both English & Chinese instances.
 
12
  ## Demo
13
 
14
  Run the following command to try the demo:
15
+
16
  ```shell
17
  # detect on camera input
18
  python demo.py
19
  # detect on an image
20
  python demo.py --input /path/to/image
21
+
22
+ # get help regarding various parameters
23
+ python demo.py --help
24
  ```
25
+
26
  ### Example outputs
27
 
28
  ![mask](./examples/mask.jpg)
 
37
 
38
  - https://arxiv.org/abs/1911.08947
39
  - https://github.com/MhLiao/DB
40
+ - https://docs.opencv.org/master/d4/d43/tutorial_dnn_text_spotting.html
models/text_detection_db/demo.py CHANGED
@@ -29,23 +29,23 @@ try:
29
  help_msg_backends += "; {:d}: TIMVX"
30
  help_msg_targets += "; {:d}: NPU"
31
  except:
32
- print('This version of OpenCV does not support TIM-VX and NPU. Visit https://gist.github.com/fengyuentau/5a7a5ba36328f2b763aea026c43fa45f for more information.')
33
 
34
  parser = argparse.ArgumentParser(description='Real-time Scene Text Detection with Differentiable Binarization (https://arxiv.org/abs/1911.08947).')
35
- parser.add_argument('--input', '-i', type=str, help='Path to the input image. Omit for using default camera.')
36
- parser.add_argument('--model', '-m', type=str, default='text_detection_DB_TD500_resnet18_2021sep.onnx', help='Path to the model.')
37
  parser.add_argument('--backend', '-b', type=int, default=backends[0], help=help_msg_backends.format(*backends))
38
  parser.add_argument('--target', '-t', type=int, default=targets[0], help=help_msg_targets.format(*targets))
39
  parser.add_argument('--width', type=int, default=736,
40
- help='Preprocess input image by resizing to a specific width. It should be multiple by 32.')
41
  parser.add_argument('--height', type=int, default=736,
42
- help='Preprocess input image by resizing to a specific height. It should be multiple by 32.')
43
- parser.add_argument('--binary_threshold', type=float, default=0.3, help='Threshold of the binary map.')
44
- parser.add_argument('--polygon_threshold', type=float, default=0.5, help='Threshold of polygons.')
45
- parser.add_argument('--max_candidates', type=int, default=200, help='Max candidates of polygons.')
46
- parser.add_argument('--unclip_ratio', type=np.float64, default=2.0, help=' The unclip ratio of the detected text region, which determines the output size.')
47
- parser.add_argument('--save', '-s', type=str, default=False, help='Set true to save results. This flag is invalid when using camera.')
48
- parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Set true to open a window for result visualization. This flag is invalid when using camera.')
49
  args = parser.parse_args()
50
 
51
  def visualize(image, results, box_color=(0, 255, 0), text_color=(0, 0, 255), isClosed=True, thickness=2, fps=None):
 
29
  help_msg_backends += "; {:d}: TIMVX"
30
  help_msg_targets += "; {:d}: NPU"
31
  except:
32
+ print('This version of OpenCV does not support TIM-VX and NPU. Visit https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more information.')
33
 
34
  parser = argparse.ArgumentParser(description='Real-time Scene Text Detection with Differentiable Binarization (https://arxiv.org/abs/1911.08947).')
35
+ parser.add_argument('--input', '-i', type=str, help='Usage: Set path to the input image. Omit for using default camera.')
36
+ parser.add_argument('--model', '-m', type=str, default='text_detection_DB_TD500_resnet18_2021sep.onnx', help='Usage: Set model path, defaults to text_detection_DB_TD500_resnet18_2021sep.onnx.')
37
  parser.add_argument('--backend', '-b', type=int, default=backends[0], help=help_msg_backends.format(*backends))
38
  parser.add_argument('--target', '-t', type=int, default=targets[0], help=help_msg_targets.format(*targets))
39
  parser.add_argument('--width', type=int, default=736,
40
+ help='Usage: Resize input image to a certain width, default = 736. It should be a multiple of 32.')
41
  parser.add_argument('--height', type=int, default=736,
42
+ help='Usage: Resize input image to a certain height, default = 736. It should be a multiple of 32.')
43
+ parser.add_argument('--binary_threshold', type=float, default=0.3, help='Usage: Threshold of the binary map, default = 0.3.')
44
+ parser.add_argument('--polygon_threshold', type=float, default=0.5, help='Usage: Threshold of polygons, default = 0.5.')
45
+ parser.add_argument('--max_candidates', type=int, default=200, help='Usage: Set maximum number of polygon candidates, default = 200.')
46
+ parser.add_argument('--unclip_ratio', type=np.float64, default=2.0, help='Usage: The unclip ratio of the detected text region, which determines the output size, default = 2.0.')
47
+ parser.add_argument('--save', '-s', type=str, default=False, help='Usage: Set “True” to save a file with the results (i.e. bounding box, confidence level). Invalid when using camera input. Defaults to “False”.')
48
+ parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Usage: Defaults to “True”, which opens a new window to show the results. Set to “False” to disable visualization. Invalid when using camera input.')
49
  args = parser.parse_args()
50
 
51
  def visualize(image, results, box_color=(0, 255, 0), text_color=(0, 0, 255), isClosed=True, thickness=2, fps=None):
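The thresholds documented in these help messages correspond to knobs on OpenCV's built-in DB text-detection API. The sketch below uses the core `cv.dnn_TextDetectionModel_DB` class (not this repository's wrapper) and plugs in the defaults quoted above; the normalization constants follow the OpenCV text-spotting tutorial referenced in the README, and the image path is a placeholder.

```python
import cv2 as cv

# Sketch: apply the demo's default parameters through OpenCV's DB API.
model = cv.dnn_TextDetectionModel_DB("text_detection_DB_TD500_resnet18_2021sep.onnx")
model.setBinaryThreshold(0.3)    # --binary_threshold
model.setPolygonThreshold(0.5)   # --polygon_threshold
model.setMaxCandidates(200)      # --max_candidates
model.setUnclipRatio(2.0)        # --unclip_ratio
# Width/height must be a multiple of 32; 736x736 matches the demo defaults.
model.setInputParams(1.0 / 255.0, (736, 736),
                     (122.67891434, 116.66876762, 104.00698793))

image = cv.imread("text.jpg")                 # placeholder input
quads, confidences = model.detect(image)      # one 4-point polygon per text region
```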
models/text_recognition_crnn/README.md CHANGED
@@ -5,7 +5,7 @@ An End-to-End Trainable Neural Network for Image-based Sequence Recognition and
5
  Results of accuracy evaluation with [tools/eval](../../tools/eval) at different text recognition datasets.
6
 
7
  | Model name | ICDAR03(%) | IIIT5k(%) | CUTE80(%) |
8
- |--------------|------------|-----------|-----------|
9
  | CRNN_EN | 81.66 | 74.33 | 52.78 |
10
  | CRNN_EN_FP16 | 82.01 | 74.93 | 52.34 |
11
  | CRNN_EN_INT8 | 81.75 | 75.33 | 52.43 |
@@ -16,10 +16,11 @@ Results of accuracy evaluation with [tools/eval](../../tools/eval) at different
16
  \*: 'FP16' or 'INT8' stands for 'model quantized into FP16' or 'model quantized into int8'
17
 
18
  Note:
 
19
  - Model source:
20
- - `text_recognition_CRNN_EN_2021sep.onnx`: https://docs.opencv.org/4.5.2/d9/d1e/tutorial_dnn_OCR.html (CRNN_VGG_BiLSTM_CTC.onnx)
21
- - `text_recognition_CRNN_CH_2021sep.onnx`: https://docs.opencv.org/4.x/d4/d43/tutorial_dnn_text_spotting.html (crnn_cs.onnx)
22
- - `text_recognition_CRNN_CN_2021nov.onnx`: https://docs.opencv.org/4.5.2/d4/d43/tutorial_dnn_text_spotting.html (crnn_cs_CN.onnx)
23
  - `text_recognition_CRNN_EN_2021sep.onnx` can detect digits (0\~9) and letters (return lowercase letters a\~z) (view `charset_36_EN.txt` for details).
24
  - `text_recognition_CRNN_CH_2021sep.onnx` can detect digits (0\~9), upper/lower-case letters (a\~z and A\~Z), and some special characters (view `charset_94_CH.txt` for details).
25
  - `text_recognition_CRNN_CN_2021nov.onnx` can detect digits (0\~9), upper/lower-case letters (a\~z and A\~Z), some Chinese characters and some special characters (view `charset_3944_CN.txt` for details).
@@ -28,26 +29,35 @@ Note:
28
  ## Demo
29
 
30
  ***NOTE***:
 
31
  - This demo uses [text_detection_db](../text_detection_db) as text detector.
32
  - Selected model must match with the charset:
33
- - Try `text_recognition_CRNN_EN_2021sep.onnx` with `charset_36_EN.txt`.
34
- - Try `text_recognition_CRNN_CH_2021sep.onnx` with `charset_94_CH.txt`
35
- - Try `text_recognition_CRNN_CN_2021sep.onnx` with `charset_3944_CN.txt`.
36
 
37
  Run the demo detecting English:
 
38
  ```shell
39
  # detect on camera input
40
  python demo.py
41
  # detect on an image
42
  python demo.py --input /path/to/image
43
  ```
44
 
45
  Run the demo detecting Chinese:
 
46
  ```shell
47
  # detect on camera input
48
  python demo.py --model text_recognition_CRNN_CN_2021nov.onnx --charset charset_3944_CN.txt
49
  # detect on an image
50
  python demo.py --input /path/to/image --model text_recognition_CRNN_CN_2021nov.onnx --charset charset_3944_CN.txt
51
  ```
52
 
53
  ### Examples
 
5
  Results of accuracy evaluation with [tools/eval](../../tools/eval) at different text recognition datasets.
6
 
7
  | Model name | ICDAR03(%) | IIIT5k(%) | CUTE80(%) |
8
+ | ------------ | ---------- | --------- | --------- |
9
  | CRNN_EN | 81.66 | 74.33 | 52.78 |
10
  | CRNN_EN_FP16 | 82.01 | 74.93 | 52.34 |
11
  | CRNN_EN_INT8 | 81.75 | 75.33 | 52.43 |
 
16
  \*: 'FP16' or 'INT8' stands for 'model quantized into FP16' or 'model quantized into int8'
17
 
18
  Note:
19
+
20
  - Model source:
21
+ - `text_recognition_CRNN_EN_2021sep.onnx`: https://docs.opencv.org/4.5.2/d9/d1e/tutorial_dnn_OCR.html (CRNN_VGG_BiLSTM_CTC.onnx)
22
+ - `text_recognition_CRNN_CH_2021sep.onnx`: https://docs.opencv.org/4.x/d4/d43/tutorial_dnn_text_spotting.html (crnn_cs.onnx)
23
+ - `text_recognition_CRNN_CN_2021nov.onnx`: https://docs.opencv.org/4.5.2/d4/d43/tutorial_dnn_text_spotting.html (crnn_cs_CN.onnx)
24
  - `text_recognition_CRNN_EN_2021sep.onnx` can detect digits (0\~9) and letters (return lowercase letters a\~z) (view `charset_36_EN.txt` for details).
25
  - `text_recognition_CRNN_CH_2021sep.onnx` can detect digits (0\~9), upper/lower-case letters (a\~z and A\~Z), and some special characters (view `charset_94_CH.txt` for details).
26
  - `text_recognition_CRNN_CN_2021nov.onnx` can detect digits (0\~9), upper/lower-case letters (a\~z and A\~Z), some Chinese characters and some special characters (view `charset_3944_CN.txt` for details).
 
29
  ## Demo
30
 
31
  ***NOTE***:
32
+
33
  - This demo uses [text_detection_db](../text_detection_db) as text detector.
34
  - Selected model must match with the charset:
35
+ - Try `text_recognition_CRNN_EN_2021sep.onnx` with `charset_36_EN.txt`.
36
+ - Try `text_recognition_CRNN_CH_2021sep.onnx` with `charset_94_CH.txt`.
37
+ - Try `text_recognition_CRNN_CN_2021nov.onnx` with `charset_3944_CN.txt`.
38
 
39
  Run the demo detecting English:
40
+
41
  ```shell
42
  # detect on camera input
43
  python demo.py
44
  # detect on an image
45
  python demo.py --input /path/to/image
46
+
47
+ # get help regarding various parameters
48
+ python demo.py --help
49
  ```
50
 
51
  Run the demo detecting Chinese:
52
+
53
  ```shell
54
  # detect on camera input
55
  python demo.py --model text_recognition_CRNN_CN_2021nov.onnx --charset charset_3944_CN.txt
56
  # detect on an image
57
  python demo.py --input /path/to/image --model text_recognition_CRNN_CN_2021nov.onnx --charset charset_3944_CN.txt
58
+
59
+ # get help regarding various parameters
60
+ python demo.py --help
61
  ```
62
 
63
  ### Examples
models/text_recognition_crnn/demo.py CHANGED
@@ -33,17 +33,17 @@ try:
33
  help_msg_backends += "; {:d}: TIMVX"
34
  help_msg_targets += "; {:d}: NPU"
35
  except:
36
- print('This version of OpenCV does not support TIM-VX and NPU. Visit https://gist.github.com/fengyuentau/5a7a5ba36328f2b763aea026c43fa45f for more information.')
37
 
38
  parser = argparse.ArgumentParser(
39
  description="An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition (https://arxiv.org/abs/1507.05717)")
40
- parser.add_argument('--input', '-i', type=str, help='Path to the input image. Omit for using default camera.')
41
- parser.add_argument('--model', '-m', type=str, default='text_recognition_CRNN_EN_2021sep.onnx', help='Path to the model.')
42
  parser.add_argument('--backend', '-b', type=int, default=backends[0], help=help_msg_backends.format(*backends))
43
  parser.add_argument('--target', '-t', type=int, default=targets[0], help=help_msg_targets.format(*targets))
44
- parser.add_argument('--charset', '-c', type=str, default='charset_36_EN.txt', help='Path to the charset file corresponding to the selected model.')
45
- parser.add_argument('--save', '-s', type=str, default=False, help='Set true to save results. This flag is invalid when using camera.')
46
- parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Set true to open a window for result visualization. This flag is invalid when using camera.')
47
  parser.add_argument('--width', type=int, default=736,
48
  help='Preprocess input image by resizing to a specific width. It should be a multiple of 32.')
49
  parser.add_argument('--height', type=int, default=736,
 
33
  help_msg_backends += "; {:d}: TIMVX"
34
  help_msg_targets += "; {:d}: NPU"
35
  except:
36
+ print('This version of OpenCV does not support TIM-VX and NPU. Visit https://github.com/opencv/opencv/wiki/TIM-VX-Backend-For-Running-OpenCV-On-NPU for more information.')
37
 
38
  parser = argparse.ArgumentParser(
39
  description="An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition (https://arxiv.org/abs/1507.05717)")
40
+ parser.add_argument('--input', '-i', type=str, help='Usage: Set path to the input image. Omit for using default camera.')
41
+ parser.add_argument('--model', '-m', type=str, default='text_recognition_CRNN_EN_2021sep.onnx', help='Usage: Set model path, defaults to text_recognition_CRNN_EN_2021sep.onnx.')
42
  parser.add_argument('--backend', '-b', type=int, default=backends[0], help=help_msg_backends.format(*backends))
43
  parser.add_argument('--target', '-t', type=int, default=targets[0], help=help_msg_targets.format(*targets))
44
+ parser.add_argument('--charset', '-c', type=str, default='charset_36_EN.txt', help='Usage: Set the path to the charset file corresponding to the selected model.')
45
+ parser.add_argument('--save', '-s', type=str, default=False, help='Usage: Set “True” to save a file with the results. Invalid when using camera input. Defaults to “False”.')
46
+ parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Usage: Defaults to “True”, which opens a new window to show the results. Set to “False” to disable visualization. Invalid when using camera input.')
47
  parser.add_argument('--width', type=int, default=736,
48
  help='Preprocess input image by resizing to a specific width. It should be a multiple of 32.')
49
  parser.add_argument('--height', type=int, default=736,
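As a companion to the detector above, the model/charset pairing described in the CRNN README can also be exercised through OpenCV's `cv.dnn_TextRecognitionModel`. This is an illustrative sketch, not this repository's demo code; the input size and normalization follow OpenCV's CRNN tutorial, and `word.jpg` stands in for a cropped text region (e.g. one produced by the DB detector).

```python
import cv2 as cv

# Pick a matching model/charset pair (see the README notes).
model_path = "text_recognition_CRNN_EN_2021sep.onnx"
charset_path = "charset_36_EN.txt"

recognizer = cv.dnn_TextRecognitionModel(model_path)
recognizer.setDecodeType("CTC-greedy")
with open(charset_path, "r", encoding="utf-8") as f:
    recognizer.setVocabulary(f.read().splitlines())
# CRNN expects a 100x32 crop of a single word, normalized to roughly [-1, 1].
recognizer.setInputParams(1.0 / 127.5, (100, 32), (127.5, 127.5, 127.5))

word_crop = cv.imread("word.jpg")             # placeholder: a cropped text region
print(recognizer.recognize(word_crop))        # decoded string
```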