Wanli committed
Commit 323da84 · 1 Parent(s): 90edc6d

Re-quantize some models from per_channel mode to per_tensor mode (#90)

* re-quantize some models from per_channel mode to per_tensor mode
* remove the description about per_channel
README.md CHANGED
@@ -19,7 +19,7 @@ Guidelines:
 | ---------------------------------------------------- | ----------------------------- | ---------- | -------------- | ------------ | --------------- | ------------ | ----------- |
 | [YuNet](./models/face_detection_yunet) | Face Detection | 160x120 | 1.45 | 6.22 | 12.18 | 4.04 | 86.69 |
 | [SFace](./models/face_recognition_sface) | Face Recognition | 112x112 | 8.65 | 99.20 | 24.88 | 46.25 | --- |
-| [LPD-YuNet](./models/license_plate_detection_yunet/) | License Plate Detection | 320x240 | --- | 168.03 | 56.12 | 154.20\* | |
+| [LPD-YuNet](./models/license_plate_detection_yunet/) | License Plate Detection | 320x240 | --- | 168.03 | 56.12 | 29.53 | |
 | [DB-IC15](./models/text_detection_db) | Text Detection | 640x480 | 142.91 | 2835.91 | 208.41 | --- | --- |
 | [DB-TD500](./models/text_detection_db) | Text Detection | 640x480 | 142.91 | 2841.71 | 210.51 | --- | --- |
 | [CRNN-EN](./models/text_recognition_crnn) | Text Recognition | 100x32 | 50.21 | 234.32 | 196.15 | 125.30 | --- |
@@ -31,8 +31,8 @@ Guidelines:
 | [WeChatQRCode](./models/qrcode_wechatqrcode) | QR Code Detection and Parsing | 100x100 | 7.04 | 37.68 | --- | --- | --- |
 | [DaSiamRPN](./models/object_tracking_dasiamrpn) | Object Tracking | 1280x720 | 36.15 | 705.48 | 76.82 | --- | --- |
 | [YoutuReID](./models/person_reid_youtureid) | Person Re-Identification | 128x256 | 35.81 | 521.98 | 90.07 | 44.61 | --- |
-| [MP-PalmDet](./models/palm_detection_mediapipe) | Palm Detection | 256x256 | 15.57 | 168.37 | 50.64 | 145.56\* | --- |
-| [MP-HandPose](./models/handpose_estimation_mediapipe) | Hand Pose Estimation | 256x256 | 20.16 | 148.24 | 156.30 | 663.77\* | --- |
+| [MP-PalmDet](./models/palm_detection_mediapipe) | Palm Detection | 256x256 | 15.57 | 168.37 | 50.64 | 62.45 | --- |
+| [MP-HandPose](./models/handpose_estimation_mediapipe) | Hand Pose Estimation | 256x256 | 20.16 | 148.24 | 156.30 | 42.70 | --- |
 
 \*: Models are quantized in per-channel mode, which run slower than per-tensor quantized models on NPU.
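The footnote above hinges on the difference between the two granularities: per-tensor quantization keeps a single (scale, zero_point) pair for a whole tensor, while per-channel keeps one pair per output channel. A minimal pure-Python sketch of min-max asymmetric int8 parameter computation under both granularities (an illustration only, not neural-compressor's actual implementation; the function and variable names are made up):

```python
def minmax_asym_int8_params(values):
    """Asymmetric min-max int8 parameters: one scale and one zero point.

    The quantization range must include 0 so that real zero maps exactly.
    """
    lo, hi = min(min(values), 0.0), max(max(values), 0.0)
    scale = (hi - lo) / 255.0 or 1.0      # guard against a degenerate range
    zero_point = round(-lo / scale) - 128  # map lo onto -128
    return scale, zero_point

# Toy 2-channel weight "tensor": a list of per-channel value lists.
weights = [[-1.0, 0.5, 0.25], [0.01, -0.02, 0.03]]

# Per-tensor (what this commit switches to): one pair for everything.
flat = [v for ch in weights for v in ch]
per_tensor = minmax_asym_int8_params(flat)

# Per-channel (what these models used before): one pair per channel,
# finer-grained and often more accurate, but more parameters to handle.
per_channel = [minmax_asym_int8_params(ch) for ch in weights]

print(per_tensor)   # single (scale, zero_point) pair
print(per_channel)  # one pair per channel
```

Per-channel usually preserves accuracy better for weights with very different ranges across channels, but (as the footnote notes) a single per-tensor pair is cheaper to apply on the NPU measured here.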
38
 
models/handpose_estimation_mediapipe/README.md CHANGED
@@ -9,8 +9,6 @@ This model is converted from Tensorflow-JS to ONNX using following tools:
 - tf_saved_model to ONNX: https://github.com/onnx/tensorflow-onnx
 - simplified by [onnx-simplifier](https://github.com/daquexian/onnx-simplifier)
 
-Also note that the model is quantized in per-channel mode with [Intel's neural compressor](https://github.com/intel/neural-compressor), which gives better accuracy but may lose some speed.
-
 ## Demo
 
 Run the following commands to try the demo:
models/palm_detection_mediapipe/README.md CHANGED
@@ -6,8 +6,6 @@ This model detects palm bounding boxes and palm landmarks, and is converted from
 - tf_saved_model to ONNX: https://github.com/onnx/tensorflow-onnx
 - simplified by [onnx-simplifier](https://github.com/daquexian/onnx-simplifier)
 
-Also note that the model is quantized in per-channel mode with [Intel's neural compressor](https://github.com/intel/neural-compressor), which gives better accuracy but may lose some speed.
-
 ## Demo
 
 Run the following commands to try the demo:
tools/quantize/inc_configs/lpd_yunet.yaml CHANGED
@@ -32,6 +32,18 @@ quantization: # optional. tuning constrai
           dtype: float32
           label: True
 
+  model_wise: # optional. tuning constraints on model-wise for advance user to reduce tuning space.
+    weight:
+      granularity: per_tensor
+      scheme: asym
+      dtype: int8
+      algorithm: minmax
+    activation:
+      granularity: per_tensor
+      scheme: asym
+      dtype: int8
+      algorithm: minmax
+
 tuning:
   accuracy_criterion:
     relative: 0.02 # optional. default value is relative, other value is absolute. this example allows relative accuracy loss: 1%.
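The `model_wise` block added above pins every tensor to per-tensor, asymmetric, int8, min-max quantization. A hedged pure-Python sketch of the single affine map those settings imply for a tensor, and of the round-trip error it introduces (illustrative only, not neural-compressor code; all names here are hypothetical):

```python
def quantize(values, scale, zero_point):
    # One affine map for the whole tensor, clamped to the int8 range.
    return [max(-128, min(127, round(v / scale) + zero_point)) for v in values]

def dequantize(q, scale, zero_point):
    return [(x - zero_point) * scale for x in q]

# Toy activation tensor; min-max over the data (range must include 0).
acts = [0.0, 0.123, 0.6, 1.5]
lo, hi = min(acts + [0.0]), max(acts + [0.0])
scale = (hi - lo) / 255.0
zero_point = round(-lo / scale) - 128

q = quantize(acts, scale, zero_point)
recon = dequantize(q, scale, zero_point)

# Per-tensor min-max bounds the round-trip error by one quantization step.
max_err = max(abs(a - b) for a, b in zip(acts, recon))
assert max_err <= scale
```

Under these constraints the tuner no longer explores per-channel variants, which is what makes the resulting models run faster on the NPU column of the README table.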
tools/quantize/inc_configs/mp_handpose.yaml CHANGED
@@ -32,6 +32,18 @@ quantization: # optional. tuning constrai
           dtype: float32
           label: True
 
+  model_wise: # optional. tuning constraints on model-wise for advance user to reduce tuning space.
+    weight:
+      granularity: per_tensor
+      scheme: asym
+      dtype: int8
+      algorithm: minmax
+    activation:
+      granularity: per_tensor
+      scheme: asym
+      dtype: int8
+      algorithm: minmax
+
 tuning:
   accuracy_criterion:
     relative: 0.02 # optional. default value is relative, other value is absolute. this example allows relative accuracy loss: 1%.
tools/quantize/inc_configs/mp_palmdet.yaml CHANGED
@@ -32,6 +32,18 @@ quantization: # optional. tuning constrai
           dtype: float32
           label: True
 
+  model_wise: # optional. tuning constraints on model-wise for advance user to reduce tuning space.
+    weight:
+      granularity: per_tensor
+      scheme: asym
+      dtype: int8
+      algorithm: minmax
+    activation:
+      granularity: per_tensor
+      scheme: asym
+      dtype: int8
+      algorithm: minmax
+
 tuning:
   accuracy_criterion:
     relative: 0.02 # optional. default value is relative, other value is absolute. this example allows relative accuracy loss: 1%.