Wanli committed
Commit 323da84 · 1 Parent(s): 90edc6d

Re-quantize some models from per_channel mode to per_tensor mode (#90)

* re-quantize some models from per_channel mode to per_tensor mode
* remove the description about per_channel
README.md CHANGED
@@ -19,7 +19,7 @@ Guidelines:
 | ---------------------------------------------------- | ----------------------------- | ---------- | -------------- | ------------ | --------------- | ------------ | ----------- |
 | [YuNet](./models/face_detection_yunet) | Face Detection | 160x120 | 1.45 | 6.22 | 12.18 | 4.04 | 86.69 |
 | [SFace](./models/face_recognition_sface) | Face Recognition | 112x112 | 8.65 | 99.20 | 24.88 | 46.25 | --- |
-| [LPD-YuNet](./models/license_plate_detection_yunet/) | License Plate Detection | 320x240 | --- | 168.03 | 56.12 | 154.20\* | |
+| [LPD-YuNet](./models/license_plate_detection_yunet/) | License Plate Detection | 320x240 | --- | 168.03 | 56.12 | 29.53 | |
 | [DB-IC15](./models/text_detection_db) | Text Detection | 640x480 | 142.91 | 2835.91 | 208.41 | --- | --- |
 | [DB-TD500](./models/text_detection_db) | Text Detection | 640x480 | 142.91 | 2841.71 | 210.51 | --- | --- |
 | [CRNN-EN](./models/text_recognition_crnn) | Text Recognition | 100x32 | 50.21 | 234.32 | 196.15 | 125.30 | --- |
@@ -31,8 +31,8 @@ Guidelines:
 | [WeChatQRCode](./models/qrcode_wechatqrcode) | QR Code Detection and Parsing | 100x100 | 7.04 | 37.68 | --- | --- | --- |
 | [DaSiamRPN](./models/object_tracking_dasiamrpn) | Object Tracking | 1280x720 | 36.15 | 705.48 | 76.82 | --- | --- |
 | [YoutuReID](./models/person_reid_youtureid) | Person Re-Identification | 128x256 | 35.81 | 521.98 | 90.07 | 44.61 | --- |
-| [MP-PalmDet](./models/palm_detection_mediapipe) | Palm Detection | 256x256 | 15.57 | 168.37 | 50.64 | 145.56\* | --- |
-| [MP-HandPose](./models/handpose_estimation_mediapipe) | Hand Pose Estimation | 256x256 | 20.16 | 148.24 | 156.30 | 663.77\* | --- |
+| [MP-PalmDet](./models/palm_detection_mediapipe) | Palm Detection | 256x256 | 15.57 | 168.37 | 50.64 | 62.45 | --- |
+| [MP-HandPose](./models/handpose_estimation_mediapipe) | Hand Pose Estimation | 256x256 | 20.16 | 148.24 | 156.30 | 42.70 | --- |
 
 \*: Models are quantized in per-channel mode, which run slower than per-tensor quantized models on NPU.
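The footnote above hinges on the difference between the two granularities: per-tensor quantization keeps a single (scale, zero_point) pair for a whole tensor, while per-channel keeps one pair per output channel. A minimal pure-Python sketch of min-max asymmetric int8 parameter computation under both granularities (an illustration only, not neural-compressor's actual implementation; the function and variable names are made up):

```python
def minmax_asym_int8_params(values):
    """Asymmetric min-max int8 parameters: one scale and one zero point.

    The quantization range must include 0 so that real zero maps exactly.
    """
    lo, hi = min(min(values), 0.0), max(max(values), 0.0)
    scale = (hi - lo) / 255.0 or 1.0      # guard against a degenerate range
    zero_point = round(-lo / scale) - 128  # map lo onto -128
    return scale, zero_point

# Toy 2-channel weight "tensor": a list of per-channel value lists.
weights = [[-1.0, 0.5, 0.25], [0.01, -0.02, 0.03]]

# Per-tensor (what this commit switches to): one pair for everything.
flat = [v for ch in weights for v in ch]
per_tensor = minmax_asym_int8_params(flat)

# Per-channel (what these models used before): one pair per channel,
# finer-grained and often more accurate, but more parameters to handle.
per_channel = [minmax_asym_int8_params(ch) for ch in weights]

print(per_tensor)   # single (scale, zero_point) pair
print(per_channel)  # one pair per channel
```

Per-channel usually preserves accuracy better for weights with very different ranges across channels, but (as the footnote notes) a single per-tensor pair is cheaper to apply on the NPU measured here.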
38
 
models/handpose_estimation_mediapipe/README.md CHANGED
@@ -9,8 +9,6 @@ This model is converted from Tensorflow-JS to ONNX using following tools:
 - tf_saved_model to ONNX: https://github.com/onnx/tensorflow-onnx
 - simplified by [onnx-simplifier](https://github.com/daquexian/onnx-simplifier)
 
-Also note that the model is quantized in per-channel mode with [Intel's neural compressor](https://github.com/intel/neural-compressor), which gives better accuracy but may lose some speed.
-
 ## Demo
 
 Run the following commands to try the demo:
models/palm_detection_mediapipe/README.md CHANGED
@@ -6,8 +6,6 @@ This model detects palm bounding boxes and palm landmarks, and is converted from
 - tf_saved_model to ONNX: https://github.com/onnx/tensorflow-onnx
 - simplified by [onnx-simplifier](https://github.com/daquexian/onnx-simplifier)
 
-Also note that the model is quantized in per-channel mode with [Intel's neural compressor](https://github.com/intel/neural-compressor), which gives better accuracy but may lose some speed.
-
 ## Demo
 
 Run the following commands to try the demo:
tools/quantize/inc_configs/lpd_yunet.yaml CHANGED
@@ -32,6 +32,18 @@ quantization: # optional. tuning constrai
           dtype: float32
           label: True
 
+  model_wise: # optional. tuning constraints on model-wise for advance user to reduce tuning space.
+    weight:
+      granularity: per_tensor
+      scheme: asym
+      dtype: int8
+      algorithm: minmax
+    activation:
+      granularity: per_tensor
+      scheme: asym
+      dtype: int8
+      algorithm: minmax
+
 tuning:
   accuracy_criterion:
     relative: 0.02 # optional. default value is relative, other value is absolute. this example allows relative accuracy loss: 1%.
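The `model_wise` block added above pins every tensor to per-tensor, asymmetric, int8, min-max quantization. A hedged pure-Python sketch of the single affine map those settings imply for a tensor, and of the round-trip error it introduces (illustrative only, not neural-compressor code; all names here are hypothetical):

```python
def quantize(values, scale, zero_point):
    # One affine map for the whole tensor, clamped to the int8 range.
    return [max(-128, min(127, round(v / scale) + zero_point)) for v in values]

def dequantize(q, scale, zero_point):
    return [(x - zero_point) * scale for x in q]

# Toy activation tensor; min-max over the data (range must include 0).
acts = [0.0, 0.123, 0.6, 1.5]
lo, hi = min(acts + [0.0]), max(acts + [0.0])
scale = (hi - lo) / 255.0
zero_point = round(-lo / scale) - 128

q = quantize(acts, scale, zero_point)
recon = dequantize(q, scale, zero_point)

# Per-tensor min-max bounds the round-trip error by one quantization step.
max_err = max(abs(a - b) for a, b in zip(acts, recon))
assert max_err <= scale
```

Under these constraints the tuner no longer explores per-channel variants, which is what makes the resulting models run faster on the NPU column of the README table.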
tools/quantize/inc_configs/mp_handpose.yaml CHANGED
@@ -32,6 +32,18 @@ quantization: # optional. tuning constrai
           dtype: float32
           label: True
 
+  model_wise: # optional. tuning constraints on model-wise for advance user to reduce tuning space.
+    weight:
+      granularity: per_tensor
+      scheme: asym
+      dtype: int8
+      algorithm: minmax
+    activation:
+      granularity: per_tensor
+      scheme: asym
+      dtype: int8
+      algorithm: minmax
+
 tuning:
   accuracy_criterion:
     relative: 0.02 # optional. default value is relative, other value is absolute. this example allows relative accuracy loss: 1%.
tools/quantize/inc_configs/mp_palmdet.yaml CHANGED
@@ -32,6 +32,18 @@ quantization: # optional. tuning constrai
           dtype: float32
           label: True
 
+  model_wise: # optional. tuning constraints on model-wise for advance user to reduce tuning space.
+    weight:
+      granularity: per_tensor
+      scheme: asym
+      dtype: int8
+      algorithm: minmax
+    activation:
+      granularity: per_tensor
+      scheme: asym
+      dtype: int8
+      algorithm: minmax
+
 tuning:
   accuracy_criterion:
     relative: 0.02 # optional. default value is relative, other value is absolute. this example allows relative accuracy loss: 1%.