# Quantization with ONNXRUNTIME and Neural Compressor
[ONNXRUNTIME](https://github.com/microsoft/onnxruntime) and [Neural Compressor](https://github.com/intel/neural-compressor) are used for quantization in the Zoo.
Install dependencies before running quantization:
```shell
pip install -r requirements.txt
```
## Usage
Quantize all models in the Zoo:
```shell
python quantize-ort.py
python quantize-inc.py
```
Quantize one of the models in the Zoo:
```shell
# python quantize-<ort|inc>.py <key_in_models>
python quantize-ort.py yunet
python quantize-inc.py mobilenetv1
```
Customizing quantization configs:

### Quantize with ONNXRUNTIME

Step 1: Add your model to the `models` dict in quantize-ort.py:
```python
models = dict(
    # ...
    model1=Quantize(model_path='/path/to/model1.onnx',
                    calibration_image_dir='/path/to/images',
                    transforms=Compose([''' transforms ''']),  # transforms can be found in transforms.py
                    per_channel=False,  # set False to quantize in per-tensor style
                    act_type='int8',    # available types: 'int8', 'uint8'
                    wt_type='int8'),    # available types: 'int8', 'uint8'
)
```
Step 2: Quantize your model:
```shell
python quantize-ort.py model1
```
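For context, quantize-ort.py builds on ONNXRUNTIME's static quantization API (`onnxruntime.quantization.quantize_static`). Below is a minimal, self-contained sketch of that flow; the `ImageDataReader` class, the input name `'input'`, the image size and the file paths are illustrative placeholders, not the Zoo's actual code:
```python
import os

import cv2
import numpy as np
from onnxruntime.quantization import (CalibrationDataReader, QuantType,
                                      quantize_static)


class ImageDataReader(CalibrationDataReader):
    """Feeds preprocessed calibration images to the quantizer one at a time."""

    def __init__(self, image_dir, input_name, size=(224, 224)):
        self.image_dir = image_dir
        self.input_name = input_name
        self.size = size
        self.files = iter(os.listdir(image_dir))

    def get_next(self):
        fname = next(self.files, None)
        if fname is None:
            return None  # signals the end of the calibration data
        img = cv2.imread(os.path.join(self.image_dir, fname))
        img = cv2.resize(img, self.size).astype(np.float32)
        blob = img.transpose(2, 0, 1)[np.newaxis, ...]  # HWC -> NCHW
        return {self.input_name: blob}


quantize_static(model_input='/path/to/model1.onnx',
                model_output='/path/to/model1_int8.onnx',
                calibration_data_reader=ImageDataReader('/path/to/images', 'input'),
                per_channel=False,                # per-tensor quantization
                activation_type=QuantType.QInt8,  # act_type='int8'
                weight_type=QuantType.QInt8)      # wt_type='int8'
```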
### Quantize with Intel Neural Compressor

Step 1: Add your model to the `models` dict in quantize-inc.py:
```python
models = dict(
    # ...
    model1=Quantize(model_path='/path/to/model1.onnx',
                    config_path='/path/to/model1.yaml'),
)
```
Step 2: Prepare your YAML config model1.yaml (see existing configs in ./inc_configs for reference).

Step 3: Quantize your model:
```shell
python quantize-inc.py model1
```
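Internally, quantize-inc.py drives Neural Compressor from the YAML config. A minimal sketch of that flow, assuming the 1.x `experimental` API that these YAML configs belong to (all paths are placeholders):
```python
from neural_compressor.experimental import Quantization, common

# model1.yaml selects the framework (e.g. onnxrt_qlinearops), the
# calibration settings and the accuracy-tuning policy.
quantizer = Quantization('/path/to/model1.yaml')
quantizer.model = common.Model('/path/to/model1.onnx')
q_model = quantizer()  # calibrates, quantizes and tunes per the YAML config
q_model.save('/path/to/model1_int8.onnx')
```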