Quantization with ONNXRUNTIME
ONNXRUNTIME is used for quantization in the Zoo.
Install dependencies before trying quantization:
pip install -r requirements.txt
Usage
Quantize all models in the Zoo:
python quantize.py
Quantize one of the models in the Zoo:
# python quantize.py <key_in_models>
python quantize.py yunet
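The argument is a key of the `models` dict in quantize.py. As a rough illustration of that lookup (a hypothetical sketch, not the actual code in quantize.py; `select_models` and the placeholder dict values are invented here), passing no key selects every entry and passing a key selects just that one:

```python
def select_models(models, keys):
    """Return the entries to quantize: all of them when no keys are given."""
    if not keys:
        return dict(models)
    unknown = [k for k in keys if k not in models]
    if unknown:
        raise KeyError(f"unknown model key(s): {unknown}; available: {sorted(models)}")
    return {k: models[k] for k in keys}


# Placeholder registry for illustration; in quantize.py the values are Quantize(...) objects.
models = {"yunet": "yunet-config", "model1": "model1-config"}

print(list(select_models(models, [])))         # no key given: every model
print(list(select_models(models, ["yunet"])))  # one key given: only that model
```

Rejecting unknown keys up front, as assumed here, turns a typo into a clear error before any model is processed.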
Customizing quantization configs:
# add the model to the `models` dict in quantize.py
models = dict(
    # ...
    model1=Quantize(model_path='/path/to/model1.onnx',
                    calibration_image_dir='/path/to/images',
                    transforms=Compose([''' transforms ''']), # transforms can be found in transforms.py
                    per_channel=False, # False: per-tensor quantization; True: per-channel
                    act_type='int8', # activation type, one of: 'int8', 'uint8'
                    wt_type='int8' # weight type, one of: 'int8', 'uint8'
    )
)
Then quantize the added model:
python quantize.py model1
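To see what `per_channel` and the `'int8'`/`'uint8'` types control, the sketch below computes affine quantization parameters (scale and zero point) for a toy weight tensor, once for the whole tensor and once per output channel. It is a simplified, standard-library illustration of the general idea, not ONNXRUNTIME's implementation; `quant_params`, `fake_quantize`, and the weight values are made up for this example.

```python
def quant_params(values, qtype="int8"):
    """Return (scale, zero_point, qmin, qmax) covering the range of `values`."""
    qmin, qmax = (-128, 127) if qtype == "int8" else (0, 255)
    lo = min(min(values), 0.0)  # the representable range must include 0
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 for all-zero input
    zero_point = round(qmin - lo / scale)
    return scale, zero_point, qmin, qmax


def fake_quantize(x, scale, zero_point, qmin, qmax):
    """Quantize x to an integer in [qmin, qmax], then map it back to a float."""
    q = max(qmin, min(qmax, round(x / scale) + zero_point))
    return (q - zero_point) * scale


# Two output channels with very different value ranges (illustrative numbers).
weights = [[0.5, -1.0, 0.25], [0.01, -0.02, 0.015]]

# Per-tensor: one set of parameters shared by every value.
per_tensor = quant_params([w for ch in weights for w in ch])
# Per-channel: one set of parameters for each output channel.
per_channel = [quant_params(ch) for ch in weights]

x = weights[1][0]  # a small weight from the narrow-range channel
err_tensor = abs(x - fake_quantize(x, *per_tensor))
err_channel = abs(x - fake_quantize(x, *per_channel[1]))
print(f"per-tensor error:  {err_tensor:.6f}")
print(f"per-channel error: {err_channel:.6f}")
```

In this toy case the channel with small weights gets its own, much finer scale under per-channel quantization, so its rounding error shrinks. The cost is one scale and zero point per channel instead of one for the whole tensor.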