Add/update the quantized ONNX model files and README.md for Transformers.js v3
#5 opened by whitphx (HF Staff)
Applied Quantizations
✅ Based on decoder_with_past_model.onnx with slimming
↳ ✅ q4f16: decoder_with_past_model_q4f16.onnx (added)

✅ Based on decoder_model.onnx with slimming
↳ ✅ q4f16: decoder_model_q4f16.onnx (added)

✅ Based on encoder_model.onnx with slimming
↳ ✅ q4f16: encoder_model_q4f16.onnx (added)

✅ Based on decoder_model_merged.onnx without slimming
↳ ✅ fp16: decoder_model_merged_fp16.onnx (replaced because it was invalid)
↳ ✅ q4f16: decoder_model_merged_q4f16.onnx (added)
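
The file names above appear to follow a simple pattern: the base model name, an underscore, the quantization type, then `.onnx`. A minimal sketch of that pattern, inferred only from the names in this PR, is shown below. `quantizedFileName` is a hypothetical helper for illustration and not part of Transformers.js.

```typescript
// Quantization types that appear in this PR.
type Dtype = "fp16" | "q4f16";

// Hypothetical helper: derives the quantized file name from a base model
// name, assuming the "<base>_<dtype>.onnx" pattern seen in the list above.
function quantizedFileName(base: string, dtype: Dtype): string {
  // Accept either "decoder_model" or "decoder_model.onnx" as input.
  const stem = base.endsWith(".onnx") ? base.slice(0, -".onnx".length) : base;
  return `${stem}_${dtype}.onnx`;
}

// The four q4f16 files added by this PR.
const addedQ4f16: string[] = [
  "decoder_with_past_model.onnx",
  "decoder_model.onnx",
  "encoder_model.onnx",
  "decoder_model_merged.onnx",
].map((base) => quantizedFileName(base, "q4f16"));

console.log(addedQ4f16.join("\n"));
```

In Transformers.js v3 the variant is normally picked through the `dtype` loading option (for example `dtype: "q4f16"`) rather than by spelling out file names, so a helper like this is mainly useful for checking that the expected files exist in the repository.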
whitphx changed pull request status to closed