Add/update the quantized ONNX model files and README.md for Transformers.js v3
#3 opened by whitphx (HF Staff)
Applied Quantizations
✅ Based on decoder_with_past_model.onnx with slimming
↳ q4f16 (added)
✅ Based on decoder_model.onnx with slimming
↳ q4f16 (added)
✅ Based on encoder_model.onnx with slimming
↳ q4f16 (added)
✅ Based on decoder_model_merged.onnx without slimming
↳ fp16 (replaced because it was invalid)
↳ q4f16 (added)
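As a rough illustration of what the list above implies for the repository contents, the sketch below derives the expected file name for each base model and quantization. It assumes the Transformers.js v3 convention of appending a dtype suffix (`_fp16`, `_q4f16`) before the `.onnx` extension; the PR does not state the resulting file names, and `quantizedFileName` is a hypothetical helper, not part of any library.

```typescript
// Assumed dtype-to-suffix convention (not stated in this PR).
const DTYPE_SUFFIX: Record<string, string> = {
  fp32: "",
  fp16: "_fp16",
  q4f16: "_q4f16",
};

// Hypothetical helper: build the quantized file name from a base ONNX file name.
function quantizedFileName(baseFile: string, dtype: string): string {
  const suffix = DTYPE_SUFFIX[dtype];
  if (suffix === undefined) {
    throw new Error(`Unknown dtype: ${dtype}`);
  }
  return baseFile.replace(/\.onnx$/, `${suffix}.onnx`);
}

// Base models and the quantizations listed in this PR.
const applied: Array<[string, string[]]> = [
  ["decoder_with_past_model.onnx", ["q4f16"]],
  ["decoder_model.onnx", ["q4f16"]],
  ["encoder_model.onnx", ["q4f16"]],
  ["decoder_model_merged.onnx", ["fp16", "q4f16"]],
];

// Collect the expected file names in list order.
const expectedFiles: string[] = [];
for (const [base, dtypes] of applied) {
  for (const dtype of dtypes) {
    expectedFiles.push(quantizedFileName(base, dtype));
  }
}

console.log(expectedFiles.join("\n"));
```

Under that assumed convention, the five entries in the list correspond to five files, with `decoder_model_merged.onnx` contributing both an fp16 and a q4f16 variant.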
whitphx changed pull request status to closed