This model does not currently work with native ComfyUI INT8, and will need to be converted with this tool.
Unless you have already downloaded this model, I recommend finding a more recent INT8 quant from Silveroxides to avoid having to convert it.
Don't ask me why, but the Silveroxides int8 convrot simple is performing worse than my converted version.
Evidence from my measurements:
8 Samples, 512x512, 20 steps each:

| Metric | INT8Bob | INT8Silver |
|---|---|---|
| MSE ↓ | 0.21857 ±0.01951 ★ | 0.43413 ±0.05470 |
| MAE ↓ | 0.27689 ±0.01402 ★ | 0.42484 ±0.02914 |
| Max err ↓ | 4.60856 ±0.16847 ★ | 5.71161 ±0.20756 |
| Rel-RMSE ↓ | 0.42217 ±0.02033 ★ | 0.61513 ±0.04038 |
| SNR dB ↑ | 9.10 ±0.46 ★ | 4.95 ±0.56 |

16 Samples, 256x256, 20 steps each:

| Metric | INT8Johnson | INT8_Silver |
|---|---|---|
| MSE ↓ | 0.37593 ±0.04121 ★ | 0.59410 ±0.04671 |
| MAE ↓ | 0.39571 ±0.02467 ★ | 0.52660 ±0.02362 |
| Max err ↓ | 4.65401 ±0.19794 ★ | 5.48461 ±0.16060 |
| Rel-RMSE ↓ | 0.54103 ±0.03184 ★ | 0.70424 ±0.02955 |
| SNR dB ↑ | 6.44 ±0.55 ★ | 3.53 ±0.38 |

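The exact definitions behind the metrics in these tables are not stated here, so the following is only a minimal sketch using common textbook definitions. The function name `error_metrics` and the flat-list inputs are hypothetical, and the formulas for Rel-RMSE and SNR in particular are assumptions that may differ from what the measurement script actually computes (for example, it may average per-sample values).

```python
import math

def error_metrics(reference, candidate):
    """Compare a candidate output against a reference output.

    Both arguments are equal-length flat sequences of floats.
    The definitions below are assumed, not taken from the original benchmark.
    """
    if len(reference) != len(candidate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and the same length")

    errors = [c - r for r, c in zip(reference, candidate)]
    n = len(errors)
    sq_err = sum(e * e for e in errors)        # total squared error
    ref_power = sum(r * r for r in reference)  # total reference energy

    mse = sq_err / n
    mae = sum(abs(e) for e in errors) / n
    max_err = max(abs(e) for e in errors)
    # Assumed: RMSE relative to the RMS of the reference.
    rel_rmse = math.sqrt(sq_err / ref_power) if ref_power > 0 else float("inf")
    # Assumed: signal-to-noise ratio as reference energy over error energy.
    snr_db = 10 * math.log10(ref_power / sq_err) if sq_err > 0 else float("inf")

    return {"mse": mse, "mae": mae, "max_err": max_err,
            "rel_rmse": rel_rmse, "snr_db": snr_db}

if __name__ == "__main__":
    ref = [1.0, -2.0, 3.0, -4.0]
    out = [1.5, -2.0, 2.5, -4.0]
    for name, value in error_metrics(ref, out).items():
        print(f"{name}: {value:.5f}")
```

Lower is better for MSE, MAE, Max err, and Rel-RMSE, and higher is better for SNR, which matches the arrows in the tables.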
This is an unavoidable double quantization due to the release state of Ideogram4.
The FP8 weights were cast to FP32 with the FP8 scales, then downcast to BF16 before being converted to INT8.
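To illustrate why this chain counts as double quantization, here is a rough pure-Python sketch of the FP8 → FP32 → BF16 → INT8 path next to a direct INT8 conversion. Everything in it is an assumption made for illustration: the FP8 and BF16 steps are emulated by rounding the significand only (no subnormals or overflow handling), the scales are per-tensor absmax, and the helper names are hypothetical. It is not the conversion tool's actual code.

```python
import math
import random

def round_significand(x, bits):
    """Round x to `bits` significand bits (implicit leading bit included)."""
    if x == 0.0:
        return 0.0
    mantissa, exponent = math.frexp(x)  # x = mantissa * 2**exponent, 0.5 <= |mantissa| < 1
    step = 2 ** bits
    return math.ldexp(round(mantissa * step) / step, exponent)

def int8_quantize(weights):
    """Symmetric per-tensor absmax INT8 quantization (assumed scheme)."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

def rmse(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

random.seed(0)
original = [random.gauss(0.0, 1.0) for _ in range(4096)]  # stand-in weights

# Step 1: emulate the released FP8 weights (E4M3-like: 4 significand bits, max 448).
fp8_scale = max(abs(w) for w in original) / 448.0
fp8_like = [round_significand(w / fp8_scale, 4) for w in original]

# Step 2: cast to FP32 with the FP8 scale, then downcast to BF16-like (8 significand bits).
bf16_like = [round_significand(v * fp8_scale, 8) for v in fp8_like]

# Step 3: convert to INT8 (second quantization).
q_double, scale_double = int8_quantize(bf16_like)

# Hypothetical comparison: INT8 straight from the high-precision weights.
q_direct, scale_direct = int8_quantize(original)

rmse_double = rmse(original, dequantize(q_double, scale_double))
rmse_direct = rmse(original, dequantize(q_direct, scale_direct))
print(f"RMSE via FP8 -> BF16 -> INT8: {rmse_double:.5f}")
print(f"RMSE direct INT8:             {rmse_direct:.5f}")
```

The direct path is only there for comparison; it is not an option for this model, since only the FP8 weights are available as a starting point.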
For use in ComfyUI with https://github.com/BobJohnson24/ComfyUI-INT8-Fast
Speed is 1.78x faster (2.03 s/it) than FP8 (3.62 s/it) on my 3090, without compile.
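For reference, the speedup figure follows directly from the two per-iteration timings. A small sketch of the arithmetic, where the 20-step run length is only an assumed example borrowed from the benchmark settings above:

```python
# Per-iteration timings quoted above (seconds per iteration, no torch compile).
fp8_s_per_it = 3.62
int8_s_per_it = 2.03

# Assumed example run length, not a measured end-to-end time.
steps = 20

speedup = fp8_s_per_it / int8_s_per_it
saved_seconds = steps * (fp8_s_per_it - int8_s_per_it)

print(f"speedup: {speedup:.2f}x")  # speedup: 1.78x
print(f"time saved over {steps} steps: {saved_seconds:.1f} s")  # 31.8 s
```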
~2x faster with torch compile.
After further inspection, it appears there may be quality issues with torch compiling this model.
Quick comparison:
