Update model card with new training details
README.md
CHANGED
|
@@ -63,11 +63,49 @@ Ergo, at 1300 steps, the decision was made to cease training on the original LAI
|
|
| 63 |
|
| 64 |
This consisted of 17,800 images at a base resolution of 1024x1024, with about 700 samples in portrait and 700 samples in landscape.
|
| 65 |
|
| 66 |
-
##
| 67 |
|
| 68 |
Similar to the text encoder swap, the images showed a marked improvement over the next several checkpoints.
|
| 69 |
|
| 70 |
-
| 71 |
|
| 72 |
This model has been packaged up in a test form so that it can be thoroughly assessed by users.
|
| 73 |
|
|
|
|
| 63 |
|
| 64 |
This consisted of 17,800 images at a base resolution of 1024x1024, with about 700 samples in portrait and 700 samples in landscape.
|
| 65 |
|
| 66 |
+
## Contrast issues
|
| 67 |
+
|
| 68 |
+
When checkpoint 3275 was tested, a common observation was that darker images were washed out and brighter images seemed "meh".
|
| 69 |
+
|
| 70 |
+
Various CFG rescale and guidance levels were tested. The best dark images occurred around `guidance_scale=9.2` and `guidance_rescale=0.0`, but they remained "washed out".
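To make the two knobs above concrete, here is a minimal sketch of classifier-free guidance with the optional rescale step, written in plain Python over flat lists of floats. The function name and the list-based inputs are illustrative assumptions, not the sampling code used for this model; real pipelines apply the same arithmetic to tensors.

```python
from statistics import pstdev


def classifier_free_guidance(uncond, text, guidance_scale, guidance_rescale=0.0):
    """Combine unconditional and text-conditioned predictions.

    `uncond` and `text` are flat lists of floats standing in for the two
    model outputs (a hypothetical simplification of real tensors).
    """
    # Standard CFG: push the prediction away from the unconditional output.
    cfg = [u + guidance_scale * (t - u) for u, t in zip(uncond, text)]
    if guidance_rescale == 0.0:
        # With guidance_rescale=0.0 the result is plain CFG.
        return cfg

    std_cfg = pstdev(cfg)
    if std_cfg == 0.0:
        return cfg
    # Rescale so the guided prediction has the same spread as the
    # text-conditioned prediction, then blend with the plain CFG result.
    ratio = pstdev(text) / std_cfg
    return [
        guidance_rescale * (c * ratio) + (1.0 - guidance_rescale) * c
        for c in cfg
    ]
```

At `guidance_rescale=0.0` the rescale branch is skipped entirely, so the setting reported above corresponds to ordinary guidance at scale 9.2.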
|
| 71 |
+
|
| 72 |
+
## Dataset change number two
|
| 73 |
+
|
| 74 |
+
A new LAION subset was prepared with unique images and no square images - just a limited collection of aspect ratios:
|
| 75 |
+
|
| 76 |
+
* 16:9
|
| 77 |
+
* 9:16
|
| 78 |
+
* 2:3
|
| 79 |
+
* 3:2
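As a rough illustration of how images could be sorted into the four ratios above while square images are rejected, the sketch below assigns an image to its nearest bucket. The function name, the tolerance value, and the log-ratio distance are assumptions made for this example; the source does not describe how the subset was actually filtered.

```python
import math

# The four aspect ratios listed above, expressed as width / height.
ASPECT_BUCKETS = {"16:9": 16 / 9, "9:16": 9 / 16, "2:3": 2 / 3, "3:2": 3 / 2}


def assign_bucket(width, height, tolerance=0.05):
    """Return the name of the closest bucket, or None if nothing is close.

    `tolerance` is a hypothetical threshold on the absolute log-ratio
    between the image's aspect ratio and the bucket's aspect ratio.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    ratio = width / height
    # Log distance treats "too wide" and "too tall" symmetrically.
    name, target = min(
        ASPECT_BUCKETS.items(),
        key=lambda item: abs(math.log(ratio / item[1])),
    )
    if abs(math.log(ratio / target)) > tolerance:
        return None  # e.g. square images match no bucket
    return name
```

For example, `assign_bucket(1920, 1080)` returns `"16:9"`, while `assign_bucket(1024, 1024)` returns `None`, so a square image would be dropped.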
|
| 80 |
+
|
| 81 |
+
This was intended to help the model learn faster and to prevent overfitting on captions.
|
| 82 |
+
|
| 83 |
+
This LAION subset contained 17,800 images, evenly distributed across the four aspect ratios.
|
| 84 |
+
|
| 85 |
+
The images were then captioned using BLIP2 with a Flan-T5 language model, with the aim of obtaining highly accurate captions.
|
| 86 |
+
|
| 87 |
+
## Contrast fix: offset noise / SNR gamma to the rescue?
|
| 88 |
+
|
| 89 |
+
Offset noise and SNR gamma were applied experimentally to checkpoint **4250**:
|
| 90 |
+
|
| 91 |
+
* `snr_gamma=5.0`
|
| 92 |
+
* `noise_offset=0.2`
|
| 93 |
+
* `noise_pertubation=0.1`
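The first two settings above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the trainer's code: the function names are hypothetical, the weight uses the epsilon-prediction form of min-SNR weighting (the source does not say which prediction type was trained), and `noise_pertubation` is not covered.

```python
import random


def min_snr_weight(snr, snr_gamma=5.0):
    """Min-SNR loss weight for a timestep, epsilon-prediction form.

    The weight is min(snr, gamma) / snr, so high-SNR (low-noise)
    timesteps are down-weighted. The formula divides by snr, so it is
    undefined when snr is zero.
    """
    if snr <= 0:
        raise ValueError("snr must be positive for this formula")
    return min(snr, snr_gamma) / snr


def offset_noise(channels, height, width, noise_offset=0.2, rng=None):
    """Gaussian noise plus one random constant shift per channel.

    The per-channel shift lets training noise change the mean brightness
    of a sample, which plain per-pixel noise rarely does.
    """
    rng = rng or random.Random()
    sample = []
    for _ in range(channels):
        shift = noise_offset * rng.gauss(0.0, 1.0)
        sample.append(
            [[rng.gauss(0.0, 1.0) + shift for _ in range(width)]
             for _ in range(height)]
        )
    return sample
```

With `snr_gamma=5.0`, a timestep with SNR 10 gets weight 0.5, while any timestep with SNR at or below 5 keeps weight 1.0.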
|
| 94 |
+
|
| 95 |
+
Within 25 steps of training, the contrast was back, and the prompt `a solid black square` once again produced a reasonable result.
|
| 96 |
+
|
| 97 |
+
At 50 steps of offset noise, things really seemed to "click" and `a solid black square` had the fewest deformities I've seen.
|
| 98 |
+
|
| 99 |
+
The step 75 checkpoint was broken. The SNR gamma math resulted in numeric instability, so SNR gamma was disabled. The offset noise parameters were left untouched.
|
| 100 |
+
|
| 101 |
+
## Success! Improvement in quality and contrast.
|
| 102 |
|
| 103 |
Similar to the text encoder swap, the images showed a marked improvement over the next several checkpoints.
|
| 104 |
|
| 105 |
+
Training was left to run on its own, and at step 4475, enough improvement was observed that another revision was created in this repository.
|
| 106 |
+
|
| 107 |
+
|
| 108 |
+
# Status: Test release
|
| 109 |
|
| 110 |
This model has been packaged up in a test form so that it can be thoroughly assessed by users.
|
| 111 |
|