bghira committed on
Commit a48db75 · verified · 1 Parent(s): 5664117

Model card auto-generated by SimpleTuner

Files changed (1): README.md added (+132, −0)
---
license: other
base_model: "stabilityai/stable-diffusion-3.5-medium"
tags:
- sd3
- sd3-diffusers
- text-to-image
- image-to-image
- diffusers
- simpletuner
- not-for-all-audiences
- lora
- controlnet
- template:sd-lora
- standard
pipeline_tag: text-to-image
inference: true
widget:
- text: 'A photo-realistic image of a cat'
  parameters:
    negative_prompt: 'ugly, cropped, blurry, low-quality, mediocre average'
  output:
    url: ./assets/image_0_0.png
---

# sd3-controlnet-lora-test

This is a ControlNet PEFT LoRA derived from [stabilityai/stable-diffusion-3.5-medium](https://huggingface.co/stabilityai/stable-diffusion-3.5-medium).

The main validation prompt used during training was:

```
A photo-realistic image of a cat
```

## Validation settings

- CFG: `4.0`
- CFG Rescale: `0.0`
- Steps: `16`
- Sampler: `FlowMatchEulerDiscreteScheduler`
- Seed: `42`
- Resolution: `1024x1024`
- Skip-layer guidance:

Note: The validation settings are not necessarily the same as the [training settings](#training-settings).
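
To reproduce the sampler configuration named above in your own pipeline, a minimal sketch is below. The `shift=3.0` override is taken from the flow-matching prediction type listed under [training settings](#training-settings); whether you need to set it explicitly at inference time is an assumption, as the base pipeline's shipped scheduler config may already be equivalent.

```python
import torch
from diffusers import DiffusionPipeline, FlowMatchEulerDiscreteScheduler

# Load the base model and swap in the validation sampler.
pipeline = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-3.5-medium", torch_dtype=torch.bfloat16
)

# Rebuild the scheduler from the pipeline's own config, overriding the
# flow-matching shift (3.0 matches the training settings on this card).
pipeline.scheduler = FlowMatchEulerDiscreteScheduler.from_config(
    pipeline.scheduler.config, shift=3.0
)
```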

You can find some example images in the following gallery:

<Gallery />

The text encoder **was not** trained. You may reuse the base model text encoder for inference.

## Training settings

- Training epochs: 1
- Training steps: 50
- Learning rate: 0.0001
- Learning rate schedule: constant
- Warmup steps: 500
- Max grad value: 2.0
- Effective batch size: 1
- Micro-batch size: 1
- Gradient accumulation steps: 1
- Number of GPUs: 1
- Gradient checkpointing: True
- Prediction type: flow_matching (extra parameters=['shift=3.0'])
- Optimizer: adamw_bf16
- Trainable parameter precision: Pure BF16
- Base model precision: `int8-quanto`
- Caption dropout probability: 0.0%

- LoRA Rank: 64
- LoRA Alpha: 64.0
- LoRA Dropout: 0.1
- LoRA initialisation style: default
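
For reference, these hyperparameters correspond roughly to a PEFT `LoraConfig` like the sketch below. The `target_modules` list is hypothetical, included only for illustration (SimpleTuner selects the actual module list internally); the rank, alpha, and dropout values come straight from the card above.

```python
from peft import LoraConfig

# A minimal sketch of the equivalent PEFT configuration.
lora_config = LoraConfig(
    r=64,                    # LoRA Rank
    lora_alpha=64,           # LoRA Alpha
    lora_dropout=0.1,        # LoRA Dropout
    init_lora_weights=True,  # "default" initialisation style
    # Assumed attention projections, for illustration only:
    target_modules=["to_q", "to_k", "to_v", "to_out.0"],
)
```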

## Datasets

### antelope-data-256

- Repeats: 0
- Total number of images: 29
- Total number of aspect buckets: 1
- Resolution: 0.065536 megapixels (65,536 pixels, i.e. a 256x256 square; see the check below)
- Cropped: True
- Crop style: center
- Crop aspect: square
- Used for regularisation data: No
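
The megapixel figure can be sanity-checked with a couple of lines of Python; this is just arithmetic on the value above, not part of the SimpleTuner tooling.

```python
import math

megapixels = 0.065536
pixels = round(megapixels * 1_000_000)  # 65536 total pixels
side = math.isqrt(pixels)              # square crop, so side = sqrt(pixels)
print(side, side * side)               # 256 65536
```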

## Inference
```python
import torch
from diffusers import DiffusionPipeline

model_id = 'stabilityai/stable-diffusion-3.5-medium'
adapter_id = 'bghira/sd3-controlnet-lora-test'

# Load the base model directly in bf16 and attach the LoRA adapter.
pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
pipeline.load_lora_weights(adapter_id)

prompt = "A photo-realistic image of a cat"
negative_prompt = 'ugly, cropped, blurry, low-quality, mediocre average'

# Optional: quantise the model to save on VRAM.
# Note: the model was quantised during training, so it is recommended to do
# the same at inference time.
from optimum.quanto import quantize, freeze, qint8
quantize(pipeline.transformer, weights=qint8)
freeze(pipeline.transformer)

# Pick the best available device; the pipeline is already in its target precision.
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'
pipeline.to(device)

model_output = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=16,
    generator=torch.Generator(device=device).manual_seed(42),
    width=1024,
    height=1024,
    guidance_scale=4.0,
).images[0]

model_output.save("output.png", format="PNG")
```
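
If you want to vary the adapter's strength relative to the base model, a minimal sketch using the diffusers multi-adapter API is below. The adapter name is arbitrary, and `set_adapters` requires a reasonably recent diffusers/PEFT install, so treat this as an assumption rather than part of the auto-generated card.

```python
# Hypothetical: load under an explicit name instead of the plain call above,
# then scale the LoRA's influence down to 0.8.
pipeline.load_lora_weights(adapter_id, adapter_name="sd3_lora")
pipeline.set_adapters(["sd3_lora"], adapter_weights=[0.8])

# Or bake the adapter into the base weights for slightly faster inference:
# pipeline.fuse_lora(lora_scale=0.8)
```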