louijiec committed on
Commit
5ac470c
·
verified ·
1 Parent(s): 7a8f188

Update README.md

Files changed (1)
  1. README.md +114 -1
README.md CHANGED
@@ -2,4 +2,117 @@
2
  license: apache-2.0
3
  datasets:
4
  - huggan/smithsonian_butterflies_subset
5
- ---
5
+ tags:
6
+ - unconditional-image-generation
7
+ - diffusion
8
+ - ddpm
9
+ - pytorch
10
+ - diffusers
11
+ - pixel-art
12
+ ---
13
+
14
+ # DDPM for 8-bit Pixel Art Wings (ddpm-pixelwing)
15
+
16
+ This repository contains a Denoising Diffusion Probabilistic Model (DDPM) trained from scratch to generate 8-bit style pixel art images of wings. This model was built using the Hugging Face [`diffusers`](https://github.com/huggingface/diffusers) library.
17
+
18
+ The model is "unconditional," meaning it generates random wing designs without any specific text or image prompt. It's a fun tool for artists, game developers, or anyone needing inspiration for pixel art sprites.
19
+
20
+ ## Model Description
21
+
22
+ Denoising Diffusion Probabilistic Models (DDPMs) are a class of generative models that learn to create data by reversing a gradual noising process. The model learns to denoise an image from pure Gaussian noise, step by step, until a clean, coherent image emerges.
23
+
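The noising process that the model learns to reverse can be sketched in a few lines of plain Python. This is only an illustration: the linear beta schedule and its endpoints are assumed from the DDPM paper's defaults (the README does not state which schedule was used), and `add_noise`, `betas`, and `alpha_bars` are hypothetical names, not part of `diffusers`.

```python
import math
import random

# Assumption: linear beta schedule with the DDPM paper's default endpoints.
NUM_STEPS = 1000
BETA_START, BETA_END = 1e-4, 0.02

betas = [
    BETA_START + (BETA_END - BETA_START) * t / (NUM_STEPS - 1)
    for t in range(NUM_STEPS)
]

# alpha_bars[t] is the running product of (1 - beta): the fraction of the
# original signal variance that survives after t + 1 noising steps.
alpha_bars = []
running = 1.0
for beta in betas:
    running *= 1.0 - beta
    alpha_bars.append(running)


def add_noise(pixels, t, rng):
    """Sample x_t ~ q(x_t | x_0) for a flat list of pixel values in [-1, 1]."""
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    return [signal_scale * p + noise_scale * rng.gauss(0.0, 1.0) for p in pixels]


rng = random.Random(0)
x0 = [1.0, -1.0, 0.5, -0.5]
slightly_noisy = add_noise(x0, 10, rng)
almost_pure_noise = add_noise(x0, NUM_STEPS - 1, rng)
```

At the final step almost none of the original signal remains, which is why sampling can start from pure Gaussian noise.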
24
+ This specific model is based on the architecture proposed in the paper [Denoising Diffusion Probabilistic Models](https://arxiv.org/abs/2006.11239) and is implemented as a `UNet2DModel` in the `diffusers` library. It was trained on a custom dataset of 8-bit style wing images.
25
+
26
+ **Model Architecture:**
27
+ - **Class:** `UNet2DModel`
28
+ - **`sample_size`**: 32
29
+ - **`in_channels`**: 3
30
+ - **`out_channels`**: 3
31
+ - **`layers_per_block`**: 2
32
+ - **`block_out_channels`**: (64, 128, 128, 256)
33
+ - **`down_block_types`**: (`DownBlock2D`, `DownBlock2D`, `AttnDownBlock2D`, `DownBlock2D`)
34
+ - **`up_block_types`**: (`UpBlock2D`, `AttnUpBlock2D`, `UpBlock2D`, `UpBlock2D`)
35
+
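As a rough sketch, the hyperparameters listed above map directly onto `UNet2DModel` keyword arguments. The names `unet_config`, `build_unet`, and `block_resolutions` are illustrative, and the per-block resolutions assume that every down block except the last halves the spatial size.

```python
# Hyperparameters copied from the architecture list above.
unet_config = {
    "sample_size": 32,
    "in_channels": 3,
    "out_channels": 3,
    "layers_per_block": 2,
    "block_out_channels": (64, 128, 128, 256),
    "down_block_types": (
        "DownBlock2D",
        "DownBlock2D",
        "AttnDownBlock2D",
        "DownBlock2D",
    ),
    "up_block_types": (
        "UpBlock2D",
        "AttnUpBlock2D",
        "UpBlock2D",
        "UpBlock2D",
    ),
}


def build_unet():
    """Instantiate the network; requires `diffusers` and `torch`."""
    # Imported lazily so the config can be inspected without diffusers installed.
    from diffusers import UNet2DModel

    return UNet2DModel(**unet_config)


# Assumption: each down block except the last halves the resolution,
# so block i operates on feature maps of size sample_size / 2**i.
block_resolutions = [
    unet_config["sample_size"] // 2**i
    for i in range(len(unet_config["block_out_channels"]))
]
attention_resolution = block_resolutions[
    unet_config["down_block_types"].index("AttnDownBlock2D")
]
```

Under that assumption the single attention block runs on 8x8 feature maps, which keeps self-attention cheap for 32x32 inputs.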
36
+ ## Intended Use & Limitations
37
+
38
+ ### Intended Use
39
+
40
+ This model is primarily intended for creative applications, such as:
41
+ - Generating sprites for 2D games.
42
+ - Creating assets for digital art and design projects.
43
+ - Providing inspiration for pixel artists.
44
+
45
+ The model can be used as-is for unconditional generation or as a base model for further fine-tuning on a more specific dataset of pixel art.
46
+
47
+ ### Limitations
48
+
49
+ - **Resolution:** The model generates images at a low resolution of **32x32 pixels**, consistent with its pixel art training data. Upscaling may be required for certain applications, which could introduce artifacts.
50
+ - **Lack of Control:** This is an unconditional model, so you cannot direct the output with text prompts (e.g., "a fiery wing"). Generation is random.
51
+ - **Artifacts:** Like many generative models, some outputs may contain minor visual artifacts or be less coherent than others. Running the generation process multiple times is encouraged to get a variety of high-quality results.
52
+
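For pixel art, nearest-neighbor upscaling by an integer factor usually preserves hard edges better than smooth interpolation. The idea is sketched below on a plain nested list; `upscale_nearest` is a hypothetical helper for illustration only. For real images, Pillow's `Image.resize` with `resample=Image.NEAREST` should give the equivalent result.

```python
def upscale_nearest(grid, factor):
    """Repeat every pixel `factor` times horizontally and vertically."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    upscaled = []
    for row in grid:
        # Widen the row by repeating each pixel, then stack copies of it.
        wide_row = [pixel for pixel in row for _ in range(factor)]
        upscaled.extend(list(wide_row) for _ in range(factor))
    return upscaled


tiny = [[0, 1],
        [2, 3]]
big = upscale_nearest(tiny, 2)
```

Each source pixel becomes a solid `factor` x `factor` square, so a 32x32 output scaled by 8 gives a 256x256 image with no blurred edges.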
53
+ ## How to Use
54
+
55
+ You can easily use this model for inference with just a few lines of code using the `diffusers` library.
56
+
57
+ ### 1. Installation
58
+
59
+ First, make sure you have the necessary libraries installed.
60
+
61
+ ```bash
62
+ pip install --upgrade diffusers transformers accelerate torch
63
+ ```
64
+
65
+ ### 2. Inference Pipeline
66
+
67
+ The following Python script demonstrates how to load the model from the Hugging Face Hub and generate an image.
68
+
69
+ ```python
70
+ import torch
71
+ from diffusers import DDPMPipeline
72
+ from PIL import Image
73
+
74
+ # For reproducibility
75
+ generator = torch.manual_seed(42)
76
+
77
+ # Load the pretrained pipeline from the Hub
78
+ pipeline = DDPMPipeline.from_pretrained("louijiec/ddpm-pixelwing")
79
+
80
+ # If you have a GPU, move the pipeline to the GPU for faster generation
81
+ if torch.cuda.is_available():
82
+     pipeline = pipeline.to("cuda")
83
+
84
+ print("Pipeline loaded. Starting image generation...")
85
+
86
+ # Run the generation process
87
+ # The pipeline returns an output object whose `images` attribute is a list of PIL images
88
+ result = pipeline(generator=generator, num_inference_steps=1000)
89
+ image = result.images[0]
90
+
91
+ # The output is a PIL Image, which you can display or save
92
+ print("Image generated successfully.")
93
+ image.save("pixel_wing.png")
94
+
95
+ # To generate a batch of images, you can specify `batch_size`
96
+ # images = pipeline(batch_size=4, generator=generator).images
97
+ # for i, img in enumerate(images):
98
+ #     img.save(f"pixel_wing_{i+1}.png")
99
+ ```
100
+
101
+ This script will generate a 32x32 pixel art wing and save it as `pixel_wing.png` in your current directory.
102
+
103
+ ## Training Details
104
+
105
+ This model was trained from scratch. The following provides an overview for those interested in the training process or looking to reproduce it.
106
+
107
+ - **Library:** The model was trained using the official `diffusers` [unconditional image generation training script](https://github.com/huggingface/diffusers/tree/main/examples/unconditional_image_generation).
108
+
109
+ - **Dataset:** The model was trained on a custom dataset named **"PixelWing"**, consisting of approximately 300 unique 32x32 pixel art images of wings. The images were created and curated specifically for this project.
110
+
111
+ - **Training Procedure:**
112
+   - **Image Resolution:** 32x32
113
+   - **Epochs:** 200
114
+   - **Learning Rate:** 1e-4
115
+   - **Batch Size:** 16
116
+   - **Gradient Accumulation Steps:** 1
117
+   - **Optimizer:** AdamW
118
+ - **Hardware:** Trained on a single NVIDIA T4 GPU (commonly available on Google Colab).
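
The settings above imply a fairly short training run. A back-of-the-envelope estimate of the number of optimizer updates, assuming exactly 300 images (the README says "approximately 300") and that the final partial batch of each epoch is kept:

```python
import math

NUM_IMAGES = 300      # assumption: the README only says "approximately 300"
BATCH_SIZE = 16
EPOCHS = 200
GRAD_ACCUM_STEPS = 1

# Assumption: the last, smaller batch of each epoch is not dropped.
batches_per_epoch = math.ceil(NUM_IMAGES / BATCH_SIZE)
updates_per_epoch = math.ceil(batches_per_epoch / GRAD_ACCUM_STEPS)
total_updates = updates_per_epoch * EPOCHS

print(f"{batches_per_epoch} batches per epoch, {total_updates} optimizer updates in total")
```

With these numbers the run comes to 19 batches per epoch and 3,800 updates overall.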