Update README.md
README.md
CHANGED
@@ -2,4 +2,117 @@
|
|
2 |
license: apache-2.0
|
3 |
datasets:
|
4 |
- huggan/smithsonian_butterflies_subset
|
5 |
-
2 |
license: apache-2.0
|
3 |
datasets:
|
4 |
- huggan/smithsonian_butterflies_subset
|
5 |
+
tags:
|
6 |
+
- unconditional-image-generation
|
7 |
+
- diffusion
|
8 |
+
- ddpm
|
9 |
+
- pytorch
|
10 |
+
- diffusers
|
11 |
+
- pixel-art
|
12 |
+
---
|
13 |
+
|
14 |
+
# DDPM for 8-bit Pixel Art Wings (ddpm-pixelwing)
|
15 |
+
|
16 |
+
This repository contains a Denoising Diffusion Probabilistic Model (DDPM) trained from scratch to generate 8-bit style pixel art images of wings. This model was built using the Hugging Face [`diffusers`](https://github.com/huggingface/diffusers) library.
|
17 |
+
|
18 |
+
The model is "unconditional," meaning it generates random wing designs without any specific text or image prompt. It's a fun tool for artists, game developers, or anyone needing inspiration for pixel art sprites.
|
19 |
+
|
20 |
+
## Model Description
|
21 |
+
|
22 |
+
Denoising Diffusion Probabilistic Models (DDPMs) are a class of generative models that learn to create data by reversing a gradual noising process. The model learns to denoise an image from pure Gaussian noise, step by step, until a clean, coherent image emerges.
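
The noising process that the model learns to reverse can be sketched in a few lines of plain Python. This is a minimal illustration, not the training code: it assumes the linear beta schedule from the DDPM paper (1000 steps, beta from 1e-4 to 0.02), and the scheduler configuration shipped with this checkpoint may differ. The names `add_noise` and `alpha_bars` are illustrative only.

```python
import math
import random

# Assumed schedule: the linear beta schedule from the DDPM paper
# (T = 1000, beta from 1e-4 to 0.02). The scheduler config of this
# checkpoint may use different values.
NUM_TRAIN_TIMESTEPS = 1000
BETA_START, BETA_END = 1e-4, 0.02

betas = [
    BETA_START + (BETA_END - BETA_START) * t / (NUM_TRAIN_TIMESTEPS - 1)
    for t in range(NUM_TRAIN_TIMESTEPS)
]

# alpha_bars[t] is the cumulative product of (1 - beta) up to step t.
alpha_bars = []
running = 1.0
for beta in betas:
    running *= 1.0 - beta
    alpha_bars.append(running)


def add_noise(pixels, t, rng):
    """Sample x_t from q(x_t | x_0) for a flat list of pixels scaled to [-1, 1]."""
    signal = math.sqrt(alpha_bars[t])
    noise = math.sqrt(1.0 - alpha_bars[t])
    return [signal * p + noise * rng.gauss(0.0, 1.0) for p in pixels]


rng = random.Random(0)
clean = [1.0, -1.0, 0.5, 0.0]  # four toy "pixels"
slightly_noisy = add_noise(clean, 0, rng)
almost_pure_noise = add_noise(clean, NUM_TRAIN_TIMESTEPS - 1, rng)
```

At `t = 0` the sample is almost identical to the clean input, while at the last step the signal weight `sqrt(alpha_bars[t])` is close to zero, so the sample is essentially Gaussian noise. Generation runs this process in reverse, starting from noise.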
|
23 |
+
|
24 |
+
This specific model is based on the architecture proposed in the paper [Denoising Diffusion Probabilistic Models](https://arxiv.org/abs/2006.11239) and is implemented as a `UNet2DModel` in the `diffusers` library. It was trained on a custom dataset of 8-bit style wing images.
|
25 |
+
|
26 |
+
**Model Architecture:**
|
27 |
+
- **Class:** `UNet2DModel`
|
28 |
+
- **`sample_size`**: 32
|
29 |
+
- **`in_channels`**: 3
|
30 |
+
- **`out_channels`**: 3
|
31 |
+
- **`layers_per_block`**: 2
|
32 |
+
- **`block_out_channels`**: (64, 128, 128, 256)
|
33 |
+
- **`down_block_types`**: (`DownBlock2D`, `DownBlock2D`, `AttnDownBlock2D`, `DownBlock2D`)
|
34 |
+
- **`up_block_types`**: (`UpBlock2D`, `AttnUpBlock2D`, `UpBlock2D`, `UpBlock2D`)
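
For readers who want to recreate an untrained network of the same shape, the configuration above might be written as follows. This is a sketch: `build_unet` and `down_path_resolutions` are hypothetical helper names, any argument not listed above is left at the `diffusers` default, and the resolution trace assumes that every down block except the last halves the feature map.

```python
# Values copied from the architecture list above.
UNET_CONFIG = {
    "sample_size": 32,
    "in_channels": 3,
    "out_channels": 3,
    "layers_per_block": 2,
    "block_out_channels": (64, 128, 128, 256),
    "down_block_types": ("DownBlock2D", "DownBlock2D", "AttnDownBlock2D", "DownBlock2D"),
    "up_block_types": ("UpBlock2D", "AttnUpBlock2D", "UpBlock2D", "UpBlock2D"),
}


def build_unet():
    """Create a randomly initialised UNet2DModel with the listed configuration."""
    # Imported lazily so the config can be inspected without diffusers installed.
    from diffusers import UNet2DModel

    return UNet2DModel(**UNET_CONFIG)


def down_path_resolutions(config):
    """Trace the feature-map size seen by each down block.

    Assumes every down block except the last halves the resolution.
    """
    size = config["sample_size"]
    last_index = len(config["down_block_types"]) - 1
    sizes = []
    for index in range(last_index + 1):
        sizes.append(size)
        if index < last_index:
            size //= 2
    return sizes
```

Calling `build_unet()` requires `diffusers` and `torch`. Under the stated assumption, `down_path_resolutions(UNET_CONFIG)` gives 32, 16, 8, 4, which would place the single attention block of the down path at 8x8 resolution.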
|
35 |
+
|
36 |
+
## Intended Use & Limitations
|
37 |
+
|
38 |
+
### Intended Use
|
39 |
+
|
40 |
+
This model is primarily intended for creative applications, such as:
|
41 |
+
- Generating sprites for 2D games.
|
42 |
+
- Creating assets for digital art and design projects.
|
43 |
+
- Providing inspiration for pixel artists.
|
44 |
+
|
45 |
+
The model can be used as-is for unconditional generation or as a base model for further fine-tuning on a more specific dataset of pixel art.
|
46 |
+
|
47 |
+
### Limitations
|
48 |
+
|
49 |
+
- **Resolution:** The model generates images at a low resolution of **32x32 pixels**, consistent with its pixel art training data. Upscaling may be required for certain applications, which could introduce artifacts.
|
50 |
+
- **Lack of Control:** This is an unconditional model, so you cannot direct the output with text prompts (e.g., "a fiery wing"). Generation is random.
|
51 |
+
- **Artifacts:** Like many generative models, this one sometimes produces outputs with minor visual artifacts or less coherent shapes. Generate several samples and keep the best ones.
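
On the resolution point, one common way to enlarge pixel art without blurring it is nearest-neighbour resampling by an integer factor. The sketch below uses Pillow. `upscale_pixel_art` is a hypothetical helper, and the solid-colour image only stands in for a real 32x32 output.

```python
from PIL import Image


def upscale_pixel_art(image, factor=8):
    """Enlarge an image by an integer factor without blurring pixel edges."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    new_size = (image.width * factor, image.height * factor)
    # Nearest-neighbour resampling copies pixels instead of interpolating,
    # so each source pixel becomes a crisp factor x factor square.
    return image.resize(new_size, resample=Image.NEAREST)


# Stand-in for a generated 32x32 output.
sample = Image.new("RGB", (32, 32), color=(200, 60, 120))
large = upscale_pixel_art(sample, factor=8)
```

Smoothing filters such as bilinear or bicubic tend to soften the hard edges that define pixel art, which is why nearest-neighbour is used here.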
|
52 |
+
|
53 |
+
## How to Use
|
54 |
+
|
55 |
+
Inference takes only a few lines of code with the `diffusers` library.
|
56 |
+
|
57 |
+
### 1. Installation
|
58 |
+
|
59 |
+
First, make sure you have the necessary libraries installed.
|
60 |
+
|
61 |
+
```bash
|
62 |
+
pip install --upgrade diffusers transformers accelerate torch
|
63 |
+
```
|
64 |
+
|
65 |
+
### 2. Inference Pipeline
|
66 |
+
|
67 |
+
The following Python script demonstrates how to load the model from the Hugging Face Hub and generate an image.
|
68 |
+
|
69 |
+
```python
|
70 |
+
import torch
|
71 |
+
from diffusers import DDPMPipeline
|
72 |
+
from PIL import Image
|
73 |
+
|
74 |
+
# For reproducibility
|
75 |
+
generator = torch.manual_seed(42)
|
76 |
+
|
77 |
+
# Load the pretrained pipeline from the Hub
|
78 |
+
pipeline = DDPMPipeline.from_pretrained("louijiec/ddpm-pixelwing")
|
79 |
+
|
80 |
+
# If you have a GPU, move the pipeline to the GPU for faster generation
|
81 |
+
if torch.cuda.is_available():
|
82 |
+
    pipeline = pipeline.to("cuda")
|
83 |
+
|
84 |
+
print("Pipeline loaded. Starting image generation...")
|
85 |
+
|
86 |
+
# Run the generation process
|
87 |
+
# The pipeline returns an output object whose `.images` attribute is a list of PIL images
|
88 |
+
result = pipeline(generator=generator, num_inference_steps=1000)
|
89 |
+
image = result.images[0]
|
90 |
+
|
91 |
+
# The output is a PIL Image, which you can display or save
|
92 |
+
print("Image generated successfully.")
|
93 |
+
image.save("pixel_wing.png")
|
94 |
+
|
95 |
+
# To generate a batch of images, you can specify `batch_size`
|
96 |
+
# images = pipeline(batch_size=4, generator=generator).images
|
97 |
+
# for i, img in enumerate(images):
|
98 |
+
#     img.save(f"pixel_wing_{i+1}.png")
|
99 |
+
```
|
100 |
+
|
101 |
+
This script will generate a 32x32 pixel art wing and save it as `pixel_wing.png` in your current directory.
|
102 |
+
|
103 |
+
## Training Details
|
104 |
+
|
105 |
+
This model was trained from scratch. The following provides an overview for those interested in the training process or looking to reproduce it.
|
106 |
+
|
107 |
+
- **Library:** The model was trained using the official `diffusers` [unconditional image generation training script](https://github.com/huggingface/diffusers/tree/main/examples/unconditional_image_generation).
|
108 |
+
|
109 |
+
- **Dataset:** The model was trained on a custom dataset named **"PixelWing"**, consisting of approximately 300 unique 32x32 pixel art images of wings. The images were created and curated specifically for this project.
|
110 |
+
|
111 |
+
- **Training Procedure:**
|
112 |
+
- **Image Resolution:** 32x32
|
113 |
+
- **Epochs:** 200
|
114 |
+
- **Learning Rate:** 1e-4
|
115 |
+
- **Batch Size:** 16
|
116 |
+
- **Gradient Accumulation Steps:** 1
|
117 |
+
- **Optimizer:** AdamW
|
118 |
+
- **Hardware:** Trained on a single NVIDIA T4 GPU (commonly available on Google Colab).
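
As a rough sense of scale, the hyperparameters above imply the following number of optimizer updates. The calculation treats the dataset size as exactly 300 images (the card says approximately 300) and assumes the last partial batch of each epoch is kept, so the real count may differ slightly.

```python
import math

# Figures from the list above; the dataset size is approximate.
NUM_IMAGES = 300
BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 1
EPOCHS = 200

# Assumes the final, smaller batch of each epoch is not dropped.
batches_per_epoch = math.ceil(NUM_IMAGES / BATCH_SIZE)
updates_per_epoch = math.ceil(batches_per_epoch / GRAD_ACCUM_STEPS)
total_updates = updates_per_epoch * EPOCHS

print(f"{batches_per_epoch} batches per epoch, {total_updates} optimizer updates in total")
```

Under these assumptions that is 19 batches per epoch and 3800 updates overall.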
|