---
license: apache-2.0
datasets:
- huggan/smithsonian_butterflies_subset
tags:
- unconditional-image-generation
- diffusion
- ddpm
- pytorch
- diffusers
- pixel-art
---

# DDPM for 8-bit Pixel Art Wings (ddpm-pixelwing)

This repository contains a Denoising Diffusion Probabilistic Model (DDPM) trained from scratch to generate 8-bit style pixel art images of wings. This model was built using the Hugging Face [`diffusers`](https://github.com/huggingface/diffusers) library.

The model is "unconditional," meaning it generates random wing designs without any specific text or image prompt. It's a fun tool for artists, game developers, or anyone needing inspiration for pixel art sprites.

## Model Description

Denoising Diffusion Probabilistic Models (DDPMs) are a class of generative models that learn to create data by reversing a gradual noising process. During training, the model learns to remove the noise that was added to real images. At sampling time, it starts from pure Gaussian noise and denoises step by step until a clean, coherent image emerges.
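
To make the "gradual noising process" concrete, here is a minimal pure-Python sketch of the forward step that a DDPM learns to reverse. The linear schedule (1000 steps, betas from 1e-4 to 0.02) uses the defaults from the DDPM paper; it is assumed here for illustration and is not read from this model's scheduler config. The function names are hypothetical.

```python
import math
import random

def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (assumed DDPM paper defaults)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def add_noise(x0, t, alpha_bars, rng):
    """Sample x_t from q(x_t | x_0) = N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I)."""
    signal = math.sqrt(alpha_bars[t])
    noise = math.sqrt(1.0 - alpha_bars[t])
    return [signal * value + noise * rng.gauss(0.0, 1.0) for value in x0]

betas = linear_beta_schedule()
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)

x0 = [1.0, -1.0, 0.5, 0.0]  # toy "image": four pixel values scaled to [-1, 1]
x_mid = add_noise(x0, 499, alpha_bars, rng)   # partly noised
x_last = add_noise(x0, 999, alpha_bars, rng)  # almost pure noise

print("signal kept after first step:", alpha_bars[0])
print("signal kept after last step: ", alpha_bars[-1])
```

The fraction of the original signal that survives shrinks toward zero as `t` grows, which is why sampling can start from pure Gaussian noise.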

This specific model is based on the architecture proposed in the paper [Denoising Diffusion Probabilistic Models](https://arxiv.org/abs/2006.11239) and is implemented as a `UNet2DModel` in the `diffusers` library. It was trained on a custom dataset of 8-bit style wing images.

**Model Architecture:**
- **Class:** `UNet2DModel`
- **`sample_size`**: 32
- **`in_channels`**: 3
- **`out_channels`**: 3
- **`layers_per_block`**: 2
- **`block_out_channels`**: (64, 128, 128, 256)
- **`down_block_types`**: (`DownBlock2D`, `DownBlock2D`, `AttnDownBlock2D`, `DownBlock2D`)
- **`up_block_types`**: (`UpBlock2D`, `AttnUpBlock2D`, `UpBlock2D`, `UpBlock2D`)
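
As a rough illustration of how these settings fit together, the sketch below stores them in a dictionary and prints the spatial resolution each encoder block would see. It assumes that every down block except the last halves the resolution, which is how `diffusers` builds `UNet2DModel` to the best of my knowledge. The helper `encoder_layout` is hypothetical and not part of the library.

```python
# Values copied from the architecture list; the keys match UNet2DModel keyword arguments.
unet_config = {
    "sample_size": 32,
    "in_channels": 3,
    "out_channels": 3,
    "layers_per_block": 2,
    "block_out_channels": (64, 128, 128, 256),
    "down_block_types": ("DownBlock2D", "DownBlock2D", "AttnDownBlock2D", "DownBlock2D"),
    "up_block_types": ("UpBlock2D", "AttnUpBlock2D", "UpBlock2D", "UpBlock2D"),
}

def encoder_layout(config):
    """Return (block_type, out_channels, input_resolution) for each down block.

    Assumption: every down block except the last one halves the spatial size.
    """
    size = config["sample_size"]
    last = len(config["down_block_types"]) - 1
    layout = []
    pairs = zip(config["down_block_types"], config["block_out_channels"])
    for index, (block_type, channels) in enumerate(pairs):
        layout.append((block_type, channels, size))
        if index != last:
            size //= 2
    return layout

for block_type, channels, size in encoder_layout(unet_config):
    print(f"{block_type:<16} {channels:>4} channels at {size}x{size}")

# With diffusers installed, an untrained network of the same shape could be built with:
# from diffusers import UNet2DModel
# model = UNet2DModel(**unet_config)
```

Under that assumption, the single attention block operates on 8x8 feature maps, which keeps self-attention cheap at this image size.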

## Intended Use & Limitations

### Intended Use

This model is primarily intended for creative applications, such as:
-   Generating sprites for 2D games.
-   Creating assets for digital art and design projects.
-   Providing inspiration for pixel artists.

The model can be used as-is for unconditional generation or as a base model for further fine-tuning on a more specific dataset of pixel art.

### Limitations

-   **Resolution:** The model generates images at a low resolution of **32x32 pixels**, consistent with its pixel art training data. Upscaling may be required for certain applications, which could introduce artifacts.
-   **Lack of Control:** This is an unconditional model, so you cannot direct the output with text prompts (e.g., "a fiery wing"). Generation is random.
-   **Artifacts:** As with many generative models, some outputs may contain minor visual artifacts or be less coherent than others. Generate several samples and keep the ones that work best.
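
On the resolution point: for pixel art, nearest-neighbor upscaling by an integer factor usually keeps edges crisp, whereas smoothing filters tend to blur them. The sketch below shows the idea on a plain nested list of RGB tuples. The function `upscale_nearest` is a hypothetical helper written for this example, not part of any library.

```python
def upscale_nearest(pixels, factor):
    """Repeat every pixel `factor` times horizontally and vertically."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    upscaled = []
    for row in pixels:
        # Widen the row first, then emit `factor` independent copies of it.
        wide_row = [value for value in row for _ in range(factor)]
        upscaled.extend(list(wide_row) for _ in range(factor))
    return upscaled

# A 2x2 toy image of RGB tuples standing in for a generated 32x32 sprite.
tiny = [
    [(255, 0, 0), (0, 0, 255)],
    [(0, 255, 0), (255, 255, 255)],
]
big = upscale_nearest(tiny, 4)
print(len(big), "x", len(big[0]))
```

With Pillow, the same effect on a generated image should be available through `image.resize((256, 256), resample=Image.NEAREST)`.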

## How to Use

Inference takes only a few lines of code with the `diffusers` library.

### 1. Installation

First, make sure you have the necessary libraries installed.

```bash
pip install --upgrade diffusers transformers accelerate torch
```

### 2. Inference Pipeline

The following Python script demonstrates how to load the model from the Hugging Face Hub and generate an image.

```python
import torch
from diffusers import DDPMPipeline
from PIL import Image

# For reproducibility
generator = torch.manual_seed(42)

# Load the pretrained pipeline from the Hub
pipeline = DDPMPipeline.from_pretrained("louijiec/ddpm-pixelwing")

# If you have a GPU, move the pipeline to the GPU for faster generation
if torch.cuda.is_available():
    pipeline = pipeline.to("cuda")

print("Pipeline loaded. Starting image generation...")

# Run the generation process
# The pipeline returns an output object whose `images` field is a list of PIL images
result = pipeline(generator=generator, num_inference_steps=1000)
image = result.images[0]  # take the first (and only) image in the batch

# Each entry is a PIL Image, which you can display or save
print("Image generated successfully.")
image.save("pixel_wing.png")

# To generate a batch of images, you can specify `batch_size`
# images = pipeline(batch_size=4, generator=generator).images
# for i, img in enumerate(images):
#     img.save(f"pixel_wing_{i+1}.png")
```

This script will generate a 32x32 pixel art wing and save it as `pixel_wing.png` in your current directory.

## Training Details

This model was trained from scratch. The following provides an overview for those interested in the training process or looking to reproduce it.

-   **Library:** The model was trained using the official `diffusers` [unconditional image generation training script](https://github.com/huggingface/diffusers/tree/main/examples/unconditional_image_generation).

-   **Dataset:** The model was trained on a custom dataset named **"PixelWing"**, consisting of approximately 300 unique 32x32 pixel art images of wings. The images were created and curated specifically for this project.

-   **Training Procedure:**
    -   **Image Resolution:** 32x32
    -   **Epochs:** 200
    -   **Learning Rate:** 1e-4
    -   **Batch Size:** 16
    -   **Gradient Accumulation Steps:** 1
    -   **Optimizer:** AdamW
    -   **Hardware:** Trained on a single NVIDIA T4 GPU (commonly available on Google Colab).
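
For a sense of scale, the hyperparameters above imply a fairly short run. The following back-of-the-envelope calculation assumes exactly 300 images (the card says "approximately") and that the final partial batch of each epoch is kept rather than dropped.

```python
import math

num_images = 300        # approximate dataset size stated above
batch_size = 16
grad_accum_steps = 1
epochs = 200

# Assumption: the last, smaller batch of each epoch is not dropped.
batches_per_epoch = math.ceil(num_images / batch_size)
updates_per_epoch = math.ceil(batches_per_epoch / grad_accum_steps)
total_updates = updates_per_epoch * epochs

print("batches per epoch:", batches_per_epoch)
print("optimizer updates:", total_updates)
```

Under these assumptions the run performs 19 updates per epoch and 3,800 optimizer updates in total.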