 license: apache-2.0
 datasets:
 - huggan/smithsonian_butterflies_subset
+tags:
+- unconditional-image-generation
+- diffusion
+- ddpm
+- pytorch
+- diffusers
+- pixel-art
+---
+# DDPM for 8-bit Pixel Art Wings (ddpm-pixelwing)
+This repository contains a Denoising Diffusion Probabilistic Model (DDPM) trained from scratch to generate 8-bit-style pixel art images of wings. It was built with the Hugging Face [`diffusers`](https://github.com/huggingface/diffusers) library.
+The model is "unconditional," meaning it generates random wing designs without any specific text or image prompt. It's a fun tool for artists, game developers, or anyone needing inspiration for pixel art sprites.
+## Model Description
+Denoising Diffusion Probabilistic Models (DDPMs) are a class of generative models that learn to create data by reversing a gradual noising process. During training, Gaussian noise is added to the images and the network learns to remove it; at sampling time, generation starts from pure Gaussian noise and is denoised step by step until a clean, coherent image emerges.
+This specific model is based on the architecture proposed in the paper [Denoising Diffusion Probabilistic Models](https://arxiv.org/abs/2006.11239) and is implemented as a `UNet2DModel` in the `diffusers` library. It was trained on a custom dataset of 8-bit style wing images.
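The gradual noising process that the model learns to reverse has a simple closed form, sketched below without any dependencies. This is an illustration only: it assumes the linear beta schedule from the DDPM paper (1000 steps, beta from 1e-4 to 0.02), which may differ from the scheduler settings stored with this model, and the function names are hypothetical.

```python
import math
import random


def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (assumed values from the DDPM paper)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, t, alpha_bars, rng):
    """Sample x_t ~ q(x_t | x_0) for a flat list of pixel values in [-1, 1]."""
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    return [signal * v + sigma * rng.gauss(0.0, 1.0) for v in x0]


betas = linear_beta_schedule()
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)

x0 = [1.0, -1.0, 0.5, 0.0]                     # a toy 4-pixel "image"
x_early = add_noise(x0, 10, alpha_bars, rng)   # still close to x0
x_late = add_noise(x0, 999, alpha_bars, rng)   # almost pure Gaussian noise

print(f"alpha_bar at t=10:  {alpha_bars[10]:.4f}")
print(f"alpha_bar at t=999: {alpha_bars[999]:.6f}")
```

As `alpha_bar` falls toward zero, the original image contributes less and less to `x_t`; sampling runs this trajectory in reverse.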
+**Model Architecture:**
+- **Class:** `UNet2DModel`
+- **`sample_size`**: 32
+- **`in_channels`**: 3
+- **`out_channels`**: 3
+- **`layers_per_block`**: 2
+- **`block_out_channels`**: (64, 128, 128, 256)
+- **`down_block_types`**: (`DownBlock2D`, `DownBlock2D`, `AttnDownBlock2D`, `DownBlock2D`)
+- **`up_block_types`**: (`UpBlock2D`, `AttnUpBlock2D`, `UpBlock2D`, `UpBlock2D`)
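The configuration listed above can be written as keyword arguments to `UNet2DModel`. The sketch below is not the author's training code: `UNET_CONFIG`, `build_unet`, and `down_block_input_sizes` are hypothetical names, and the resolution trace assumes the `diffusers` behavior in which every down block except the last halves the spatial size.

```python
# Architecture values copied from the list above.
UNET_CONFIG = {
    "sample_size": 32,
    "in_channels": 3,
    "out_channels": 3,
    "layers_per_block": 2,
    "block_out_channels": (64, 128, 128, 256),
    "down_block_types": (
        "DownBlock2D", "DownBlock2D", "AttnDownBlock2D", "DownBlock2D",
    ),
    "up_block_types": (
        "UpBlock2D", "AttnUpBlock2D", "UpBlock2D", "UpBlock2D",
    ),
}


def build_unet(config=UNET_CONFIG):
    """Instantiate a randomly initialized UNet2DModel (hypothetical helper)."""
    # Imported lazily so the config can be inspected without diffusers installed.
    from diffusers import UNet2DModel
    return UNet2DModel(**config)


def down_block_input_sizes(config=UNET_CONFIG):
    """Spatial size entering each down block (assumes halving after all but the last)."""
    sizes, size = [], config["sample_size"]
    for _ in config["down_block_types"]:
        sizes.append(size)
        size //= 2
    return sizes


for name, size in zip(UNET_CONFIG["down_block_types"], down_block_input_sizes()):
    print(f"{name:>16}: {size}x{size}")
```

Under that assumption, the single attention block on the down path operates on 8x8 feature maps. Calling `build_unet()` requires `diffusers` and `torch` and returns untrained weights; use `DDPMPipeline.from_pretrained` to get the trained model.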
+## Intended Use & Limitations
+### Intended Use
+This model is primarily intended for creative applications, such as:
+-   Generating sprites for 2D games.
+-   Creating assets for digital art and design projects.
+-   Providing inspiration for pixel artists.
+The model can be used as-is for unconditional generation or as a base model for further fine-tuning on a more specific dataset of pixel art.
+### Limitations
+-   **Resolution:** The model generates images at **32x32 pixels**, matching its pixel art training data. Larger assets require upscaling; smoothing filters such as bilinear or bicubic blur the hard pixel edges, whereas nearest-neighbor resampling at an integer scale factor preserves them.
+-   **Lack of Control:** This is an unconditional model, so you cannot direct the output with text prompts (e.g., "a fiery wing"). Generation is random.
+-   **Artifacts:** As with many generative models, some outputs contain minor visual artifacts or are less coherent than others. Generate several samples and keep the best ones.
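For the resolution limitation, one common approach is to enlarge sprites with nearest-neighbor resampling so each source pixel becomes a crisp square. The sketch below uses Pillow; `upscale_pixel_art` is a hypothetical helper and the blank test sprite stands in for a generated image.

```python
from PIL import Image


def upscale_pixel_art(image, factor=8):
    """Enlarge by an integer factor with nearest-neighbor resampling (hypothetical helper)."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    width, height = image.size
    return image.resize((width * factor, height * factor), resample=Image.NEAREST)


# Stand-in for a generated sample: a black 32x32 image with one red pixel.
sprite = Image.new("RGB", (32, 32), (0, 0, 0))
sprite.putpixel((0, 0), (255, 0, 0))

large = upscale_pixel_art(sprite, factor=8)
print(large.size)
```

With `factor=8`, the single red pixel becomes an 8x8 red square in the top-left corner of a 256x256 image, with no blending into neighboring pixels.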
+## How to Use
+Inference takes only a few lines of code with the `diffusers` library.
+### 1. Installation
+First, make sure you have the necessary libraries installed.
+```bash
+pip install --upgrade diffusers transformers accelerate torch
+```
+### 2. Inference Pipeline
+The following Python script demonstrates how to load the model from the Hugging Face Hub and generate an image.
+```python
+import torch
+from diffusers import DDPMPipeline
+
+# Seed a generator for reproducible samples
+generator = torch.manual_seed(42)
+
+# Load the pretrained pipeline from the Hub
+pipeline = DDPMPipeline.from_pretrained("louijiec/ddpm-pixelwing")
+
+# If you have a GPU, move the pipeline to it for faster generation
+if torch.cuda.is_available():
+    pipeline = pipeline.to("cuda")
+print("Pipeline loaded. Starting image generation...")
+
+# Run the generation process with 1000 denoising steps.
+# The pipeline returns an output object whose `.images` field is a list of PIL images.
+result = pipeline(generator=generator, num_inference_steps=1000)
+image = result.images[0]
+
+# Save the single generated image (batch_size defaults to 1)
+image.save("pixel_wing.png")
+print("Image generated successfully.")
+
+# To generate a batch of images, you can specify `batch_size`
+# images = pipeline(batch_size=4, generator=generator).images
+# for i, img in enumerate(images):
+#     img.save(f"pixel_wing_{i+1}.png")
+```
+This script will generate a 32x32 pixel art wing and save it as `pixel_wing.png` in your current directory.
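When generating a batch, it can be convenient to review the samples side by side. The sketch below pastes equally sized PIL images into a single sheet; `make_grid` is a hypothetical helper, not part of `diffusers`, and the flat-colored tiles stand in for the list returned in `.images`.

```python
from PIL import Image


def make_grid(images, columns):
    """Paste equally sized PIL images into one sheet, row by row (hypothetical helper)."""
    if not images:
        raise ValueError("images must not be empty")
    width, height = images[0].size
    rows = (len(images) + columns - 1) // columns  # ceiling division
    sheet = Image.new("RGB", (columns * width, rows * height))
    for index, img in enumerate(images):
        x = (index % columns) * width
        y = (index // columns) * height
        sheet.paste(img, (x, y))
    return sheet


# Stand-ins for `pipeline(batch_size=4).images`: four flat-colored 32x32 tiles.
tiles = [Image.new("RGB", (32, 32), (60 * i, 0, 0)) for i in range(4)]
sheet = make_grid(tiles, columns=2)
print(sheet.size)
# sheet.save("pixel_wing_grid.png")
```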
+## Training Details
+This model was trained from scratch. The following provides an overview for those interested in the training process or looking to reproduce it.
+-   **Library:** The model was trained using the official `diffusers` [unconditional image generation training script](https://github.com/huggingface/diffusers/tree/main/examples/unconditional_image_generation).
+-   **Dataset:** The model was trained on a custom dataset named **"PixelWing"**, consisting of approximately 300 unique 32x32 pixel art images of wings. The images were created and curated specifically for this project.
+-   **Training Procedure:**
+    -   **Image Resolution:** 32x32
+    -   **Epochs:** 200
+    -   **Learning Rate:** 1e-4
+    -   **Batch Size:** 16
+    -   **Gradient Accumulation Steps:** 1
+    -   **Optimizer:** AdamW
+    -   **Hardware:** Trained on a single NVIDIA T4 GPU (commonly available on Google Colab).
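The hyperparameters above imply a fairly short training run. The arithmetic below is a rough estimate, assuming exactly 300 images (the card says "approximately 300") and that the final partial batch of each epoch is kept rather than dropped.

```python
import math

num_images = 300          # assumption: "approximately 300" taken as exactly 300
batch_size = 16
grad_accum_steps = 1
epochs = 200

# Batches per epoch, counting the last partial batch.
batches_per_epoch = math.ceil(num_images / batch_size)
# One optimizer update per `grad_accum_steps` batches.
updates_per_epoch = math.ceil(batches_per_epoch / grad_accum_steps)
total_updates = updates_per_epoch * epochs

print(batches_per_epoch, total_updates)  # 19 3800
```

Under these assumptions the model sees 19 batches per epoch and about 3,800 optimizer updates in total.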