---
title: Ghibli Stable Diffusion Synthesis
emoji: 🎨
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 5.39.0
python_version: 3.11.11
app_file: app.py
pinned: false
license: mit
---
# Ghibli Fine-Tuned Stable Diffusion 2.1

## Table of Contents
- Introduction
- Key Features
- Training Notebook
- Dataset
- Base Model
- Installation
- Usage
- Training Hyperparameters
- Metrics
- Environment
- Demonstration
- Contact
- License
- Acknowledgements
- Contributing
## Introduction
The Ghibli Fine-Tuned Stable Diffusion 2.1 project uses deep learning to generate images in the iconic art style of Studio Ghibli. By fine-tuning the Stable Diffusion 2.1 model, it enables the creation of images that capture the vibrant colors, intricate details, and whimsical charm of Ghibli films. The repository includes a Jupyter notebook for training, an interactive Gradio demo for real-time image generation, and instructions for setup and usage. It is aimed at data scientists, developers, and Ghibli enthusiasts who want to combine technology and artistry.
## Key Features
- Fine-Tuned Model: A Stable Diffusion 2.1 model optimized for Studio Ghibli’s art style, delivering authentic and high-quality image outputs.
- Comprehensive Training Notebook: A detailed Jupyter notebook that guides users through the fine-tuning process, compatible with multiple platforms.
- Interactive Gradio Demo: A user-friendly interface for generating Ghibli-style images, showcasing the model’s capabilities in real-time.
- Secure Data Handling: Encrypted dataset and model files using git-crypt, ensuring data integrity and controlled access.
- Cross-Platform Compatibility: Support for Google Colab, Amazon SageMaker, Deepnote, and JupyterLab, providing flexibility for all users.
## Training Notebook

The cornerstone of this project is the Jupyter notebook located at `notebooks/fine_tuned_sd_2_1_base-notebook.ipynb`. This notebook provides a step-by-step guide to fine-tuning the Stable Diffusion 2.1 model on the Ghibli dataset, complete with code, explanations, and best practices. It is designed to be accessible to both beginners and experienced practitioners, offering the flexibility to replicate the training process or experiment with custom modifications. The notebook is compatible with Google Colab, Amazon SageMaker, Deepnote, and JupyterLab.

To get started, open the notebook in your preferred platform and follow the instructions to set up the environment and execute the training process.
## Dataset
The project utilizes the Ghibli Dataset from Hugging Face, a carefully curated collection of images from Studio Ghibli films. This dataset encapsulates the unique visual style of Ghibli, featuring vibrant colors, intricate landscapes, and whimsical characters, making it ideal for fine-tuning the model.
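If you want to inspect the dataset outside of the training notebook, the snippet below is a minimal sketch using the Hugging Face `datasets` library; the dataset identifier shown is a placeholder, so substitute the actual dataset repository used by the project.

```python
# Minimal sketch: browse the Ghibli dataset with the `datasets` library.
# NOTE: "your-namespace/ghibli-dataset" is a placeholder ID, not the project's actual dataset repo.
from datasets import load_dataset

ds = load_dataset("your-namespace/ghibli-dataset", split="train")  # placeholder dataset ID

print(ds)              # number of rows and column names
sample = ds[0]
print(sample.keys())   # typically an image column and, for text-to-image training, a caption column
```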
## Base Model
The fine-tuning process is built upon the Stable Diffusion 2.1 Base model by Stability AI. This robust text-to-image model provides a solid foundation, enabling high-fidelity image generation with targeted fine-tuning to achieve the Ghibli aesthetic.
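For reference, the base checkpoint can be loaded directly with the `diffusers` library before any fine-tuning; this is a quick sanity-check sketch, assuming a CUDA-capable GPU.

```python
# Sanity check: load the Stable Diffusion 2.1 Base checkpoint that this project fine-tunes.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1-base",   # base model from Stability AI
    torch_dtype=torch.float16,                 # assumes a CUDA GPU; use float32 on CPU
).to("cuda")

image = pipe("a quiet hillside town at dusk, ghibli style",
             num_inference_steps=30, guidance_scale=7.5).images[0]
image.save("base_model_sample.png")
```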
## Installation
To set up the project, ensure you have Python 3.11 or later installed. The following steps guide you through cloning the repository, installing dependencies, and preparing encrypted data.
### Step 1: Clone the Repository

Clone the repository from GitHub:

```bash
git clone https://github.com/danhtran2mind/ghibli-fine-tuned-sd-2.1.git
cd ghibli-fine-tuned-sd-2.1
```
### Step 2: Install Dependencies

Install the required Python libraries listed in `requirements.txt`:

```bash
pip install -r requirements.txt
```
### Step 3: Decrypt Encrypted Folders (if necessary)

The `dataset` and `diffusers` folders are encrypted using git-crypt for security. To decrypt them, obtain the decryption key by contacting the maintainer via the Issues tab. Then, run:

```bash
git-crypt unlock /path/to/my-repo.asc
```

Replace `/path/to/my-repo.asc` with the path to your decryption key file. Ensure git-crypt is installed and configured for the repository.
## Usage
The project supports two primary use cases: training the model using the Jupyter notebook and generating images with the Gradio demo.
### Running the Training Notebook

The training notebook (`notebooks/fine_tuned_sd_2_1_base-notebook.ipynb`) is the core component for fine-tuning the model. To run it:
- Open the notebook in your preferred platform (Colab, SageMaker, Deepnote, or JupyterLab).
- Follow the setup instructions within the notebook to configure the environment.
- Execute the cells sequentially to train the model or experiment with custom hyperparameters.
The notebook includes detailed comments and explanations, making it easy to understand and modify the training process.
### Running the Gradio Demo

To generate Ghibli-style images using the Gradio demo, follow these steps:

1. **Navigate to the Repository Root:**

   ```bash
   cd ghibli-fine-tuned-sd-2.1
   ```

2. **Download the Fine-Tuned Model:** Download the model weights to the `ghibli-fine-tuned-sd-2.1` folder:

   ```bash
   cd ghibli-fine-tuned-sd-2.1
   python download_model.py
   cd ..
   ```

   The `download_model.py` script retrieves the model from the Hugging Face repository.

3. **Extract the Dataset:** Download and extract the Ghibli dataset to the `dataset` folder:

   ```bash
   cd dataset
   pip install datasets
   python extract_files.py
   cd ..
   ```

4. **Extract the Diffusers Folder:** Extract the model weights or related files in the `diffusers` folder:

   ```bash
   cd diffusers
   python extract_files.py
   cd ..
   ```

   The `extract_files.py` script handles the extraction process.

5. **Run the Gradio App:** Launch the Gradio demo to interact with the model:

   ```bash
   python app.py --local_model True
   ```

   The demo will be available at `localhost:7860`. Use `--local_model True` for the local model or `False` to download from Hugging Face.
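For orientation, the following is a minimal sketch of what a Gradio text-to-image demo like `app.py` typically looks like; the `--local_model` handling mirrors the usage above, while the local weights folder and the Hugging Face model ID are illustrative assumptions rather than the project's actual values.

```python
# Illustrative sketch of a Gradio text-to-image demo (not the project's actual app.py).
import argparse

import gradio as gr
import torch
from diffusers import StableDiffusionPipeline

parser = argparse.ArgumentParser()
# Mirrors `python app.py --local_model True`; the value is kept as a string on purpose,
# because argparse's type=bool would treat any non-empty string (including "False") as True.
parser.add_argument("--local_model", default="False")
args = parser.parse_args()
use_local = args.local_model.lower() == "true"

# Assumption: local weights live in ./diffusers; the hub ID below is a placeholder.
model_source = "./diffusers" if use_local else "your-namespace/ghibli-fine-tuned-sd-2.1"

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32
pipe = StableDiffusionPipeline.from_pretrained(model_source, torch_dtype=dtype).to(device)

def generate(prompt: str, steps: float = 30, guidance: float = 7.5):
    """Generate one Ghibli-style image from a text prompt."""
    return pipe(prompt, num_inference_steps=int(steps), guidance_scale=guidance).images[0]

demo = gr.Interface(
    fn=generate,
    inputs=[gr.Textbox(label="Prompt"),
            gr.Slider(10, 50, value=30, step=1, label="Inference steps"),
            gr.Slider(1.0, 15.0, value=7.5, label="Guidance scale")],
    outputs=gr.Image(label="Generated image"),
    title="Ghibli Fine-Tuned Stable Diffusion 2.1",
)
demo.launch()  # serves on localhost:7860 by default
```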
## Training Hyperparameters
The fine-tuning process was optimized with the following hyperparameters:
| Hyperparameter | Value |
|---|---|
| learning_rate | 1e-05 |
| num_train_epochs | 40 |
| train_batch_size | 2 |
| gradient_accumulation_steps | 2 |
| mixed_precision | "fp16" |
| resolution | 512 |
| max_grad_norm | 1 |
| lr_scheduler | "constant" |
| lr_warmup_steps | 0 |
| checkpoints_total_limit | 1 |
| use_ema | True |
| use_8bit_adam | True |
| center_crop | True |
| random_flip | True |
| gradient_checkpointing | True |
These parameters were carefully selected to balance training efficiency and model performance, leveraging techniques like mixed precision and gradient checkpointing.
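To make the table more concrete, the sketch below shows how several of these options are typically wired up with `diffusers`, `accelerate`, `bitsandbytes`, and `torchvision`; it is only an illustration of the individual settings, not the full training loop from the notebook, and the total step count is a placeholder.

```python
# Illustrative wiring of the hyperparameters above (not the notebook's full training loop).
import bitsandbytes as bnb
from accelerate import Accelerator
from diffusers import UNet2DConditionModel
from diffusers.optimization import get_scheduler
from diffusers.training_utils import EMAModel
from torchvision import transforms

# mixed_precision="fp16", gradient_accumulation_steps=2
accelerator = Accelerator(mixed_precision="fp16", gradient_accumulation_steps=2)

unet = UNet2DConditionModel.from_pretrained(
    "stabilityai/stable-diffusion-2-1-base", subfolder="unet"
)
unet.enable_gradient_checkpointing()          # gradient_checkpointing=True

# use_8bit_adam=True, learning_rate=1e-05
optimizer = bnb.optim.AdamW8bit(unet.parameters(), lr=1e-5)

# lr_scheduler="constant", lr_warmup_steps=0 (10_000 is a placeholder total step count)
lr_scheduler = get_scheduler("constant", optimizer=optimizer,
                             num_warmup_steps=0, num_training_steps=10_000)

# use_ema=True: keep an exponential moving average of the UNet weights
ema_unet = EMAModel(unet.parameters(), decay=0.9999)

# resolution=512, center_crop=True, random_flip=True
train_transforms = transforms.Compose([
    transforms.Resize(512),
    transforms.CenterCrop(512),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize([0.5], [0.5]),
])

# Inside the training loop, gradients would be clipped to max_grad_norm=1
# via accelerator.clip_grad_norm_(unet.parameters(), 1.0) before optimizer.step().
```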
## Metrics

The fine-tuning process reached a final training loss of 0.0345, indicating stable convergence on the Ghibli dataset.
## Environment
The project was developed and tested in the following environment:
- Python Version: 3.11.11
- Dependencies:
| Library | Version |
|---|---|
| huggingface-hub | 0.30.2 |
| accelerate | 1.3.0 |
| bitsandbytes | 0.45.5 |
| torch | 2.5.1 |
| Pillow | 11.1.0 |
| numpy | 1.26.4 |
| transformers | 4.51.1 |
| torchvision | 0.20.1 |
| diffusers | 0.33.1 |
| gradio | Latest |
Ensure your environment matches these specifications to avoid compatibility issues.
## Demonstration
Explore the model’s capabilities through the interactive demo hosted at Ghibli Fine-Tuned SD 2.1. The demo allows users to generate Ghibli-style images effortlessly.
## Contact
For questions, issues, or to request the git-crypt decryption key, please contact the maintainer via the Issues tab on GitHub.
## License
This project is licensed under the MIT License, allowing for flexible use and modification while ensuring proper attribution.
## Acknowledgements
The success of this project is built upon the contributions of several key resources and communities:
- Hugging Face for providing the dataset and model hubs, enabling seamless access to high-quality resources.
- Stability AI for developing the Stable Diffusion model, a cornerstone of this project.
- The open-source community for their continuous support and contributions to the tools and libraries used.
## Contributing
Contributions to this project are warmly welcomed! To contribute, please follow these steps:
- Fork the repository from GitHub.
- Create a new branch for your feature or bug fix.
- Commit your changes with clear and descriptive commit messages.
- Push your branch and submit a pull request.
For detailed guidelines, refer to the CONTRIBUTING.md file. Your contributions can help enhance the project and bring the Ghibli art style to a wider audience.