Training Guide
This document provides instructions on how to run the Real-ESRGAN training script using the provided Python code. The script allows you to train a Real-ESRGAN model with customizable configurations through command-line arguments.
Prerequisites
- Python 3.10+ installed
- Real-ESRGAN repository cloned and dependencies installed
- A valid YAML configuration file for training
- Required Python packages: yaml (provided by PyYAML) and the other dependencies listed in the Real-ESRGAN repository; shutil is part of the Python standard library and needs no installation
Script Overview
The script (train.py) is a wrapper that runs the Real-ESRGAN training process by invoking the train.py script from the Real-ESRGAN repository. It accepts command-line arguments to customize training and moves the experiment output to a specified directory when training completes.
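The invocation described above could be sketched roughly as follows. This is not the wrapper's actual source: the helper names (build_command, build_env, run_training) are hypothetical, and the upstream entry point realesrgan/train.py with its -opt option is an assumption about the cloned repository layout.

```python
import os
import subprocess
import sys
from pathlib import Path

# Location of the cloned repository, relative to the wrapper script.
REAL_ESRGAN_DIR = Path("third_party/Real-ESRGAN")


def build_command(config, launcher="none", auto_resume=False):
    """Assemble the argument list for the upstream training script."""
    # Assumption: the upstream entry point is realesrgan/train.py and reads
    # its configuration through -opt. The config path is made absolute
    # because the subprocess runs from inside the repository directory.
    cmd = [
        sys.executable, "realesrgan/train.py",
        "-opt", str(Path(config).resolve()),
        "--launcher", launcher,
    ]
    if auto_resume:
        cmd.append("--auto_resume")
    return cmd


def build_env(repo_dir, base_env=None):
    """Copy the environment and prepend repo_dir to PYTHONPATH."""
    env = dict(os.environ if base_env is None else base_env)
    parts = [str(Path(repo_dir).resolve())]
    if env.get("PYTHONPATH"):
        parts.append(env["PYTHONPATH"])
    env["PYTHONPATH"] = os.pathsep.join(parts)
    return env


def run_training(config, repo_dir=REAL_ESRGAN_DIR, **options):
    """Run training; check=True raises CalledProcessError on a non-zero exit."""
    subprocess.run(
        build_command(config, **options),
        cwd=repo_dir,
        env=build_env(repo_dir),
        check=True,
    )
```

Passing an argument list rather than a shell string avoids quoting problems when paths contain spaces.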
Command-Line Arguments
The script accepts the following command-line arguments:
Argument | Type | Default | Description |
---|---|---|---|
--config | String | configs/Real-ESRGAN-Anime-finetuning.yml | Path to the configuration YAML file that defines training parameters. |
--launcher | Choice (none, pytorch, slurm) | none | Job launcher for distributed training. Use none for single-node training, pytorch for PyTorch distributed, or slurm for SLURM-based clusters. |
--auto_resume | Flag | False | If specified, automatically resumes training from the latest checkpoint. |
--debug | Flag | False | If specified, enables debug mode for additional logging or debugging information. |
--local_rank | Integer | 0 | Local rank for distributed training (used in multi-GPU setups). |
--force_yml | List of strings | None | Force updates to the YAML configuration file. Example: train:ema_decay=0.999 to override specific settings. |
--output_model_dir | String | ckpts | Path to move the experiment directory (containing model checkpoints and logs) after training completes. |
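As a rough illustration, the arguments in the table could be declared with argparse as shown below. This is a sketch reconstructed from the table, not the script's actual source; the function name build_parser is hypothetical.

```python
import argparse


def build_parser():
    """Declare the command-line arguments listed in the table above."""
    parser = argparse.ArgumentParser(description="Real-ESRGAN training wrapper (sketch)")
    parser.add_argument("--config", type=str,
                        default="configs/Real-ESRGAN-Anime-finetuning.yml",
                        help="Path to the configuration YAML file.")
    parser.add_argument("--launcher", choices=["none", "pytorch", "slurm"],
                        default="none", help="Job launcher for distributed training.")
    # Flags default to False and become True when present on the command line.
    parser.add_argument("--auto_resume", action="store_true",
                        help="Resume from the latest checkpoint.")
    parser.add_argument("--debug", action="store_true", help="Enable debug mode.")
    parser.add_argument("--local_rank", type=int, default=0,
                        help="Local rank for distributed training.")
    # nargs="+" collects one or more KEY=VALUE strings into a list.
    parser.add_argument("--force_yml", nargs="+", default=None,
                        help="Overrides such as train:ema_decay=0.999.")
    parser.add_argument("--output_model_dir", type=str, default="ckpts",
                        help="Where the experiment directory is moved after training.")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```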
Usage Instructions
Prepare the Environment:
- Ensure the Real-ESRGAN repository is available in the third_party/Real-ESRGAN directory relative to the script.
- Install required dependencies (refer to the Real-ESRGAN repository's documentation).
Create or Modify a Configuration File:
- Prepare a YAML configuration file (e.g., Real-ESRGAN-Anime-finetuning.yml) specifying training parameters like dataset paths, model architecture, and hyperparameters.
- The configuration file must include a name field to identify the experiment.
Run the Training Script: Use the following command to start training with default settings:
python train.py
To customize the training, use the command-line arguments. Examples:
- Train with a specific configuration file and enable auto-resume:
python train.py --config configs/my_config.yml --auto_resume
- Override YAML settings and specify an output directory:
python train.py --config configs/my_config.yml --force_yml train:ema_decay=0.999 --output_model_dir "</path/to/output>"
Output:
- The script runs the Real-ESRGAN training process using the specified configuration.
- After training, the experiment directory (named after the name field in the YAML file) is moved from third_party/Real-ESRGAN/experiments/ to the directory specified by --output_model_dir.
- If the source experiment directory does not exist, a warning is printed.
- Errors during training or directory moving are caught and reported, with the script exiting on failure.
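The move step described above might look like the following sketch. The function name move_experiment is hypothetical, and placing the result at <output_model_dir>/<name> is an assumption; the actual script may lay out the destination differently.

```python
import shutil
import sys
from pathlib import Path


def move_experiment(name, output_model_dir,
                    experiments_root="third_party/Real-ESRGAN/experiments"):
    """Move experiments_root/name into output_model_dir and return the new path.

    Returns None and prints a warning when the source directory is missing.
    """
    source = Path(experiments_root) / name
    if not source.exists():
        print(f"Warning: source directory {source} does not exist", file=sys.stderr)
        return None

    destination_root = Path(output_model_dir)
    destination_root.mkdir(parents=True, exist_ok=True)

    target = destination_root / name
    if target.exists():
        # shutil.move would nest the source inside an existing directory,
        # so refuse rather than produce a surprising layout.
        raise FileExistsError(f"Destination already exists: {target}")

    shutil.move(str(source), str(target))
    return target
```

Calling move_experiment("my_experiment", "ckpts") would leave the checkpoints and logs under ckpts/my_experiment, where my_experiment stands for the name field of the YAML file.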
Example Workflow
To train a Real-ESRGAN model for anime-style image upscaling:
- Ensure the Real-ESRGAN-Anime-finetuning.yml file is configured with the correct dataset paths and model settings.
- Run the following command:
python train.py --config configs/Real-ESRGAN-Anime-finetuning.yml --output_model_dir models/anime_model --auto_resume
- The training process will start, and upon completion, the experiment directory will be moved to models/anime_model.
Notes
- Ensure the YAML configuration file exists at the specified path, or the script will raise a FileNotFoundError.
- The PYTHONPATH environment variable is modified to include the Real-ESRGAN directory for proper module resolution.
- If using distributed training (--launcher pytorch or --launcher slurm), ensure the environment is set up for multi-GPU or cluster-based training.
- The --force_yml argument allows dynamic updates to the YAML configuration without modifying the file directly.
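To illustrate what a --force_yml entry such as train:ema_decay=0.999 means, the sketch below applies overrides of that form to an already loaded configuration dictionary. It is an illustration of the KEY:SUBKEY=VALUE convention only, not the code Real-ESRGAN uses; parse_value and apply_overrides are hypothetical helpers, and refusing to create missing parent sections is an assumption.

```python
import ast


def parse_value(text):
    """Convert the right-hand side to a Python value, else keep the string."""
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    if lowered in ("none", "null", "~"):
        return None
    try:
        # Handles numbers, quoted strings, lists, and similar literals.
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text


def apply_overrides(options, overrides):
    """Apply entries like 'train:ema_decay=0.999' to a nested dict in place."""
    for entry in overrides or []:
        keys, sep, raw = entry.partition("=")
        if not sep:
            raise ValueError(f"Expected KEY=VALUE, got: {entry!r}")
        *parents, leaf = [key.strip() for key in keys.split(":")]
        node = options
        for key in parents:
            node = node[key]  # KeyError if a parent section is missing
        node[leaf] = parse_value(raw.strip())
    return options


if __name__ == "__main__":
    config = {"name": "demo", "train": {"ema_decay": 0.9}}
    print(apply_overrides(config, ["train:ema_decay=0.999"]))
```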
Troubleshooting
- Error: Configuration file not found:
- Verify the --config path is correct and the file exists.
- Error: Training failed:
- Check the Real-ESRGAN repository's documentation for troubleshooting training issues.
- Ensure all dependencies are installed and compatible.
- Warning: Source directory does not exist:
- Confirm that the experiment name in the YAML file matches the expected directory in third_party/Real-ESRGAN/experiments/.
- Permission errors when moving directories:
- Ensure the script has write permissions for the --output_model_dir path.
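Two of the problems above (a missing configuration file and an unwritable output directory) can be detected before a long training run starts. The following is a minimal sketch of such a check; the preflight function is hypothetical and is not part of the training script.

```python
import os
from pathlib import Path


def preflight(config, output_model_dir):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []

    if not Path(config).is_file():
        problems.append(f"Configuration file not found: {config}")

    # The output directory may not exist yet, so test the nearest
    # existing ancestor, which is where it would have to be created.
    probe = Path(output_model_dir).resolve()
    while not probe.exists():
        probe = probe.parent
    if not os.access(probe, os.W_OK):
        problems.append(f"No write permission for: {probe}")

    return problems


if __name__ == "__main__":
    for problem in preflight("configs/Real-ESRGAN-Anime-finetuning.yml", "ckpts"):
        print(problem)
```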
For further details on Real-ESRGAN, refer to the official repository documentation.