	# Training Guide

	This document provides instructions on how to run the Real-ESRGAN training script using the provided Python code. The script allows you to train a Real-ESRGAN model with customizable configurations through command-line arguments.

	## Prerequisites
	- Python 3.10+ installed
	- Real-ESRGAN repository cloned and dependencies installed
	- A valid YAML configuration file for training
	- Required Python packages: PyYAML (imported as `yaml`) and the other dependencies listed in the Real-ESRGAN repository. `shutil` is part of the Python standard library and needs no installation.

	## Script Overview
	The script (`train.py`) is a wrapper: it runs the Real-ESRGAN training process by invoking the `train.py` script from the Real-ESRGAN repository. It accepts command-line arguments that customize training and, once training finishes, moves the experiment output to a specified directory.
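A minimal sketch of how such a wrapper might assemble the call is shown below. It is not the script's actual code: the helper name `build_training_call` is hypothetical, and the entry point `realesrgan/train.py` and the `-opt` flag are assumptions based on the upstream Real-ESRGAN layout, so verify both against your checkout.

```python
import os
import sys
from pathlib import Path

# Location of the Real-ESRGAN checkout, as described in the Usage Instructions.
REAL_ESRGAN_DIR = Path("third_party/Real-ESRGAN")


def build_training_call(config_path, extra_args=(), base_env=None):
    """Return (command, env) for launching the Real-ESRGAN training script."""
    env = dict(os.environ if base_env is None else base_env)
    # Prepend the repository to PYTHONPATH so its modules can be imported.
    existing = env.get("PYTHONPATH", "")
    env["PYTHONPATH"] = (
        f"{REAL_ESRGAN_DIR}{os.pathsep}{existing}" if existing else str(REAL_ESRGAN_DIR)
    )
    # Assumed entry point and option flag; check them in your checkout.
    entry = REAL_ESRGAN_DIR / "realesrgan" / "train.py"
    command = [sys.executable, str(entry), "-opt", str(config_path), *extra_args]
    return command, env


command, env = build_training_call("configs/my_config.yml", ["--auto_resume"])
print(command)
# A wrapper could then launch training with, for example:
#   subprocess.run(command, env=env, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when paths contain spaces.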

	## Command-Line Arguments
	The script accepts the following command-line arguments:

	| Argument | Type | Default | Description |
	|----------|------|---------|-------------|
	| `--config` | String | `configs/Real-ESRGAN-Anime-finetuning.yml` | Path to the YAML configuration file that defines the training parameters. |
	| `--launcher` | Choice (`none`, `pytorch`, `slurm`) | `none` | Job launcher for distributed training. Use `none` for non-distributed training, `pytorch` for PyTorch distributed training, or `slurm` for SLURM-based clusters. |
	| `--auto_resume` | Flag | `False` | If specified, automatically resumes training from the latest checkpoint. |
	| `--debug` | Flag | `False` | If specified, enables debug mode, which produces additional logging. |
	| `--local_rank` | Integer | `0` | Local rank of the process in distributed training (used in multi-GPU setups). |
	| `--force_yml` | List of strings | `None` | Overrides values from the YAML configuration at run time. Example: `train:ema_decay=0.999`. |
	| `--output_model_dir` | String | `ckpts` | Directory to which the experiment directory (containing model checkpoints and logs) is moved after training completes. |
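The table above could correspond to an `argparse` definition along the following lines. This is a reconstruction from the documented names, types, and defaults, not a copy of the script; the function name `build_parser` and the use of `nargs="+"` for `--force_yml` are assumptions.

```python
import argparse


def build_parser():
    """Build a parser matching the documented arguments and defaults."""
    parser = argparse.ArgumentParser(description="Real-ESRGAN training wrapper (sketch)")
    parser.add_argument("--config", type=str,
                        default="configs/Real-ESRGAN-Anime-finetuning.yml")
    parser.add_argument("--launcher", choices=["none", "pytorch", "slurm"],
                        default="none")
    # Flags default to False and become True when present on the command line.
    parser.add_argument("--auto_resume", action="store_true")
    parser.add_argument("--debug", action="store_true")
    parser.add_argument("--local_rank", type=int, default=0)
    # Assumed: one or more "key:subkey=value" strings.
    parser.add_argument("--force_yml", nargs="+", default=None)
    parser.add_argument("--output_model_dir", type=str, default="ckpts")
    return parser


args = build_parser().parse_args(
    ["--auto_resume", "--force_yml", "train:ema_decay=0.999"]
)
print(args.config, args.auto_resume, args.force_yml)
```

Because a list-valued option consumes every following token that does not look like a flag, it is safest to place `--force_yml` last on the command line.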

	## Usage Instructions
	1. Prepare the Environment:
	- Ensure the Real-ESRGAN repository is available in the `third_party/Real-ESRGAN` directory relative to the script.
	- Install required dependencies (refer to the Real-ESRGAN repository's documentation).

	2. Create or Modify a Configuration File:
	- Prepare a YAML configuration file (e.g., `Real-ESRGAN-Anime-finetuning.yml`) specifying training parameters like dataset paths, model architecture, and hyperparameters.
	- The configuration file must include a `name` field to identify the experiment.

	3. Run the Training Script:
	Use the following command to start training with default settings:
	```bash
	python train.py
	```

	To customize the training, use the command-line arguments. Examples:
	- Train with a specific configuration file and enable auto-resume:
	```bash
	python train.py --config configs/my_config.yml --auto_resume
	```
	- Override a YAML setting and specify an output directory:
	```bash
	python train.py --config configs/my_config.yml \
	    --output_model_dir /path/to/output \
	    --force_yml train:ema_decay=0.999
	```

	4. Output:
	- The script runs the Real-ESRGAN training process using the specified configuration.
	- After training, the experiment directory (named after the `name` field in the YAML file) is moved from `third_party/Real-ESRGAN/experiments/` to the directory specified by `--output_model_dir`.
	- If the source experiment directory does not exist, a warning is printed.
	- Errors during training or directory moving are caught and reported, with the script exiting on failure.
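The post-training move described in step 4 can be sketched as follows. The helper `move_experiment` is hypothetical, and it assumes the experiment directory is placed inside `--output_model_dir` under its own name; the real script may instead rename the directory to the output path.

```python
import shutil
import sys
import tempfile
from pathlib import Path


def move_experiment(experiment_name, experiments_root, output_model_dir):
    """Move <experiments_root>/<experiment_name> into output_model_dir.

    Returns the destination path, or None when there is nothing to move.
    """
    source = Path(experiments_root) / experiment_name
    if not source.is_dir():
        print(f"Warning: source directory {source} does not exist", file=sys.stderr)
        return None
    destination_root = Path(output_model_dir)
    destination_root.mkdir(parents=True, exist_ok=True)
    destination = destination_root / experiment_name
    # shutil.move raises an error if the destination already exists.
    shutil.move(str(source), str(destination))
    return destination


# Demonstration in a throwaway directory, so no real experiment is touched.
with tempfile.TemporaryDirectory() as tmp:
    experiments = Path(tmp) / "experiments"
    (experiments / "demo_run").mkdir(parents=True)
    moved_to = move_experiment("demo_run", experiments, Path(tmp) / "ckpts")
    print(moved_to.name)
```

In the real workflow, `experiment_name` would come from the `name` field of the YAML file and `experiments_root` would be `third_party/Real-ESRGAN/experiments/`.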

	## Example Workflow
	To train a Real-ESRGAN model for anime-style image upscaling:
	1. Ensure the `Real-ESRGAN-Anime-finetuning.yml` file is configured with the correct dataset paths and model settings.
	2. Run the following command:
	```bash
	python train.py --config configs/Real-ESRGAN-Anime-finetuning.yml --output_model_dir models/anime_model --auto_resume
	```
	3. The training process will start, and upon completion, the experiment directory will be moved to `models/anime_model`.

	## Notes
	- Ensure the YAML configuration file exists at the specified path, or the script will raise a `FileNotFoundError`.
	- The `PYTHONPATH` environment variable is modified to include the Real-ESRGAN directory for proper module resolution.
	- If using distributed training (`--launcher pytorch` or `--launcher slurm`), ensure the environment is set up for multi-GPU or cluster-based training.
	- The `--force_yml` argument allows dynamic updates to the YAML configuration without modifying the file directly.
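To illustrate what a `--force_yml` entry such as `train:ema_decay=0.999` means, the sketch below applies one override to a nested dictionary standing in for the loaded YAML. It is an illustration only, not the parsing code used by Real-ESRGAN: the helpers `parse_value` and `apply_override` are hypothetical, and unlike a strict implementation this version creates missing intermediate keys.

```python
import ast


def parse_value(text):
    """Interpret numbers, booleans, and lists; fall back to the raw string."""
    stripped = text.strip()
    if stripped.lower() in ("true", "false"):
        return stripped.lower() == "true"
    try:
        return ast.literal_eval(stripped)
    except (ValueError, SyntaxError):
        return stripped


def apply_override(options, entry):
    """Apply one 'a:b:c=value' override to a nested dict in place."""
    key_path, _, raw_value = entry.partition("=")
    keys = [key.strip() for key in key_path.split(":")]
    node = options
    # Walk down to the parent of the final key, creating levels if needed.
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = parse_value(raw_value)
    return options


options = {"name": "demo", "train": {"ema_decay": 0.9, "total_iter": 1000}}
apply_override(options, "train:ema_decay=0.999")
print(options["train"]["ema_decay"])  # 0.999
```

The colon-separated part names the path through the configuration, and everything after the first `=` is the new value.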

	## Troubleshooting
	- Error: Configuration file not found:
	  - Verify the `--config` path is correct and the file exists.
	- Error: Training failed:
	  - Check the Real-ESRGAN repository's documentation for troubleshooting training issues.
	  - Ensure all dependencies are installed and compatible.
	- Warning: Source directory does not exist:
	  - Confirm that the experiment name in the YAML file matches the expected directory in `third_party/Real-ESRGAN/experiments/`.
	- Permission errors when moving directories:
	  - Ensure the script has write permissions for the `--output_model_dir` path.
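Several of the checks above can be run before starting a long training job. The following is a hedged sketch of such a pre-flight check; the function `preflight_problems` is hypothetical and not part of the training script.

```python
import os
from pathlib import Path


def preflight_problems(config_path, repo_dir, output_model_dir):
    """Return a list of readable problems; an empty list means none were found."""
    problems = []
    if not Path(config_path).is_file():
        problems.append(f"configuration file not found: {config_path}")
    if not Path(repo_dir).is_dir():
        problems.append(f"Real-ESRGAN directory not found: {repo_dir}")
    # The output directory may not exist yet, so test write access on its
    # nearest existing ancestor instead.
    target = Path(output_model_dir).resolve()
    while not target.exists() and target != target.parent:
        target = target.parent
    if not os.access(target, os.W_OK):
        problems.append(f"no write permission for: {target}")
    return problems


for problem in preflight_problems(
    "configs/my_config.yml", "third_party/Real-ESRGAN", "ckpts"
):
    print("Problem:", problem)
```

An empty result does not guarantee that training will succeed; it only rules out the path and permission issues listed in this section.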

	For further details on Real-ESRGAN, refer to the official repository documentation.