	# Latent Consistency Distillation Example

	[Latent Consistency Models (LCMs)](https://arxiv.org/abs/2310.04378) are obtained by distilling a latent diffusion model so that it can run inference in only a few steps. This example demonstrates how to use latent consistency distillation to distill stable-diffusion-v1.5 for few-step inference.

	## Full model distillation

	### Running locally with PyTorch

	#### Installing the dependencies

	Before running the scripts, make sure to install the library's training dependencies.

	**Important**

	The example scripts are updated frequently and have some example-specific requirements, so we highly recommend installing from source and keeping the install up to date. To do this, run the following steps in a new virtual environment:
	```bash
	git clone https://github.com/huggingface/diffusers
	cd diffusers
	pip install -e .
	```

	Then `cd` into the example folder and run:
	```bash
	pip install -r requirements.txt
	```

	Then initialize a [🤗 Accelerate](https://github.com/huggingface/accelerate/) environment with:

	```bash
	accelerate config
	```

	Or, for a default Accelerate configuration without answering questions about your environment:

	```bash
	accelerate config default
	```

	Or, if your environment doesn't support an interactive shell (e.g., a notebook):

	```python
	from accelerate.utils import write_basic_config
	write_basic_config()
	```

	When running `accelerate config`, setting torch compile mode to True can yield dramatic speedups.


	#### Example

	The following uses the [Conceptual Captions 12M (CC12M) dataset](https://github.com/google-research-datasets/conceptual-12m) for illustrative purposes only. For best results, consider large, high-quality text-image datasets such as [LAION](https://laion.ai/blog/laion-400-open-dataset/). You may also need to search the hyperparameter space for the dataset you use.

	```bash
	export MODEL_NAME="stable-diffusion-v1-5/stable-diffusion-v1-5"
	export OUTPUT_DIR="path/to/saved/model"

	accelerate launch train_lcm_distill_sd_wds.py \
	--pretrained_teacher_model=$MODEL_NAME \
	--output_dir=$OUTPUT_DIR \
	--mixed_precision=fp16 \
	--resolution=512 \
	--learning_rate=1e-6 --loss_type="huber" --ema_decay=0.95 --adam_weight_decay=0.0 \
	--max_train_steps=1000 \
	--max_train_samples=4000000 \
	--dataloader_num_workers=8 \
	--train_shards_path_or_url="pipe:curl -L -s https://huggingface.co/datasets/laion/conceptual-captions-12m-webdataset/resolve/main/data/{00000..01099}.tar?download=true" \
	--validation_steps=200 \
	--checkpointing_steps=200 --checkpoints_total_limit=10 \
	--train_batch_size=12 \
	--gradient_checkpointing --enable_xformers_memory_efficient_attention \
	--gradient_accumulation_steps=1 \
	--use_8bit_adam \
	--resume_from_checkpoint=latest \
	--report_to=wandb \
	--seed=453645634 \
	--push_to_hub
	```
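
The `--ema_decay=0.95` flag in the command above suggests that a slowly moving copy of the student weights is maintained during distillation. Below is a minimal sketch of such an exponential moving average update, assuming the conventional form `target = decay * target + (1 - decay) * online`. The function name `ema_update` and the toy weights are hypothetical, and the training script's own implementation may differ.

```python
def ema_update(target, online, decay):
    """Return new target weights: decay * target + (1 - decay) * online."""
    return [decay * t + (1.0 - decay) * o for t, o in zip(target, online)]


# Toy setup (hypothetical): the online weight stays at 1.0, the target starts at 0.0.
target = [0.0]
online = [1.0]

for _ in range(10):
    target = ema_update(target, online, decay=0.95)

# With a fixed online weight, each update shrinks the remaining gap by the
# decay factor, so after n updates the gap equals decay ** n.
gap = online[0] - target[0]
print(f"gap after 10 updates: {gap:.4f}")
```

A decay closer to 1.0 makes the target follow the online weights more slowly, which is one of the hyperparameters you may want to revisit for your own dataset.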

	## LCM-LoRA

	Instead of fine-tuning the full model, we can also train only a LoRA that can be injected into any model sharing the teacher's architecture (Stable Diffusion v1.5 in the example below).
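
To make the idea of an injectable LoRA concrete, here is a small pure-Python sketch of the general low-rank update `W + scale * (B @ A)`. It is illustrative only: the helper names, the toy matrices, and the 1280 x 1280 layer size are assumptions, not values taken from the training script.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(weight, lora_b, lora_a, scale=1.0):
    """Return weight + scale * (lora_b @ lora_a) without modifying weight."""
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


def lora_param_count(d_in, d_out, rank):
    """Parameters in the factors A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


# Toy rank-1 adapter on a 2 x 2 base weight (hypothetical values).
base = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[1.0], [2.0]]   # d_out x rank
lora_a = [[3.0, 4.0]]     # rank x d_in
merged = apply_lora(base, lora_b, lora_a)

# Parameter cost of a rank-64 adapter on a hypothetical 1280 x 1280 linear layer.
adapter_params = lora_param_count(1280, 1280, 64)
full_params = 1280 * 1280
print(merged, adapter_params, full_params)
```

Because the base weight is left untouched and only the small factors are trained, the same adapter can be added to other checkpoints whose layers have matching shapes.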

	### Example

	The following uses the [Conceptual Captions 12M (CC12M) dataset](https://github.com/google-research-datasets/conceptual-12m) as an example. For best results, consider large, high-quality text-image datasets such as [LAION](https://laion.ai/blog/laion-400-open-dataset/).

	```bash
	export MODEL_NAME="stable-diffusion-v1-5/stable-diffusion-v1-5"
	export OUTPUT_DIR="path/to/saved/model"

	accelerate launch train_lcm_distill_lora_sd_wds.py \
	--pretrained_teacher_model=$MODEL_NAME \
	--output_dir=$OUTPUT_DIR \
	--mixed_precision=fp16 \
	--resolution=512 \
	--lora_rank=64 \
	--learning_rate=1e-4 --loss_type="huber" --adam_weight_decay=0.0 \
	--max_train_steps=1000 \
	--max_train_samples=4000000 \
	--dataloader_num_workers=8 \
	--train_shards_path_or_url="pipe:curl -L -s https://huggingface.co/datasets/laion/conceptual-captions-12m-webdataset/resolve/main/data/{00000..01099}.tar?download=true" \
	--validation_steps=200 \
	--checkpointing_steps=200 --checkpoints_total_limit=10 \
	--train_batch_size=12 \
	--gradient_checkpointing --enable_xformers_memory_efficient_attention \
	--gradient_accumulation_steps=1 \
	--use_8bit_adam \
	--resume_from_checkpoint=latest \
	--report_to=wandb \
	--seed=453645634 \
	--push_to_hub
	```
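
Both commands pass a shard pattern containing `{00000..01099}` and train for 1000 steps with a batch size of 12. The sketch below uses only the standard library to show how many shard URLs that range covers and how many samples each process consumes under those settings. The helper `expand_numeric_range` is hypothetical and is not necessarily how the training script expands the pattern.

```python
import re


def expand_numeric_range(pattern):
    """Expand one zero-padded {START..END} numeric range found in pattern."""
    match = re.search(r"\{(\d+)\.\.(\d+)\}", pattern)
    if match is None:
        return [pattern]
    start, end = match.group(1), match.group(2)
    width = len(start)  # keep the zero padding of the original pattern
    prefix, suffix = pattern[: match.start()], pattern[match.end():]
    return [
        prefix + str(i).zfill(width) + suffix
        for i in range(int(start), int(end) + 1)
    ]


pattern = (
    "https://huggingface.co/datasets/laion/conceptual-captions-12m-webdataset"
    "/resolve/main/data/{00000..01099}.tar?download=true"
)
shards = expand_numeric_range(pattern)

# Values copied from the example commands.
max_train_steps = 1000
train_batch_size = 12
gradient_accumulation_steps = 1

# Samples consumed by a single process over the whole run.
samples_per_process = max_train_steps * train_batch_size * gradient_accumulation_steps

print(len(shards), shards[0], samples_per_process)
```

With multiple processes, the total number of samples seen scales with the process count, so a short run like this one touches only a small part of the dataset.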