HiDream_HQ-models

Guide (External Site): English | Japanese

This repository collects high-quality models for generating images with HiDream.

HiDream-I1-Dev sample
i2i Anime Illustration

Left: HiDream-I1-Dev | Right: i2i from SDXL Anime Illustration

Model Details

  • Transformers (HiDream-I1-Full, Dev, Fast, E1-Full, E1-1): 17-billion-parameter models built on the Rectified Flow Transformer architecture, optimized for photorealistic, cartoon, and artistic image generation.
  • Text Encoders: Process input text prompts to guide image generation. Includes Llama-3.1, flan-t5-xxl, and CLIP models.
  • VAE: The Variational Autoencoder from FLUX.1 [schnell] for encoding and decoding images.

Model List

| Model Name | License | Type | Commercial Use |
| --- | --- | --- | --- |
| HiDream-I1-Full-FP16 | MIT | Transformer | ✅ |
| HiDream-I1-Dev-BF16 | MIT | Transformer | ✅ |
| HiDream-I1-Fast-BF16 | MIT | Transformer | ✅ |
| HiDream-E1-Full-BF16 | MIT | Transformer | ✅ |
| HiDream-E1-1-BF16 | MIT | Transformer | ✅ |
| Llama-3.1-8B-Instruct-BF16 | Llama 3.1 Community License | Text encoder | ✅ |
| flan-t5-xxl_TE-only_FP32 | Apache 2.0 | Text encoder | ✅ |
| CLIP-ViT-bigG-14-laion2B-39B-b160k-FP32 | MIT | Text encoder | ✅ |
| CLIP-SAE-ViT-L-14-FP32 | MIT | Text encoder | ✅ |
| FLUX1-schnell-AE-FP32 | Apache 2.0 | VAE | ✅ |
  • The Llama-3.1-8B-Instruct-BF16 model is Meta's original Llama-3.1-8B-Instruct, recombined from its split distribution and adapted for use in HiDream pipelines.

Light-weight models

If you are looking for a lightweight model, please refer to the following links.

HiDream Models

Text Encoder

Usage Instructions

Place the downloaded model files in the following directories.

To use the FP32-format text encoders, start ComfyUI with the --fp32-text-enc flag.

Transformers

  • Models

    • HiDream-I1-Full-FP16
    • HiDream-I1-Dev-BF16
    • HiDream-I1-Fast-BF16
    • HiDream-E1-Full-BF16
    • HiDream-E1-1-BF16
  • Folder

    • models/StableDiffusion

Text encoders

  • Models

    • Llama-3.1-8B-Instruct-BF16
    • flan-t5-xxl_TE-only_FP32
    • CLIP-ViT-bigG-14-laion2B-39B-b160k-FP32
    • CLIP-SAE-ViT-L-14-FP32
  • Folder

    • models/text_encoder

VAE

  • Models

    • FLUX1-schnell-AE-FP32
  • Folder

    • models/vae
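
The placement steps above can be sketched in a few shell commands. This is a minimal sketch, assuming the commands run from the ComfyUI root directory; the `.safetensors` filenames are illustrative assumptions, so adjust them to match the files you actually downloaded.

```shell
# Create the model directories this repository uses (relative to the ComfyUI root).
mkdir -p models/StableDiffusion models/text_encoder models/vae

# Move the downloaded files into place (extension .safetensors is an assumption):
# mv HiDream-I1-Dev-BF16.safetensors        models/StableDiffusion/
# mv Llama-3.1-8B-Instruct-BF16.safetensors models/text_encoder/
# mv FLUX1-schnell-AE-FP32.safetensors      models/vae/

# Launch ComfyUI with FP32 text-encoder support enabled:
# python main.py --fp32-text-enc
```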

ComfyUI Workflow

In ComfyUI, the FP16/BF16 HiDream-I1 models need at least 10.1 GB of VRAM for the transformer when its weights are split between VRAM and system RAM.

With 12 GB of VRAM and 64 GB of system RAM, the following workflows achieve the highest image quality.

HiDream-I1-Full

HiDream-I1-Full workflow

HiDream-I1-Dev

HiDream-I1-Dev workflow

HiDream-I1-Fast

HiDream-I1-Fast workflow

HiDream-E1-Full

HiDream-E1-Full workflow

HiDream-E1-1

HiDream-E1-1 workflow

CLIP Loader

To load the text encoders, use the QuadrupleCLIPLoaderMultiGPU custom node from ComfyUI-MultiGPU. Explicitly set device: cpu so the text encoders load into system RAM, leaving VRAM free for the transformer.

QuadrupleCLIPLoaderMultiGPU custom node
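
For orientation, here is a hedged sketch of how this node might appear in an API-format workflow. The input key names (`clip_name1`–`clip_name4`, `device`) and the `.safetensors` filenames are assumptions for illustration, not verified against the node's source.

```python
import json

# Hypothetical API-format entry for the QuadrupleCLIPLoaderMultiGPU node.
# Input keys and filenames are assumptions; check the node in the ComfyUI
# graph editor for the actual field names.
clip_loader = {
    "class_type": "QuadrupleCLIPLoaderMultiGPU",
    "inputs": {
        "clip_name1": "Llama-3.1-8B-Instruct-BF16.safetensors",
        "clip_name2": "flan-t5-xxl_TE-only_FP32.safetensors",
        "clip_name3": "CLIP-ViT-bigG-14-laion2B-39B-b160k-FP32.safetensors",
        "clip_name4": "CLIP-SAE-ViT-L-14-FP32.safetensors",
        # device: cpu keeps all four text encoders in system RAM,
        # reserving VRAM for the 17B transformer.
        "device": "cpu",
    },
}

print(json.dumps(clip_loader, indent=2))
```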

Sampler and Scheduler

In my environment, the heun and heunpp2 samplers combined with the beta scheduler produced high-quality images.

In addition, because the text encoders run at high precision, the step count can be reduced below the official recommendations for faster generation.

| Configuration | Official Recommended Steps | Sample Workflow Steps |
| --- | --- | --- |
| HiDream-I1-Full-FP16 | 50 | 20 |
| HiDream-I1-Dev-BF16 | 28 | 12 |
| HiDream-I1-Fast-BF16 | 16 | 6 |
| HiDream-E1-Full-BF16 | 28 | 12 |
| HiDream-E1-1-BF16 | 28 | 12 |
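
As a quick arithmetic check on the step counts above, the sample workflows cut the official recommendations by roughly 2.3–2.7×:

```python
# Step counts copied from the table above (official recommended vs. sample workflow).
official = {"I1-Full": 50, "I1-Dev": 28, "I1-Fast": 16, "E1-Full": 28, "E1-1": 28}
sample = {"I1-Full": 20, "I1-Dev": 12, "I1-Fast": 6, "E1-Full": 12, "E1-1": 12}

for name in official:
    ratio = official[name] / sample[name]
    print(f"{name}: {official[name]} -> {sample[name]} steps ({ratio:.2f}x fewer)")
```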

License

This repository inherits the licenses of the original models:

  • MIT License: Include the license text in your distribution.
  • Apache 2.0 License: Include the license and provide notices.
  • Llama 3.1 Community License: Include the statement "Built with Meta Llama 3" and adhere to the usage policy.

See the Model List for links to the full license texts.
