metadata
title: PuLID based FLUX FaceID
emoji: 🤗
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 5.35.0
app_file: app.py
pinned: false
license: apache-2.0

PuLID for FLUX: Portrait-Guided Image Generation

This code implements PuLID (Pure and Lightning ID customization) for FLUX.1-dev. It generates personalized images from a text prompt, using an ID (identity) image as guidance, and combines FLUX diffusion models with identity preservation.

Key Features

1. Identity-Guided Generation

  • Upload an ID image (portrait photo) to guide the generation process
  • Control identity strength with adjustable ID weight (0.0-3.0)
  • Preserve facial features while applying various artistic styles
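One way to picture the ID weight is as a multiplier on an identity correction that is added to the image features. The sketch below assumes that additive form. `apply_id_weight` and the toy vectors are hypothetical illustrations, not code from this Space; only the 0.0-3.0 range comes from the list above.

```python
from typing import List

# Slider range stated in this README.
ID_WEIGHT_MIN, ID_WEIGHT_MAX = 0.0, 3.0


def apply_id_weight(
    features: List[float], id_correction: List[float], id_weight: float
) -> List[float]:
    """Return features + id_weight * id_correction (assumed additive form)."""
    if not ID_WEIGHT_MIN <= id_weight <= ID_WEIGHT_MAX:
        raise ValueError(f"id_weight={id_weight} is outside [0.0, 3.0]")
    if len(features) != len(id_correction):
        raise ValueError("features and id_correction must have the same length")
    return [f + id_weight * c for f, c in zip(features, id_correction)]


# Toy values, chosen only to make the arithmetic easy to follow.
features = [0.5, -1.0, 2.0]
id_correction = [0.25, 0.5, -0.5]

unchanged = apply_id_weight(features, id_correction, 0.0)  # identity ignored
stronger = apply_id_weight(features, id_correction, 2.0)   # identity emphasized
```

Under this assumption a weight of 0.0 leaves the features untouched, and larger weights push the result further toward the reference identity.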

2. Advanced Configuration Options

  • Resolution Control: Adjustable width (256-1536px) and height (256-1536px)
  • Generation Steps: 1-20 steps; more steps trade speed for quality
  • Guidance Scale: Fine-tune adherence to prompts (1.0-10.0)
  • Seed Control: Reproducible results with manual seed input
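The ranges above can be checked before a generation request is sent. The following is a minimal sketch: `GenerationSettings`, `resolve_seed`, `sample_noise`, and the default values are hypothetical names and numbers chosen for illustration, and only the ranges are taken from this README. It also shows why a manual seed gives reproducible results, since a seeded random generator always produces the same sequence.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


def _check_range(name: str, value: float, low: float, high: float) -> None:
    """Raise ValueError when value lies outside the inclusive range."""
    if not low <= value <= high:
        raise ValueError(f"{name}={value} is outside [{low}, {high}]")


@dataclass(frozen=True)
class GenerationSettings:
    # Defaults are illustrative assumptions, not the Space's actual defaults.
    width: int = 1024
    height: int = 1024
    num_steps: int = 20
    guidance: float = 4.0
    seed: Optional[int] = None  # None means "pick a random seed"

    def __post_init__(self) -> None:
        _check_range("width", self.width, 256, 1536)
        _check_range("height", self.height, 256, 1536)
        _check_range("num_steps", self.num_steps, 1, 20)
        _check_range("guidance", self.guidance, 1.0, 10.0)


def resolve_seed(seed: Optional[int]) -> int:
    """Use the manual seed if given, otherwise draw a fresh one."""
    return seed if seed is not None else random.randrange(2**32)


def sample_noise(seed: int, count: int) -> List[float]:
    """Draw Gaussian noise from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(count)]


settings = GenerationSettings(width=768, height=1024, num_steps=12, seed=42)
seed = resolve_seed(settings.seed)
# The same seed yields the same starting noise on every run.
noise_a = sample_noise(seed, 4)
noise_b = sample_noise(seed, 4)
```

A real pipeline would seed its tensor library's generator rather than Python's `random`, but the principle is the same.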

3. True CFG (Classifier-Free Guidance)

  • Fake CFG mode (scale=1): Faster generation with basic guidance
  • True CFG mode (scale>1): Enhanced quality with negative prompt support
  • Configurable timestep for CFG activation
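The two modes can be illustrated with the standard classifier-free guidance formula, which mixes a prediction for the prompt with a prediction for the negative prompt. This is a sketch under assumptions: `combine_true_cfg`, `uses_true_cfg`, and `start_step` are hypothetical names, and the Space's own implementation may differ in detail.

```python
from typing import List


def combine_true_cfg(
    cond: List[float], uncond: List[float], true_cfg_scale: float
) -> List[float]:
    """Standard CFG mix: uncond + scale * (cond - uncond)."""
    return [u + true_cfg_scale * (c - u) for c, u in zip(cond, uncond)]


def uses_true_cfg(step_index: int, true_cfg_scale: float, start_step: int) -> bool:
    """True CFG applies only when scale > 1 and the start step has been reached."""
    return true_cfg_scale > 1.0 and step_index >= start_step


# Toy predictions for the prompt (cond) and the negative prompt (uncond).
cond = [1.0, 2.0]
uncond = [0.5, 1.0]

same_as_cond = combine_true_cfg(cond, uncond, 1.0)  # scale = 1 returns cond
pushed_away = combine_true_cfg(cond, uncond, 3.0)   # scale > 1 extrapolates

# With start_step = 2, only steps 2 and 3 of a 4-step run use true CFG.
schedule = [uses_true_cfg(i, 3.0, 2) for i in range(4)]
```

With a scale of 1 the formula returns the conditional prediction unchanged, so the negative-prompt pass can be skipped. That is consistent with the faster mode described above.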

4. Technical Architecture

  • Built on FLUX.1-dev diffusion model
  • Utilizes T5 text encoder for prompt understanding
  • CLIP model for image-text alignment
  • Autoencoder for latent space operations
  • GPU acceleration with CUDA support

How It Works

  1. Text Prompt Input: Describe the desired image style (e.g., "portrait, pixar")
  2. ID Image Upload: Provide a reference portrait for identity guidance
  3. Parameter Tuning: Adjust generation settings for optimal results
  4. Image Generation: The model creates an image matching the prompt while preserving the identity

Example Use Cases

  • Transform portraits into different artistic styles (ice sculpture, pixar animation)
  • Create personalized avatars maintaining facial identity
  • Generate creative variations of portraits with text prompts
  • Produce consistent character designs across different scenarios

The web interface is built with Gradio, which exposes the settings above as on-screen controls, so no coding is needed to generate images.

