---
title: README
emoji: 📈
colorFrom: green
colorTo: red
sdk: static
pinned: false
---

# VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning
    
<div align="center">
  
[[Paper](https://arxiv.org/abs/2504.07960)] &emsp; [[Project Page](https://visualcloze.github.io/)] &emsp; [[Github](https://github.com/lzyhha/VisualCloze)]

</div>

<div align="center">

[[🤗 Online Demo](https://huggingface.co/spaces/VisualCloze/VisualCloze)] &emsp; [[🤗 Dataset Card](https://huggingface.co/datasets/VisualCloze/Graph200K)]

</div>

<div align="center">
  
[[🤗 Full Model Card (<strong><span style="color:hotpink">Diffusers</span></strong>)](https://huggingface.co/VisualCloze/VisualClozePipeline-384)] &emsp; [[🤗 LoRA Model Card (<strong><span style="color:hotpink">Diffusers</span></strong>)](https://huggingface.co/VisualCloze/VisualClozePipeline-LoRA-384)]

</div>

If you find VisualCloze helpful, please consider starring ⭐ the [<strong><span style="color:hotpink">Github Repo</span></strong>](https://github.com/lzyhha/VisualCloze). Thanks!

## 📰 News
- [2025-6-26] 🚀🚀🚀 VisualCloze has been accepted by [<strong>ICCV 2025</strong>](https://iccv.thecvf.com/Conferences/2025).
- [2025-5-15] 🤗🤗🤗 VisualCloze has been merged into the [<strong><span style="color:hotpink">official pipelines of diffusers</span></strong>](https://github.com/huggingface/diffusers/tree/main/src/diffusers/pipelines/visualcloze). For usage guidance, please refer to the [Full Model Card 384](https://huggingface.co/VisualCloze/VisualClozePipeline-384) and [Full Model Card 512](https://huggingface.co/VisualCloze/VisualClozePipeline-512).
- [2025-5-18] 🥳🥳🥳 We have released the LoRA weights supporting diffusers at [LoRA Model Card 384](https://huggingface.co/VisualCloze/VisualClozePipeline-LoRA-384) and [LoRA Model Card 512](https://huggingface.co/VisualCloze/VisualClozePipeline-LoRA-512).

## 🌠 Key Features

VisualCloze is a universal image generation framework based on in-context learning.

1. Supports a variety of in-domain tasks.
2. Generalizes to <strong><span style="color:hotpink">unseen tasks</span></strong> through in-context learning.
3. Unifies multiple tasks into a single step, generating both the target image and intermediate results.
4. Supports reverse-engineering a set of conditions from a target image.
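
One way to picture features 3 and 4 is as a grid of images in which the blank cells are the ones to generate. The sketch below is illustrative only: `build_grid`, `cells_to_generate`, the file names, and the column meanings (pose, depth, photo) are hypothetical and are not part of the VisualCloze or diffusers API. For real usage, please refer to the model cards.

```python
from typing import List, Optional

# Placeholder type: strings stand in for loaded images in this sketch.
Image = str
Row = List[Optional[Image]]


def build_grid(examples: List[List[Image]], query: Row) -> List[Row]:
    """Stack complete in-context example rows on top of a query row.

    Every row must have the same number of columns, example rows must be
    fully filled, and the query must leave at least one cell blank (None).
    """
    width = len(query)
    for row in examples:
        if len(row) != width or any(cell is None for cell in row):
            raise ValueError("example rows must be complete and match the query width")
    if all(cell is not None for cell in query):
        raise ValueError("the query row needs at least one None cell to fill")
    return [list(row) for row in examples] + [list(query)]


def cells_to_generate(grid: List[Row]) -> List[int]:
    """Return the column indices of the query row that are left blank."""
    return [i for i, cell in enumerate(grid[-1]) if cell is None]


# Assumed column order for this sketch: (pose, depth, photo).
example = ["ex_pose.png", "ex_depth.png", "ex_photo.png"]

# Feature 3: one step fills both an intermediate result (depth) and the target (photo).
forward = build_grid([example], ["q_pose.png", None, None])

# Feature 4: the target photo is given and the conditions are left blank.
reverse = build_grid([example], [None, None, "q_photo.png"])

print(cells_to_generate(forward))  # [1, 2]
print(cells_to_generate(reverse))  # [0, 1]
```

In this picture, the same grid layout covers both directions; only the position of the blank cells in the query row changes.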

🔥 Examples are shown on the [project page](https://visualcloze.github.io/).