---
title: README
emoji: π
colorFrom: green
colorTo: red
sdk: static
pinned: false
---
# VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning
<div align="center">
[[Paper](https://arxiv.org/abs/2504.07960)]   [[Project Page](https://visualcloze.github.io/)]   [[GitHub](https://github.com/lzyhha/VisualCloze)]
</div>
<div align="center">
[[🤗 Online Demo](https://huggingface.co/spaces/VisualCloze/VisualCloze)]   [[🤗 Dataset Card](https://huggingface.co/datasets/VisualCloze/Graph200K)]
</div>
<div align="center">
[[🤗 Full Model Card (<strong><span style="color:hotpink">Diffusers</span></strong>)](https://huggingface.co/VisualCloze/VisualClozePipeline-384)]   [[🤗 LoRA Model Card (<strong><span style="color:hotpink">Diffusers</span></strong>)](https://huggingface.co/VisualCloze/VisualClozePipeline-LoRA-384)]
</div>
If you find VisualCloze helpful, please consider starring ⭐ the [<strong><span style="color:hotpink">GitHub Repo</span></strong>](https://github.com/lzyhha/VisualCloze). Thanks!
## 📰 News
- [2025-6-26] VisualCloze has been accepted to [<strong>ICCV 2025</strong>](https://iccv.thecvf.com/Conferences/2025).
- [2025-5-15] 🤗🤗🤗 VisualCloze has been merged into the [<strong><span style="color:hotpink">official pipelines of diffusers</span></strong>](https://github.com/huggingface/diffusers/tree/main/src/diffusers/pipelines/visualcloze). For usage guidance, please refer to the [Full Model Card 384](https://huggingface.co/VisualCloze/VisualClozePipeline-384) and the [Full Model Card 512](https://huggingface.co/VisualCloze/VisualClozePipeline-512).
- [2025-5-18] 🥳🥳🥳 We have released LoRA weights compatible with diffusers at the [LoRA Model Card 384](https://huggingface.co/VisualCloze/VisualClozePipeline-LoRA-384) and the [LoRA Model Card 512](https://huggingface.co/VisualCloze/VisualClozePipeline-LoRA-512).
## Key Features
VisualCloze is a universal image generation framework based on in-context learning. It:
1. Supports various in-domain tasks.
2. Generalizes to <strong><span style="color:hotpink">unseen tasks</span></strong> through in-context learning.
3. Unifies multiple tasks into a single step, generating both the target image and intermediate results.
4. Supports reverse-engineering a set of conditions from a target image.

🔥 Examples are shown on the [project page](https://visualcloze.github.io/).
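
To make the features above more concrete, here is a minimal Python sketch of how an in-context task might be laid out for the diffusers pipeline. It assumes the layout described in the model cards: a grid of image rows in which each in-context example row is complete and the final query row uses `None` for every image to be generated. `build_grid`, `target_positions`, and `generate` are hypothetical helper names introduced only for illustration. The pipeline arguments are an assumption based on the diffusers `VisualClozePipeline` documentation, so please check the Full Model Card for the authoritative usage.

```python
from typing import Any, List, Optional

Grid = List[List[Optional[Any]]]


def build_grid(examples: List[List[Any]], query: List[Optional[Any]]) -> Grid:
    """Stack complete in-context rows on top of a query row.

    Hypothetical helper: `None` in the query row marks an image to generate.
    """
    width = len(query)
    for row in examples:
        if len(row) != width:
            raise ValueError("every row must have the same number of images")
        if any(cell is None for cell in row):
            raise ValueError("in-context rows must be complete")
    if not any(cell is None for cell in query):
        raise ValueError("the query row needs at least one None to generate")
    return [list(row) for row in examples] + [list(query)]


def target_positions(grid: Grid) -> List[int]:
    """Column indices of the query row that the model is asked to fill."""
    return [i for i, cell in enumerate(grid[-1]) if cell is None]


def generate(grid: Grid, task_prompt: str, content_prompt: str, seed: int = 0):
    """Run the pipeline on a grid of PIL images (assumes a CUDA GPU).

    Heavy dependencies are imported lazily so the helpers above stay usable
    without them. Argument names are assumed from the diffusers docs.
    """
    import torch
    from diffusers import VisualClozePipeline

    pipe = VisualClozePipeline.from_pretrained(
        "VisualCloze/VisualClozePipeline-384",
        resolution=384,
        torch_dtype=torch.bfloat16,
    )
    pipe.to("cuda")
    result = pipe(
        task_prompt=task_prompt,        # describes the task shown in each row
        content_prompt=content_prompt,  # describes the desired target image
        image=grid,
        guidance_scale=30,
        num_inference_steps=30,
        max_sequence_length=512,
        generator=torch.Generator("cpu").manual_seed(seed),
    )
    # Assumed output shape: one list of generated images per input sample.
    return result.images[0]
```

In this sketch, a single `None` in the last column corresponds to an ordinary conditional generation task, while several `None` entries in the query row correspond to generating intermediate results together with the target in one step. In real use the grid cells would be PIL images (for example loaded with `diffusers.utils.load_image`) rather than placeholders.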