# DesignEdit: Multi-Layered Latent Decomposition and Fusion for Unified & Accurate Image Editing
> *Stable Diffusion XL 1.0* Implementation

### [Project Page](https://design-edit.github.io/)   [Paper](https://arxiv.org/abs/2403.14487)   [Hugging Face Demo](https://huggingface.co/spaces/YuhuiYuan/DesignEdit)
## ✨ News ✨
- [2024/4/4] The Gradio application is now available on Hugging Face 🤗, so you can design online without any local deployment.
- [2024/3/28] We release the code for DesignEdit! Let's design together! 😍
## Setup
The required Python version is 3.10.12, and the required [PyTorch](https://pytorch.org/) version is 2.0.1.
The code builds on [Prompt-to-prompt](https://github.com/google/prompt-to-prompt/) and [Stable Diffusion](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0).
Additional required packages are listed in `requirements.txt`.
```bash
conda create -n DesignEdit python=3.10.12
conda activate DesignEdit
pip install -r requirements.txt
```
Note that our model is entirely **training-free** 💪! The base model is Stable Diffusion XL 1.0.
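If you want to confirm that your environment matches the versions listed in this section, a small check like the one below may help. It is only a sketch and not part of the repository: the helper names `installed_version` and `check_environment` are hypothetical, and it only compares the Python major/minor version and the PyTorch version prefix.

```python
import sys
from importlib import metadata
from typing import List, Optional

# Versions stated in the Setup section of this README.
REQUIRED_PYTHON = (3, 10)
REQUIRED_TORCH = "2.0.1"


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment() -> List[str]:
    """Collect readable warnings; an empty list means the versions match."""
    warnings = []
    if sys.version_info[:2] != REQUIRED_PYTHON:
        found = ".".join(str(part) for part in sys.version_info[:3])
        warnings.append(f"Python 3.10.x expected, found {found}")
    torch_version = installed_version("torch")
    if torch_version is None:
        warnings.append("PyTorch is not installed")
    elif not torch_version.startswith(REQUIRED_TORCH):
        # Local builds such as "2.0.1+cu118" still pass the prefix check.
        warnings.append(f"PyTorch {REQUIRED_TORCH} expected, found {torch_version}")
    return warnings


if __name__ == "__main__":
    problems = check_environment()
    for problem in problems:
        print("warning:", problem)
    if not problems:
        print("Environment matches the README requirements.")
```

The check reads package metadata instead of importing `torch`, so it runs quickly and also works before the dependencies are installed.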
## Demo
We provide an interactive interface built with Gradio. To launch it, run the following command in the environment set up above:
```bash
python design_app.py
```

### 🖱️Usage
- We have 5 function pages for different editing operations:
  - 💡**Object Removal**
  - 💡**Zooming Out**
  - 💡**Camera Panning**
  - 💡**Object Moving, Resizing and Flipping**
  - 💡**Multi-Layered Editing**
- You can follow the "Usage" instructions within each page.

- For each page, we also provide some interesting examples for you to try.

- Note that the **Multi-Layered Editing** page uses a general multi-layered representation that covers multiple editing tasks, so it can reproduce the results of the **Object Removal** and **Object Moving, Resizing and Flipping** pages.
- We have also added a "Mask Preparation" page, where you can use SAM or sketching to combine several masks. This may be useful when you are on the **Multi-Layered Editing** page.
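
To make the idea of combining several masks concrete, here is a minimal sketch that merges binary masks with a pixel-wise OR. This is an assumption about what "combining" means here, not code from the "Mask Preparation" page; the function name `combine_masks` and the toy masks are hypothetical.

```python
from typing import List

Mask = List[List[int]]  # 2D binary mask: 1 = selected pixel, 0 = not selected


def combine_masks(masks: List[Mask]) -> Mask:
    """Merge same-sized binary masks with a pixel-wise logical OR."""
    if not masks:
        raise ValueError("need at least one mask")
    height, width = len(masks[0]), len(masks[0][0])
    for mask in masks:
        if len(mask) != height or any(len(row) != width for row in mask):
            raise ValueError("all masks must share the same shape")
    return [
        [int(any(mask[y][x] for mask in masks)) for x in range(width)]
        for y in range(height)
    ]


# Toy example: one mask as SAM might produce it, one drawn by hand.
sam_mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
sketch_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
combined = combine_masks([sam_mask, sketch_mask])
for row in combined:
    print(row)
```

A pixel is selected in the result if it is selected in any input mask, so overlapping regions are counted only once.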

## More Details
If you are interested in exploring more details about the model implementation, we recommend checking out [`model.py`](design_copy/src/demo/model.py). Pay special attention to the `register_attention_control()` function and the `LayerFusion` class.
## Applications
For more applications, we kindly invite you to explore our [project page](https://design-edit.github.io/) and refer to our [paper](https://arxiv.org/abs/2403.14487).
### 💡Object Removal
You can choose more than one object to remove on the **Object Removal** page, and it is also possible to mask irregular regions for removal.
<div align="center">
<img src="docs/removal.jpg" width="700"/>
</div>
### 💡Object Removal with <span style="color:red;">Refine Mask</span>
Using the removal mask alone may cause artifacts; the refine mask marks the regions where such artifacts are likely. You can explore this on the **Object Removal** page.
<div align="center">
<img src="docs/refine.jpg" width="700"/>
</div>
### 💡Camera Panning and Zooming Out
You can use the **Camera Panning** and **Zooming Out** pages to edit an image at different scales and in different directions.
<div align="center">
<img src="docs/pan.jpg" width="700"/>
</div>
<div align="center">
<img src="docs/zoom.jpg" width="700"/>
</div>
The illustration of image adjustment and mask preparation is shown below.
<div align="center">
<img src="docs/pan+zoom.jpg" width="700"/>
</div>
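
The geometry behind this kind of image adjustment and mask preparation can be sketched as follows. This is an illustrative assumption, not the repository's code: the helper names (`zoom_out_box`, `pan_box`, `generation_mask`) are hypothetical, and so is the mask convention (1 = region to be generated, 0 = existing content to keep).

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def zoom_out_box(width: int, height: int, scale: float) -> Box:
    """Box occupied by the image after shrinking it by `scale` and centering it."""
    if not 0 < scale <= 1:
        raise ValueError("scale must be in (0, 1]")
    new_w = max(1, int(round(width * scale)))
    new_h = max(1, int(round(height * scale)))
    left = (width - new_w) // 2
    top = (height - new_h) // 2
    return (left, top, left + new_w, top + new_h)


def pan_box(width: int, height: int, dx: int, dy: int) -> Box:
    """Box still covered by the original content after shifting it by (dx, dy)."""
    return (max(0, dx), max(0, dy), min(width, width + dx), min(height, height + dy))


def generation_mask(width: int, height: int, keep: Box) -> List[List[int]]:
    """Binary mask with 0 inside the kept box and 1 where content must be generated."""
    left, top, right, bottom = keep
    return [
        [0 if (left <= x < right and top <= y < bottom) else 1 for x in range(width)]
        for y in range(height)
    ]


# Zoom out an 8x6 canvas to half size: the image keeps the central 4x3 region.
box = zoom_out_box(8, 6, 0.5)
mask = generation_mask(8, 6, box)
for row in mask:
    print("".join(str(value) for value in row))
```

In both cases the original content ends up in a sub-rectangle of the canvas, and the mask marks the remaining area that the model has to fill in.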
### 💡Multi-Object Editing with Moving, Resizing, Flipping
You can move, resize, and flip a single object on the **Object Moving, Resizing and Flipping** page.
For multi-object editing such as swapping and addition, turn to the **Multi-Layered Editing** page.
<div align="center">
<img src="docs/multi.jpg" width="700"/>
</div>
### 💡Cross-Image Composition
By choosing one image as the background and specifying the position, size, and placement order of the foreground images, we can achieve cross-image composition. You can try examples on the **Multi-Layered Editing** page.
<div align="center">
<img src="docs/cross.jpg" width="700"/>
</div>
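
To illustrate how position, size, and placement order interact, here is a toy pixel-space sketch. It is an assumption made for illustration only: DesignEdit fuses layers in the latent space of the diffusion model, which this example does not attempt to reproduce, and the `Layer` dataclass, `resize_nearest`, and `compose` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List

Image = List[List[str]]  # toy "image": one character per pixel


@dataclass
class Layer:
    pixels: Image
    left: int    # target position of the top-left corner
    top: int
    width: int   # target size after resizing
    height: int
    order: int   # higher values are pasted later, so they end up on top


def resize_nearest(pixels: Image, width: int, height: int) -> Image:
    """Nearest-neighbour resize of a toy image to (width, height)."""
    src_h, src_w = len(pixels), len(pixels[0])
    return [
        [pixels[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def compose(background: Image, layers: List[Layer]) -> Image:
    """Paste the layers onto a copy of the background in placement order."""
    canvas = [row[:] for row in background]
    for layer in sorted(layers, key=lambda item: item.order):
        patch = resize_nearest(layer.pixels, layer.width, layer.height)
        for dy, row in enumerate(patch):
            for dx, value in enumerate(row):
                y, x = layer.top + dy, layer.left + dx
                if 0 <= y < len(canvas) and 0 <= x < len(canvas[0]):
                    canvas[y][x] = value  # pixels outside the canvas are clipped
    return canvas


background = [["."] * 6 for _ in range(4)]
cat = Layer(pixels=[["c"]], left=1, top=1, width=3, height=2, order=1)
dog = Layer(pixels=[["d"]], left=3, top=0, width=2, height=3, order=2)
result = compose(background, [dog, cat])
for row in result:
    print("".join(row))
```

Because the layers are sorted by `order` before pasting, the order in which they are passed to `compose` does not matter; where two layers overlap, the one with the higher `order` stays visible.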
### 💡Typography Retyping
Typography retyping is an editing task specific to design elements. You can try it on the **Multi-Layered Editing** page.
<div align="center">
<img src="docs/retype.jpg" width="700"/>
</div>
## Acknowledgements
Our project benefits from the contributions of several outstanding projects and techniques. We express our gratitude to:
- [**Prompt-to-Prompt**](https://github.com/google/prompt-to-prompt.git): For innovative approaches in prompt engineering.
- [**Proximal-Guidance**](https://github.com/phymhan/prompt-to-prompt.git): For their cutting-edge inversion technique, significantly improving our model's performance.
- [**DragonDiffusion**](https://github.com/MC-E/DragonDiffusion.git): For inspiration on Gradio interface and efficient SAM API integration.
Each of these projects has played a crucial role in the development of our work. We thank their contributors for sharing their expertise and resources with the community.
## BibTeX
```bibtex
@misc{jia2024designedit,
title={DesignEdit: Multi-Layered Latent Decomposition and Fusion for Unified \& Accurate Image Editing},
author={Yueru Jia and Yuhui Yuan and Aosong Cheng and Chuke Wang and Ji Li and Huizhu Jia and Shanghang Zhang},
year={2024},
eprint={2403.14487},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```