# DesignEdit: Multi-Layered Latent Decomposition and Fusion for Unified & Accurate Image Editing
> *Stable Diffusion XL 1.0* Implementation

![teaser](docs/teaser.jpg)
### [Project Page](https://design-edit.github.io/)   [Paper](https://arxiv.org/abs/2403.14487)   [Hugging Face Demo](https://huggingface.co/spaces/YuhuiYuan/DesignEdit)

## ✨ News ✨

- [2024/4/4] The Gradio application is now available on Hugging Face 🤗, so you can design online without any local deployment.
- [2024/3/28] We release the code for DesignEdit! Let's design together! 😍

## Setup

The required Python version is 3.10.12, and the required [PyTorch](https://pytorch.org/) version is 2.0.1.
The code is built on [Prompt-to-Prompt](https://github.com/google/prompt-to-prompt/) and [Stable Diffusion XL](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0).

Additional required packages are listed in `requirements.txt`.
```bash
conda create -n DesignEdit python=3.10.12
conda activate DesignEdit
pip install -r requirements.txt
```
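
To confirm that the environment matches the versions stated above, a small check like the following can help. This is only a sketch and not part of the repository: the helper names `installed_version`, `version_matches` and `check_environment` are hypothetical, and it assumes that a local build suffix such as `+cu117` on the PyTorch version is acceptable.

```python
import sys
from importlib import metadata
from typing import List, Optional

EXPECTED_PYTHON = (3, 10, 12)  # version stated in the Setup section
EXPECTED_TORCH = "2.0.1"       # version stated in the Setup section


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_matches(installed: str, expected: str) -> bool:
    """Accept an exact match or a local build suffix such as '2.0.1+cu117'."""
    return installed == expected or installed.startswith(expected + "+")


def check_environment() -> List[str]:
    """Collect human-readable mismatches; an empty list means all checks passed."""
    problems = []
    current = tuple(sys.version_info[:3])
    if current != EXPECTED_PYTHON:
        problems.append(f"Python {current} found, expected {EXPECTED_PYTHON}")
    torch_version = installed_version("torch")
    if torch_version is None:
        problems.append("PyTorch is not installed")
    elif not version_matches(torch_version, EXPECTED_TORCH):
        problems.append(f"PyTorch {torch_version} found, expected {EXPECTED_TORCH}")
    return problems


for line in check_environment() or ["Environment matches the README."]:
    print(line)
```

Other versions may well work; the check only reports differences from the versions listed in this README.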
Note that our method is entirely **training-free** 💪! The base model is Stable Diffusion XL 1.0.

## Demo
We have created an interactive interface with Gradio, shown below. To launch it, run the following command in the environment set up above:
```bash
python design_app.py
```
![page_1](docs/page01.png)

### 🖱️Usage

- The interface has five pages, one for each kind of editing operation:

💡**Object Removal**

💡**Zooming Out**

💡**Camera Panning**

💡**Object Moving, Resizing and Flipping**

💡**Multi-Layered Editing**  

- You can follow the "Usage" instructions within each page.  

![page_4](docs/page04.png)  

- For each page, we also provide some interesting examples for you to try.  

![page_2](docs/page02.png)  

- Note that the **Multi-Layered Editing** page uses a multi-layered representation that covers several editing tasks, so it can reproduce the results of the **Object Removal** and **Object Moving, Resizing and Flipping** pages within one general representation.  

- We have also added a "Mask Preparation" page, where you can use SAM or sketching to combine several masks into one. This can be useful when working on the **Multi-Layered Editing** page.  

![page_3](docs/page03.png)

## More Details  

If you are interested in exploring more details about the model implementation, we recommend checking out [`model.py`](design_copy/src/demo/model.py). Pay special attention to the `register_attention_control()` function and the `LayerFusion` class.  


## Applications  

For more applications, we kindly invite you to explore our [project page](https://design-edit.github.io/) and refer to our [paper](https://arxiv.org/abs/2403.14487).

### 💡Object Removal  

You can choose more than one object to remove on the **Object Removal** page, and it is also possible to mask irregular regions for removal.

<div align="center">
    <img src="docs/removal.jpg" width="700"/>
</div>

### 💡Object Removal with <span style="color:red;">Refine Mask</span>  

Using the removal mask directly may cause artifacts. The refine mask marks the regions where such artifacts are likely to appear. You can explore this on the **Object Removal** page.  

<div align="center">
    <img src="docs/refine.jpg"  width="700"/>
</div>

### 💡Camera Panning and Zooming Out  

You can use the **Camera Panning** and **Zooming Out** pages to edit images at different scales and in different directions.

<div align="center">
    <img src="docs/pan.jpg" width="700"/>
</div>
<div align="center">
    <img src="docs/zoom.jpg" width="700"/>
</div>

Image adjustment and mask preparation for these two operations are illustrated below.  

<div align="center">
    <img src="docs/pan+zoom.jpg" width="700"/>
</div>
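
The image adjustment and mask preparation described above can be sketched in a few lines. This is a simplified pixel-space illustration under our own assumptions, not the code used in this repository: the helper names `zoom_out_box`, `pan_box` and `generation_mask` are hypothetical, masks are plain nested lists, and the region left uncovered by the adjusted image is assumed to be the region that must be generated.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, width, height)


def zoom_out_box(width: int, height: int, scale: float) -> Box:
    """Where the shrunken original sits, centered, when zooming out by `scale`."""
    if not 0 < scale <= 1:
        raise ValueError("scale must be in (0, 1]")
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    return ((width - new_w) // 2, (height - new_h) // 2, new_w, new_h)


def pan_box(width: int, height: int, dx: int, dy: int) -> Box:
    """Part of the canvas still covered after shifting the image by (dx, dy)."""
    left, top = max(0, dx), max(0, dy)
    right, bottom = min(width, width + dx), min(height, height + dy)
    return (left, top, max(0, right - left), max(0, bottom - top))


def generation_mask(width: int, height: int, keep: Box) -> List[List[int]]:
    """1 marks pixels to synthesize, 0 marks pixels covered by the original."""
    left, top, w, h = keep
    return [
        [0 if left <= x < left + w and top <= y < top + h else 1
         for x in range(width)]
        for y in range(height)
    ]


# Zoom out an 8x8 canvas to half scale and print the resulting mask.
for row in generation_mask(8, 8, zoom_out_box(8, 8, 0.5)):
    print("".join(str(v) for v in row))
```

For panning, `generation_mask(8, 8, pan_box(8, 8, 3, 0))` marks the three columns vacated on the left as the region to fill.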

### 💡Multi-Object Editing with Moving, Resizing, Flipping

You can move, resize and flip a single object on the **Object Moving, Resizing and Flipping** page.
For multi-object editing such as swapping and addition, use the **Multi-Layered Editing** page.  

<div align="center">
    <img src="docs/multi.jpg" width="700"/>
</div>
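
To make the three operations concrete, here is a minimal sketch of how an object mask could be moved, resized and flipped. It is an illustration under our own assumptions rather than the repository's implementation: masks are plain nested lists of 0 and 1, resizing uses nearest-neighbor sampling, and all helper names are hypothetical.

```python
from typing import List, Tuple

Mask = List[List[int]]


def bounding_box(mask: Mask) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the object, right/bottom exclusive."""
    points = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not points:
        raise ValueError("mask is empty")
    xs, ys = zip(*points)
    return min(xs), min(ys), max(xs) + 1, max(ys) + 1


def crop(mask: Mask, box: Tuple[int, int, int, int]) -> Mask:
    left, top, right, bottom = box
    return [row[left:right] for row in mask[top:bottom]]


def flip_horizontal(patch: Mask) -> Mask:
    return [row[::-1] for row in patch]


def resize_nearest(patch: Mask, new_w: int, new_h: int) -> Mask:
    h, w = len(patch), len(patch[0])
    return [[patch[y * h // new_h][x * w // new_w] for x in range(new_w)]
            for y in range(new_h)]


def paste(patch: Mask, width: int, height: int, left: int, top: int) -> Mask:
    """Place `patch` on an empty canvas; parts outside the canvas are clipped."""
    out = [[0] * width for _ in range(height)]
    for y, row in enumerate(patch):
        for x, v in enumerate(row):
            if v and 0 <= left + x < width and 0 <= top + y < height:
                out[top + y][left + x] = v
    return out


mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
patch = crop(mask, bounding_box(mask))
moved_and_flipped = paste(flip_horizontal(patch), 6, 4, left=4, top=0)
```

Here the object is cropped to its bounding box first, so the flip happens around the object itself rather than around the whole image.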

### 💡Cross-Image Composition  

By choosing one image as the background and specifying the position, size, and placement order of the foreground images, we can achieve cross-image composition. You can try examples on the **Multi-Layered Editing** page.

<div align="center">
    <img src="docs/cross.jpg" width="700"/>
</div>
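
The role of position, size and placement order can be illustrated with a toy compositor. This sketch works on a character grid in pixel space purely for illustration, whereas DesignEdit performs its fusion on latents; the `Layer` type and `compose` function are hypothetical names, and we assume that a larger `order` value means the layer is placed closer to the viewer.

```python
from typing import List, NamedTuple


class Layer(NamedTuple):
    label: str   # single character painted for this foreground image
    left: int    # position of the top-left corner on the canvas
    top: int
    width: int   # size of the layer on the canvas
    height: int
    order: int   # assumed convention: larger values are drawn later (on top)


def compose(canvas_w: int, canvas_h: int, background: str,
            layers: List[Layer]) -> List[str]:
    """Paint layers back to front so higher `order` values occlude lower ones."""
    canvas = [[background] * canvas_w for _ in range(canvas_h)]
    for layer in sorted(layers, key=lambda item: item.order):
        for y in range(max(0, layer.top), min(canvas_h, layer.top + layer.height)):
            for x in range(max(0, layer.left), min(canvas_w, layer.left + layer.width)):
                canvas[y][x] = layer.label
    return ["".join(row) for row in canvas]


result = compose(6, 4, ".", [
    Layer("B", left=2, top=1, width=3, height=2, order=2),
    Layer("A", left=0, top=0, width=4, height=3, order=1),
])
for row in result:
    print(row)
```

Swapping the two `order` values makes layer `A` occlude layer `B` where they overlap, which is the effect that the placement order controls.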

### 💡Typography Retyping  

Typography retyping is a use case specific to design elements. You can try it on the **Multi-Layered Editing** page.  

<div align="center">
    <img src="docs/retype.jpg" width="700"/>
</div>

## Acknowledgements  

Our project benefits from the contributions of several outstanding projects and techniques. We express our gratitude to:

- [**Prompt-to-Prompt**](https://github.com/google/prompt-to-prompt.git): For its cross-attention control approach to image editing, on which our framework is built.

- [**Proximal-Guidance**](https://github.com/phymhan/prompt-to-prompt.git): For their cutting-edge inversion technique, significantly improving our model's performance. 

- [**DragonDiffusion**](https://github.com/MC-E/DragonDiffusion.git): For inspiring our Gradio interface and efficient SAM API integration.

Each of these projects has played a crucial role in the development of our work. We thank their contributors for sharing their expertise and resources with the community.

## BibTeX

```bibtex
@misc{jia2024designedit,
  title={DesignEdit: Multi-Layered Latent Decomposition and Fusion for Unified & Accurate Image Editing},
  author={Yueru Jia and Yuhui Yuan and Aosong Cheng and Chuke Wang and Ji Li and Huizhu Jia and Shanghang Zhang},
  year={2024},
  eprint={2403.14487},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```