Update README.md
README.md
CHANGED
@@ -5,3 +5,111 @@
This code is used for editing vector sketches with text prompts.

<img src='docs/figures/teaser3.gif'>

## Outline
- [Installation](#installation)
- [Quick Start](#quick-start)
- [Citation](#citation)

## Installation

1. Follow the step-by-step environment preparation instructions in [ximinng/DiffSketcher](https://github.com/ximinng/DiffSketcher?tab=readme-ov-file#step-by-step).
2. Download the [CompVis/stable-diffusion-v1-4](https://huggingface.co/CompVis/stable-diffusion-v1-4/tree/main) models and place them in a local directory, following the file structure shown [here](https://github.com/MarkMoHR/DiffSketchEdit/tree/main/StableDiffusionModels/CompVis/stable-diffusion-v1-4).
3. Finally, set `huggingface_model_dict["sd14"]` ([Line 11](https://github.com/MarkMoHR/DiffSketchEdit/blob/main/methods/diffusers_warp/__init__.py#L11)) in `./methods/diffusers_warp/__init__.py` to the directory path of your downloaded models.

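
As a rough illustration of the last installation step, the entry could look like the sketch below. The dictionary here is a stand-in that contains only the `sd14` key mentioned above (the real file may define more entries), the path is a placeholder, and `model_dir_exists` is a hypothetical helper for checking the path before running.

```python
from pathlib import Path

# Stand-in for the dictionary in ./methods/diffusers_warp/__init__.py.
# Only the "sd14" key comes from this README; the path is a placeholder
# that you would replace with your own download location.
huggingface_model_dict = {
    "sd14": "/path/to/StableDiffusionModels/CompVis/stable-diffusion-v1-4",
}


def model_dir_exists(key: str) -> bool:
    """Return True if the directory configured for `key` exists on disk."""
    return Path(huggingface_model_dict[key]).is_dir()


if not model_dir_exists("sd14"):
    print("sd14 model directory not found; check the configured path")
```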

## Quick Start

Open `run_painterly_render.py` and go to [Line 81](https://github.com/MarkMoHR/DiffSketchEdit/blob/main/run_painterly_render.py#L81). Then modify the code as follows:

1. Set one or more seeds, or choose random ones.
2. Choose the editing type. `replace`, `refine` and `reweight` stand for the editing modes Word Swap, Prompt Refinement and Attention Re-weighting, respectively.
3. Set the prompt information.

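
The first two steps can be sketched as follows. This is only an illustration, not the code in `run_painterly_render.py`: `pick_seeds`, `check_edit_type` and the seed range `2**16` are assumptions made for this example, while `seeds_list` and the three edit type names come from this README.

```python
import random

# Editing modes named in this README, keyed by their `edit_type` value.
EDIT_TYPES = {
    "replace": "Word Swap",
    "refine": "Prompt Refinement",
    "reweight": "Attention Re-weighting",
}


def pick_seeds(fixed=None, num_random=1, max_seed=2**16, rng=None):
    """Return the fixed seeds if given, otherwise draw random ones."""
    if fixed:
        return list(fixed)
    rng = rng or random.Random()
    return [rng.randrange(max_seed) for _ in range(num_random)]


def check_edit_type(edit_type):
    """Raise ValueError for an edit type that this README does not list."""
    if edit_type not in EDIT_TYPES:
        raise ValueError(f"unknown edit_type: {edit_type!r}")
    return edit_type


seeds_list = pick_seeds(fixed=[25760])   # one fixed seed
edit_type = check_edit_type("replace")   # Word Swap
```

Fixed seeds make a run repeatable, which is why the examples in this README list the seed they used.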

### Examples

(a) Word Swap (`replace`)

```
seeds_list = [25760]
args.edit_type = "replace"

PromptInfo(prompts=["A painting of a squirrel eating a burger",
                    "A painting of a rabbit eating a burger",
                    "A painting of a rabbit eating a pumpkin",
                    "A painting of a owl eating a pumpkin"],
           token_ind=5,
           changing_region_words=[["", ""], ["squirrel", "rabbit"], ["burger", "pumpkin"], ["rabbit", "owl"]])
```

- `token_ind`: the index of the cross-attention map used to initialize the strokes.
- `changing_region_words`: used for local editing. For each edit, give two words that indicate the region that changes. Use empty strings for the first pair.

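
The two parameters above can be derived from the prompts with a small helper, sketched here under stated assumptions. In the examples of this README, `token_ind` matches the 1-based position of a word in the prompt (5 is "squirrel"); that reading is inferred from the examples, not stated by the code, and it ignores words that a tokenizer would split into several tokens. `word_index` and `swapped_words` are hypothetical names that are not part of the repository.

```python
def word_index(prompt, word):
    """Return the 1-based position of `word` in a whitespace-split prompt."""
    return prompt.split().index(word) + 1


def swapped_words(prompts):
    """Build [old, new] pairs from consecutive prompts of equal word count."""
    pairs = [["", ""]]  # first pair is empty: the original prompt has no edit
    for prev, curr in zip(prompts, prompts[1:]):
        prev_words, curr_words = prev.split(), curr.split()
        if len(prev_words) != len(curr_words):
            raise ValueError("word swap expects prompts with equal word counts")
        diffs = [[a, b] for a, b in zip(prev_words, curr_words) if a != b]
        if len(diffs) != 1:
            raise ValueError("expected exactly one swapped word per edit")
        pairs.append(diffs[0])
    return pairs


prompts = ["A painting of a squirrel eating a burger",
           "A painting of a rabbit eating a burger",
           "A painting of a rabbit eating a pumpkin",
           "A painting of a owl eating a pumpkin"]

token_ind = word_index(prompts[0], "squirrel")
changing_region_words = swapped_words(prompts)
```

With the prompts above, this reproduces the `token_ind` and `changing_region_words` values used in example (a).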

| Original image and sketch | Edited image and sketch 1 | Edited image and sketch 2 | Edited image and sketch 3 |
|:-------------:|:-------------------:|:----------------------:|:--------:|
| <img src="docs/figures/replace/ldm_generated_image0.png" style="width: 200px"> | <img src="docs/figures/replace/ldm_generated_image1.png" style="width: 200px"> | <img src="docs/figures/replace/ldm_generated_image2.png" style="width: 200px"> | <img src="docs/figures/replace/ldm_generated_image3.png" style="width: 200px"> |
| <img src="docs/figures/replace/visual_best-rendered0.png" style="width: 200px"> | <img src="docs/figures/replace/visual_best-rendered1.png" style="width: 200px"> | <img src="docs/figures/replace/visual_best-rendered2.png" style="width: 200px"> | <img src="docs/figures/replace/visual_best-rendered3.png" style="width: 200px"> |


(b) Prompt Refinement (`refine`)

```
seeds_list = [53487]
args.edit_type = "refine"

PromptInfo(prompts=["An evening dress",
                    "An evening dress with sleeves",
                    "An evening dress with sleeves and a belt"],
           token_ind=3,
           changing_region_words=[["", ""], ["", "sleeves"], ["", "belt"]]),
```

- `changing_region_words`: set the first word of each pair to an empty string.

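
A quick sanity check for the refinement setup might look like the sketch below. `check_refine_pairs` is a hypothetical helper, and the rules it enforces (one pair per prompt, an empty first pair, an empty first word, and a second word that appears in the refined prompt) are read off the example above, not taken from the repository code.

```python
def check_refine_pairs(prompts, changing_region_words):
    """Validate `changing_region_words` against the pattern of example (b)."""
    if len(prompts) != len(changing_region_words):
        raise ValueError("need exactly one word pair per prompt")
    if list(changing_region_words[0]) != ["", ""]:
        raise ValueError("the first pair must be two empty strings")
    for prompt, (old, new) in zip(prompts[1:], changing_region_words[1:]):
        if old != "":
            raise ValueError("refine mode expects an empty first word")
        if new not in prompt.split():
            raise ValueError(f"{new!r} does not appear in {prompt!r}")
    return True


prompts = ["An evening dress",
           "An evening dress with sleeves",
           "An evening dress with sleeves and a belt"]
pairs = [["", ""], ["", "sleeves"], ["", "belt"]]

is_valid = check_refine_pairs(prompts, pairs)
```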

| Original image and sketch | Edited image and sketch 1 | Edited image and sketch 2 |
|:-------------:|:-------------------:|:----------------------:|
| <img src="docs/figures/refine/ldm_generated_image0.png" style="width: 200px"> | <img src="docs/figures/refine/ldm_generated_image1.png" style="width: 200px"> | <img src="docs/figures/refine/ldm_generated_image2.png" style="width: 200px"> |
| <img src="docs/figures/refine/visual_best-rendered0.png" style="width: 200px"> | <img src="docs/figures/refine/visual_best-rendered1.png" style="width: 200px"> | <img src="docs/figures/refine/visual_best-rendered2.png" style="width: 200px"> |


(c) Attention Re-weighting (`reweight`)

```
seeds_list = [35491]
args.edit_type = "reweight"

PromptInfo(prompts=["An emoji face with moustache and smile"] * 3,
           token_ind=3,
           changing_region_words=[["", ""], ["moustache", "moustache"], ["smile", "smile"]],
           reweight_word=["moustache", "smile"],
           reweight_weight=[-1.0, 3.0]),
```

- `changing_region_words`: use the same word for both entries of each pair.
- `reweight_word` / `reweight_weight`: the word to reweight and its weight at each edit.

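
Since the re-weighting arguments follow a regular pattern, they can be assembled from one prompt plus the words and weights, as in the sketch below. `build_reweight_args` is a hypothetical helper that returns a plain dictionary; it does not construct the repository's `PromptInfo`, and its consistency checks are assumptions based on example (c).

```python
def build_reweight_args(prompt, reweight_word, reweight_weight):
    """Assemble re-weighting arguments in the shape used by example (c)."""
    if len(reweight_word) != len(reweight_weight):
        raise ValueError("need one weight per reweighted word")
    missing = [w for w in reweight_word if w not in prompt.split()]
    if missing:
        raise ValueError(f"words not found in the prompt: {missing}")
    num_prompts = len(reweight_word) + 1  # the original plus one per edit
    return {
        "prompts": [prompt] * num_prompts,
        # Empty first pair, then the same word twice for every edit.
        "changing_region_words": [["", ""]] + [[w, w] for w in reweight_word],
        "reweight_word": list(reweight_word),
        "reweight_weight": list(reweight_weight),
    }


reweight_args = build_reweight_args(
    "An emoji face with moustache and smile",
    reweight_word=["moustache", "smile"],
    reweight_weight=[-1.0, 3.0],
)
```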

| Original image and sketch | Edited image and sketch 1 | Edited image and sketch 2 |
|:-------------:|:-------------------:|:----------------------:|
| <img src="docs/figures/reweight/ldm_generated_image0.png" style="width: 200px"> | <img src="docs/figures/reweight/ldm_generated_image1.png" style="width: 200px"> | <img src="docs/figures/reweight/ldm_generated_image2.png" style="width: 200px"> |
| <img src="docs/figures/reweight/visual_best-rendered0.png" style="width: 200px"> | <img src="docs/figures/reweight/visual_best-rendered1.png" style="width: 200px"> | <img src="docs/figures/reweight/visual_best-rendered2.png" style="width: 200px"> |


## Acknowledgement

The project is built upon [ximinng/DiffSketcher](https://github.com/ximinng/DiffSketcher) and [google/prompt-to-prompt](https://github.com/google/prompt-to-prompt). We thank all the authors for their effort.

## Citation

If you use the code, please cite:

```
@inproceedings{mo2024text,
  title={Text-based Vector Sketch Editing with Image Editing Diffusion Prior},
  author={Mo, Haoran and Gao, Chengying and Wang, Ruomei},
  booktitle={2024 IEEE International Conference on Multimedia and Expo (ICME)},
  pages={1--6},
  year={2024},
  organization={IEEE}
}
```