# FramePack Image Edit Early Lora

This repository contains the steps and scripts needed to edit an image using an image-to-video model.
The model combines LoRA (Low-Rank Adaptation) weights with pre-trained components to produce an edited image from an input image and a textual prompt.

## Prerequisites

Before proceeding, ensure that you have the following installed on your system:

- **Ubuntu** (or a compatible Linux distribution)
- **Python 3.x**
- **pip** (Python package manager)
- **Git**
- **Git LFS** (Git Large File Storage)
- **FFmpeg**

## Installation

1. **Update and Install Dependencies**

   ```bash
   sudo apt-get update && sudo apt-get install cbm git-lfs ffmpeg
   ```

2. **Clone the Repository**

   ```bash
   git clone https://huggingface.co/svjack/FramePack_Image_Edit_Lora_Early
   cd FramePack_Image_Edit_Lora_Early
   ```

3. **Install Python Dependencies**

   ```bash
   pip install torch torchvision
   pip install -r requirements.txt
   pip install ascii-magic matplotlib tensorboard huggingface_hub datasets
   pip install moviepy==1.0.3
   pip install sageattention==1.0.6
   ```

4. **Download Model Weights**

   ```bash
   git clone https://huggingface.co/lllyasviel/FramePackI2V_HY
   git clone https://huggingface.co/hunyuanvideo-community/HunyuanVideo
   git clone https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged
   git clone https://huggingface.co/Comfy-Org/sigclip_vision_384
   ```
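
Before running inference, it can help to confirm that PyTorch sees your GPU and that the LFS weights actually downloaded (cloning a Hugging Face repository without Git LFS active leaves small pointer stubs instead of the real files). A minimal sanity check, using the paths from the clones above:

```bash
# Confirm PyTorch is installed and a CUDA device is visible
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"

# LFS pointer stubs are only ~130 bytes; real weight shards are multi-GB.
# If the sizes look tiny, run `git lfs pull` inside the affected repository.
du -h FramePackI2V_HY/diffusion_pytorch_model-00001-of-00003.safetensors \
      HunyuanVideo/vae/diffusion_pytorch_model.safetensors \
      sigclip_vision_384/sigclip_vision_patch14_384.safetensors
```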

## Usage

To edit an image, run the `fpack_generate_video.py` script with the appropriate parameters. The examples below show how.
Make sure to use 512x512 as the output size (this is the resolution the LoRA was trained on).

* 1. Add a cat
- Input

![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/634dffc49b777beec3bc6448/RBfuHn0musmr9BBP5LLib.jpeg)

```bash
python fpack_generate_video.py \
    --dit FramePackI2V_HY/diffusion_pytorch_model-00001-of-00003.safetensors \
    --vae HunyuanVideo/vae/diffusion_pytorch_model.safetensors \
    --text_encoder1 HunyuanVideo_repackaged/split_files/text_encoders/llava_llama3_fp16.safetensors \
    --text_encoder2 HunyuanVideo_repackaged/split_files/text_encoders/clip_l.safetensors \
    --image_encoder sigclip_vision_384/sigclip_vision_patch14_384.safetensors \
    --image_path xiang_image.jpg \
    --prompt "add a cat into the picture" \
    --video_size 512 512 --fps 30 --infer_steps 25 \
    --attn_mode sdpa --fp8_scaled \
    --vae_chunk_size 32 --vae_spatial_tile_sample_min_size 128 \
    --save_path save --video_sections 1 --output_type latent_images --one_frame_inference zero_post \
    --seed 1234 --lora_multiplier 1.0 --lora_weight framepack_edit_output/framepack-edit-lora-000005.safetensors
```
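
The result is written under the directory given by `--save_path` (`save` here). A quick way to spot the newest output file after a run:

```bash
# List the save directory, newest files first
ls -lt save | head -n 5
```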

- Output

![image/png](https://cdn-uploads.huggingface.co/production/uploads/634dffc49b777beec3bc6448/IYHMeLY0Bo7ERBM1Kn9dv.png)

* 2. Change the background
- Input

![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/634dffc49b777beec3bc6448/bkr_zXRZlhbyvpD3ZpbaJ.jpeg)

```bash
python fpack_generate_video.py \
    --dit FramePackI2V_HY/diffusion_pytorch_model-00001-of-00003.safetensors \
    --vae HunyuanVideo/vae/diffusion_pytorch_model.safetensors \
    --text_encoder1 HunyuanVideo_repackaged/split_files/text_encoders/llava_llama3_fp16.safetensors \
    --text_encoder2 HunyuanVideo_repackaged/split_files/text_encoders/clip_l.safetensors \
    --image_encoder sigclip_vision_384/sigclip_vision_patch14_384.safetensors \
    --image_path wanye.jpg \
    --prompt "Change the background into a restaurant in anime style. Keep the character's eye colors and white hair unchanged." \
    --video_size 512 512 --fps 30 --infer_steps 25 \
    --attn_mode sdpa --fp8_scaled \
    --vae_chunk_size 32 --vae_spatial_tile_sample_min_size 128 \
    --save_path save --video_sections 1 --output_type latent_images --one_frame_inference zero_post \
    --seed 1234 --lora_multiplier 1.0 --lora_weight framepack_edit_output/framepack-edit-lora-000005.safetensors
```

- Output

![image/png](https://cdn-uploads.huggingface.co/production/uploads/634dffc49b777beec3bc6448/6OZxyxH32pEQsT6IwQOrR.png)

* 3. Place a train into a landscape
- Input

![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/634dffc49b777beec3bc6448/8_2JAEGqX_gfniKgHJ6hy.jpeg)

```bash
python fpack_generate_video.py \
    --dit FramePackI2V_HY/diffusion_pytorch_model-00001-of-00003.safetensors \
    --vae HunyuanVideo/vae/diffusion_pytorch_model.safetensors \
    --text_encoder1 HunyuanVideo_repackaged/split_files/text_encoders/llava_llama3_fp16.safetensors \
    --text_encoder2 HunyuanVideo_repackaged/split_files/text_encoders/clip_l.safetensors \
    --image_encoder sigclip_vision_384/sigclip_vision_patch14_384.safetensors \
    --image_path train.jpg \
    --prompt "place the train into a beautiful landscape" \
    --video_size 512 512 --fps 30 --infer_steps 25 \
    --attn_mode sdpa --fp8_scaled \
    --vae_chunk_size 32 --vae_spatial_tile_sample_min_size 128 \
    --save_path save --video_sections 1 --output_type latent_images --one_frame_inference zero_post \
    --seed 1234 --lora_multiplier 1.0 --lora_weight framepack_edit_output/framepack-edit-lora-000005.safetensors
```

- Output

![image/png](https://cdn-uploads.huggingface.co/production/uploads/634dffc49b777beec3bc6448/-8oAOtrZCLL0tReIJEKWP.png)
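
The three commands above differ only in the input image and the prompt, so a small wrapper loop avoids repeating the boilerplate. A minimal sketch, reusing the image/prompt pairs from the examples; substitute your own pairs as needed:

```bash
#!/usr/bin/env bash
# Run the same edit pipeline over several image|prompt pairs.
# Model paths and flags mirror the examples above.
set -euo pipefail

PAIRS=(
  "xiang_image.jpg|add a cat into the picture"
  "wanye.jpg|Change the background into a restaurant in anime style."
  "train.jpg|place the train into a beautiful landscape"
)

for pair in "${PAIRS[@]}"; do
  img="${pair%%|*}"      # text before the first '|'
  prompt="${pair#*|}"    # text after the first '|'
  python fpack_generate_video.py \
      --dit FramePackI2V_HY/diffusion_pytorch_model-00001-of-00003.safetensors \
      --vae HunyuanVideo/vae/diffusion_pytorch_model.safetensors \
      --text_encoder1 HunyuanVideo_repackaged/split_files/text_encoders/llava_llama3_fp16.safetensors \
      --text_encoder2 HunyuanVideo_repackaged/split_files/text_encoders/clip_l.safetensors \
      --image_encoder sigclip_vision_384/sigclip_vision_patch14_384.safetensors \
      --image_path "$img" \
      --prompt "$prompt" \
      --video_size 512 512 --fps 30 --infer_steps 25 \
      --attn_mode sdpa --fp8_scaled \
      --vae_chunk_size 32 --vae_spatial_tile_sample_min_size 128 \
      --save_path save --video_sections 1 --output_type latent_images --one_frame_inference zero_post \
      --seed 1234 --lora_multiplier 1.0 --lora_weight framepack_edit_output/framepack-edit-lora-000005.safetensors
done
```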