Update README.md
README.md:
---
language:
- en
tags:
- pytorch_model_hub_mixin
- animation
- video-frame-interpolation
- uncertainty-estimation
license: mit
pipeline_tag: image-to-image
---

# 🤖 Multi‑Input ResShift Diffusion VFI

<div align="left" style="display: flex; flex-direction: row; gap: 15px">
  <a href='https://arxiv.org/pdf/2504.05402'><img src='https://img.shields.io/badge/arXiv-2504.05402-b31b1b.svg'></a>
  <a href='https://github.com/VicFonch/Multi-Input-Resshift-Diffusion-VFI'><img src='https://img.shields.io/badge/Repo-Code-blue'></a>
  <a href='https://colab.research.google.com/drive/1MGYycbNMW6Mxu5MUqw_RW_xxiVeHK5Aa#scrollTo=EKaYCioiP3tQ'><img src='https://img.shields.io/badge/Colab-Demo-Green'></a>
</div>

## ⚙️ Setup

Start by cloning the source code from GitHub:

```bash
git clone https://github.com/VicFonch/Multi-Input-Resshift-Diffusion-VFI.git
```

Create a conda environment and install all the requirements:

```bash
conda create -n multi-input-resshift python=3.10
conda activate multi-input-resshift
pip install -r requirements.txt
```
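
Optionally, confirm that the environment sees your GPU before running inference:

```bash
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```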

**Note**: Make sure your system is compatible with **CUDA 12.4**. If not, install [CuPy](https://docs.cupy.dev/en/stable/install.html) according to your current CUDA version.
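
For example, prebuilt CuPy wheels are published per CUDA major version, so pick the one that matches your toolkit:

```bash
pip install cupy-cuda12x    # CUDA 12.x
# pip install cupy-cuda11x  # use this instead on CUDA 11.x systems
```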

## 🚀 Inference Example

The example below loads the pretrained weights from the Hub, synthesizes the frame halfway between two input frames (`tau = 0.5`), and plots all three.

```python
from PIL import Image
import matplotlib.pyplot as plt

from torchvision.transforms import Compose, ToTensor, Resize, Normalize
from utils.utils import denorm
from model.hub import MultiInputResShiftHub

# Load the pretrained weights from the Hub and freeze the model for inference.
model = MultiInputResShiftHub.from_pretrained("vfontech/Multiple-Input-Resshift-VFI")
model.requires_grad_(False).cuda().eval()

img0_path = "_data/example_images/frame1.png"
img2_path = "_data/example_images/frame3.png"

# Resize to the model's working resolution and normalize to [-1, 1].
transforms = Compose([
    Resize((256, 448)),
    ToTensor(),
    Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
])

img0 = transforms(Image.open(img0_path).convert("RGB")).unsqueeze(0).cuda()
img2 = transforms(Image.open(img2_path).convert("RGB")).unsqueeze(0).cuda()
tau = 0.5  # temporal position of the synthesized frame (0.5 = midpoint)

# Synthesize the intermediate frame between img0 and img2.
img1 = model.reverse_process([img0, img2], tau)

# Show input frame, interpolated frame, input frame side by side.
plt.figure(figsize=(10, 5))
for i, img in enumerate([img0, img1, img2], start=1):
    plt.subplot(1, 3, i)
    plt.imshow(denorm(img, mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]).squeeze().permute(1, 2, 0).cpu().numpy())
    plt.axis("off")
plt.show()
```
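
Since `tau` controls where in time the new frame is synthesized, the same input pair can yield several intermediate frames. A minimal sketch of a tau sweep, reusing `model`, `img0`, `img2`, and `denorm` from the example above and assuming `reverse_process` accepts any `tau` in (0, 1):

```python
from torchvision.utils import save_image

# Sweep tau to synthesize several frames between img0 and img2.
for tau in [0.25, 0.5, 0.75]:
    frame = model.reverse_process([img0, img2], tau)
    # denorm maps the output from [-1, 1] back to [0, 1] before saving.
    save_image(denorm(frame, mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]), f"frame_tau_{tau:.2f}.png")
```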