Update README.md
README.md
CHANGED
@@ -3,19 +3,84 @@ license: apache-2.0
 library_name: diffusers
 ---
 
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+# RoboTransfer: Geometry-Consistent Video Diffusion for Robotic Visual Policy Transfer
+
+<div align="center" class="authors">
+Liu Liu,
+Xiaofeng Wang,
+Guosheng Zhao,
+Keyu Li,
+Wenkang Qin,
+Jiaxiong Qiu,
+Zheng Zhu,
+Guan Huang,
+Zhizhong Su
+</div>
+
+<div align="center" style="line-height: 3;">
+<a href="https://github.com/horizonrobotics/robot_lab" target="_blank" style="margin: 2px;">
+<img alt="Code" src="https://img.shields.io/badge/Code-Github-blue" style="display: inline-block; vertical-align: middle;"/>
+</a>
+<a href="https://horizonrobotics.github.io/robot_lab/robotransfer" target="_blank" style="margin: 2px;">
+<img alt="Project Page" src="https://img.shields.io/badge/Project_Page-blue" style="display: inline-block; vertical-align: middle;"/>
+</a>
+<a href="https://arxiv.org/abs/2505.23171" target="_blank" style="margin: 2px;">
+<img alt="arXiv" src="https://img.shields.io/badge/arXiv-b31b1b" style="display: inline-block; vertical-align: middle;"/>
+</a>
+<a href="https://youtu.be/dGXKtqDnm5Q" target="_blank" style="margin: 2px;">
+<img alt="Video" src="https://img.shields.io/badge/🎥-Video-red" style="display: inline-block; vertical-align: middle;"/>
+</a>
+<a href="https://mp.weixin.qq.com/s/c9-1HPBMHIy4oEwyKnsT7Q" target="_blank" style="margin: 2px;">
+<img alt="Chinese Introduction" src="https://img.shields.io/badge/中文介绍-07C160?logo=wechat&logoColor=white" style="display: inline-block; vertical-align: middle;"/>
+</a>
+</div>
+
+<div align="center">
+<img src="assets/pin/robotransfer.png" width="90%" alt="RoboTransfer Overview"/>
+<p style="font-size:0.8em; color:#555;">The RoboTransfer framework integrates multi-view geometry and video diffusion, enabling controllable and geometry-consistent robotic video synthesis for policy transfer.</p>
+</div>
+
+---
+
+## Abstract
+
+**RoboTransfer** is a novel diffusion-based video generation framework tailored for robotic visual policy transfer. Unlike conventional approaches, RoboTransfer introduces **geometry-aware synthesis** by injecting **depth and normal priors**, ensuring multi-view consistency across dynamic robotic scenes. The method further supports **explicit control over scene components**, such as **background editing**, **object identity swapping**, and **motion specification**, offering a fine-grained video generation pipeline that benefits embodied learning.
+
+---
+
+## Key Features
+
+- **Geometry-Consistent Diffusion**: Injects global 3D cues (depth, normal) and cross-view interactions for multi-view realism.
+- 🧩 **Scene Component Control**: Enables manipulation of object attributes (pose, identity) and background features.
+- **Cross-View Conditioning**: Learns representations from multiple camera views with spatial correspondence.
+- 🤖 **Robotic Policy Transfer**: Facilitates domain adaptation by generating synthetic training data in target domains.
+
+---
+
+## 📦 Resources
+
+- **[Paper (arXiv)](https://arxiv.org/abs/2505.23171)**
+- **[Project Page](https://horizonrobotics.github.io/robot_lab/robotransfer)**
+- **[🎥 Video Demo](https://youtu.be/dGXKtqDnm5Q)**
+- **[💻 GitHub Code (Coming Soon)](https://github.com/horizonrobotics/robot_lab)**
+- **[Chinese Introduction (中文介绍)](https://mp.weixin.qq.com/s/c9-1HPBMHIy4oEwyKnsT7Q)**
+
+---
+
+## 📸 Framework Overview
+
+
+
+> The overall architecture includes view-specific encoding, geometry injection, diffusion denoising with spatial constraints, and component-level editing modules. Our system enables compositional control over scene dynamics while preserving physical and geometric consistency.
+
+---
+
+## BibTeX
+
+```bibtex
+@article{liu2025robotransfer,
+title={RoboTransfer: Geometry-Consistent Video Diffusion for Robotic Visual Policy Transfer},
+author={Liu, Liu and Wang, Xiaofeng and Zhao, Guosheng and Li, Keyu and Qin, Wenkang and Qiu, Jiaxiong and Zhu, Zheng and Huang, Guan and Su, Zhizhong},
+journal={arXiv preprint arXiv:2505.23171},
+year={2025}
 }
-```
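
The abstract added in this diff says RoboTransfer injects depth and normal priors, but the README does not show what such inputs look like. As a rough illustration only, and not the authors' pipeline, the NumPy sketch below derives per-pixel surface normals from a depth map by finite differences and stacks depth and normals into a single conditioning array. The function names `normals_from_depth` and `stack_geometry_condition`, the camera-facing +z convention, and the unit pixel spacing are all assumptions made for this example.

```python
import numpy as np


def normals_from_depth(depth):
    """Estimate unit surface normals from an (H, W) depth map.

    Assumption: unit pixel spacing and a normal that points toward the
    camera (+z). This is a generic finite-difference estimate, not the
    method used by RoboTransfer.
    """
    depth = np.asarray(depth, dtype=np.float64)
    # np.gradient returns derivatives along axis 0 (rows, y) then axis 1 (columns, x).
    dz_dy, dz_dx = np.gradient(depth)
    # For a surface z = f(x, y), an (unnormalized) normal is (-df/dx, -df/dy, 1).
    normals = np.stack([-dz_dx, -dz_dy, np.ones_like(depth)], axis=-1)
    # The z component is always 1, so the norm is never zero.
    normals /= np.linalg.norm(normals, axis=-1, keepdims=True)
    return normals


def stack_geometry_condition(depth):
    """Concatenate depth (1 channel) and normals (3 channels) into (H, W, 4)."""
    depth = np.asarray(depth, dtype=np.float64)
    normals = normals_from_depth(depth)
    return np.concatenate([depth[..., None], normals], axis=-1)


if __name__ == "__main__":
    # A depth ramp that increases by 1 per pixel along x.
    ramp = np.tile(np.arange(5.0), (4, 1))
    cond = stack_geometry_condition(ramp)
    print(cond.shape)  # (4, 5, 4)
```

A channel-stacked array like this is one common way to hand geometric cues to a conditional generator. How RoboTransfer actually encodes and injects them is described in the paper, not in this README.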