Diffusers · Safetensors · RoboTransferPipeline
Commit ef6a2e5 (verified) · nemo04 committed · 1 parent: 4f510f3

Update README.md

Files changed (1): README.md (+3 -10)
README.md CHANGED
@@ -36,12 +36,14 @@ library_name: diffusers
 </div>
 
 <div align="center">
-<img src="assets/pin.png" width="90%" alt="RoboTransfer"/></div>
+<img src="assets/pin.jpeg" width="50%" alt="RoboTransfer"/></div>
 
 ---
 
 ## 🔍 Abstract
 
+![RoboTransfer Pipeline](assets/robotransfer_pipeline.jpeg)
+
 **RoboTransfer** is a novel diffusion-based video generation framework tailored for robotic visual policy transfer. Unlike conventional approaches, RoboTransfer introduces **geometry-aware synthesis** by injecting **depth and normal priors**, ensuring multi-view consistency across dynamic robotic scenes. The method further supports **explicit control over scene components**, such as **background editing**, **object identity swapping**, and **motion specification**, offering a fine-grained video generation pipeline that benefits embodied learning.
 
 ---
@@ -55,15 +57,6 @@ library_name: diffusers
 
 ---
 
-
-## 📸 Framework Overview
-
-![RoboTransfer Pipeline](assets/robotransfer_pipeline.png)
-
-> The overall architecture includes view-specific encoding, geometry injection, diffusion denoising with spatial constraints, and component-level editing modules. Our system enables compositional control over scene dynamics while preserving physical and geometric consistency.
-
----
-
 ## 📖 BibTeX
 
 ```bibtex