Enhanced OoD Detection through Cross-Modal Alignment of Multi-Modal Representations • Paper • 2503.18817 • Published Mar 24 • 4
HumanDreamer-X: Photorealistic Single-image Human Avatars Reconstruction via Gaussian Restoration • Paper • 2504.03536 • Published Apr 4 • 13
TransMamba: Flexibly Switching between Transformer and Mamba • Paper • 2503.24067 • Published Mar 31 • 21
BOP Challenge 2024 on Model-Based and Model-Free 6D Object Pose Estimation • Paper • 2504.02812 • Published Apr 3 • 5
RobustDexGrasp: Robust Dexterous Grasping of General Objects from Single-view Perception • Paper • 2504.05287 • Published Apr 7 • 6
MonoPlace3D: Learning 3D-Aware Object Placement for 3D Monocular Detection • Paper • 2504.06801 • Published Apr 9 • 5
Geo4D: Leveraging Video Generators for Geometric 4D Scene Reconstruction • Paper • 2504.07961 • Published Apr 10 • 6
In-2-4D: Inbetweening from Two Single-View Images to 4D Generation • Paper • 2504.08366 • Published Apr 11 • 11
PVUW 2025 Challenge Report: Advances in Pixel-level Understanding of Complex Videos in the Wild • Paper • 2504.11326 • Published Apr 15 • 6
openai/whisper-large-v2 • Automatic Speech Recognition • 2B • Updated Feb 29, 2024 • 207k • 1.75k