N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models Paper • 2512.16561 • Published 7 days ago • 19
Mask Transfiner for High-Quality Instance Segmentation Paper • 2111.13673 • Published Nov 26, 2021
Cascade-DETR: Delving into High-Quality Universal Object Detection Paper • 2307.11035 • Published Jul 20, 2023
Gaussian Grouping: Segment and Edit Anything in 3D Scenes Paper • 2312.00732 • Published Dec 1, 2023 • 3
DreamScene4D: Dynamic Multi-Object Scene Generation from Monocular Videos Paper • 2405.02280 • Published May 3, 2024
SLAck: Semantic, Location, and Appearance Aware Open-Vocabulary Tracking Paper • 2409.11235 • Published Sep 17, 2024
RobotArena ∞: Scalable Robot Benchmarking via Real-to-Sim Translation Paper • 2510.23571 • Published Oct 27 • 8
MotionEdit: Benchmarking and Learning Motion-Centric Image Editing Paper • 2512.10284 • Published 14 days ago • 25
TAPIP3D: Tracking Any Point in Persistent 3D Geometry Paper • 2504.14717 • Published Apr 20 • 8