UniVideo: Unified Understanding, Generation, and Editing for Videos Paper • 2510.08377 • Published Oct 9, 2025 • 71
EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing Paper • 2509.26346 • Published Sep 30, 2025 • 18
POINTS-Reader: Distillation-Free Adaptation of Vision-Language Models for Document Conversion Paper • 2509.01215 • Published Sep 1, 2025 • 50
CLiFT: Compressive Light-Field Tokens for Compute-Efficient and Adaptive Neural Rendering Paper • 2507.08776 • Published Jul 11, 2025 • 54
CLiFT: Compressive Light-Field Tokens for Compute-Efficient and Adaptive Neural Rendering Paper • 2507.08776 • Published Jul 11, 2025 • 54
facebook/vjepa2-vitl-fpc64-256 Video Classification • 0.3B • Updated Aug 11, 2025 • 81.6k • 169
zai-org/GLM-4.1V-9B-Thinking Image-Text-to-Text • 10B • Updated Oct 25, 2025 • 178k • • 762
BlenderFusion: 3D-Grounded Visual Editing and Generative Compositing Paper • 2506.17450 • Published Jun 20, 2025 • 64
BlenderFusion: 3D-Grounded Visual Editing and Generative Compositing Paper • 2506.17450 • Published Jun 20, 2025 • 64
BlenderFusion: 3D-Grounded Visual Editing and Generative Compositing Paper • 2506.17450 • Published Jun 20, 2025 • 64 • 1
Ego-R1: Chain-of-Tool-Thought for Ultra-Long Egocentric Video Reasoning Paper • 2506.13654 • Published Jun 16, 2025 • 43
Pixel Reasoner: Incentivizing Pixel-Space Reasoning with Curiosity-Driven Reinforcement Learning Paper • 2505.15966 • Published May 21, 2025 • 53