CASA: Cross-Attention via Self-Attention for Efficient Vision-Language Fusion Paper • 2512.19535 • Published 5 days ago • 8
TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times Paper • 2512.16093 • Published 9 days ago • 64
The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding Paper • 2512.19693 • Published 4 days ago • 60
DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI Paper • 2512.16676 • Published 9 days ago • 188
Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows Paper • 2512.16969 • Published 9 days ago • 105
HyperVL: An Efficient and Dynamic Multimodal Large Language Model for Edge Devices Paper • 2512.14052 • Published 11 days ago • 39
Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation Paper • 2512.10949 • Published 15 days ago • 45
Towards Scalable Pre-training of Visual Tokenizers for Generation Paper • 2512.13687 • Published 11 days ago • 95
ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding Paper • 2512.13586 • Published 11 days ago • 87
StereoWorld: Geometry-Aware Monocular-to-Stereo Video Generation Paper • 2512.09363 • Published 17 days ago • 70
PaperDebugger: A Plugin-Based Multi-Agent System for In-Editor Academic Writing, Review, and Editing Paper • 2512.02589 • Published 25 days ago • 63
ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration Paper • 2511.21689 • Published about 1 month ago • 109
Harmony: Harmonizing Audio and Video Generation through Cross-Task Synergy Paper • 2511.21579 • Published about 1 month ago • 23
ENACT: Evaluating Embodied Cognition with World Modeling of Egocentric Interaction Paper • 2511.20937 • Published Nov 26 • 15
MIRA: Multimodal Iterative Reasoning Agent for Image Editing Paper • 2511.21087 • Published about 1 month ago • 10
Infinity-RoPE: Action-Controllable Infinite Video Generation Emerges From Autoregressive Self-Rollout Paper • 2511.20649 • Published Nov 25 • 45
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published 25 days ago • 234