MiCo: Multi-image Contrast for Reinforcement Visual Reasoning Paper • 2506.22434 • Published Jun 27 • 10
Squeeze3D: Your 3D Generation Model is Secretly an Extreme Neural Compressor Paper • 2506.07932 • Published Jun 9 • 12
OmniSVG: A Unified Scalable Vector Graphics Generation Model Paper • 2504.06263 • Published Apr 8 • 180
Can Vision-Language Models Answer Face to Face Questions in the Real-World? Paper • 2503.19356 • Published Mar 25 • 2
FFN Fusion: Rethinking Sequential Computation in Large Language Models Paper • 2503.18908 • Published Mar 24 • 20
LLaVA-o1: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published Nov 15, 2024 • 128
Implicit Neural Representations with Fourier Kolmogorov-Arnold Networks Paper • 2409.09323 • Published Sep 14, 2024 • 5
Article SmolLM - blazingly fast and remarkably powerful By loubnabnl and 2 others • Jul 16, 2024 • 407
EvTexture: Event-driven Texture Enhancement for Video Super-Resolution Paper • 2406.13457 • Published Jun 19, 2024 • 17
Chameleon: Mixed-Modal Early-Fusion Foundation Models Paper • 2405.09818 • Published May 16, 2024 • 131
Article SVGDreamer: Text Guided Vector Graphics Generation with Diffusion Model By xingxm • Apr 19, 2024 • 12
SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound Paper • 2406.06612 • Published Jun 6, 2024 • 16