Article Kimi-VL-A3B-Thinking-2506: A Quick Navigation By moonshotai and 1 other • Jun 21 • 66
EmbodiedEval: Evaluate Multimodal LLMs as Embodied Agents Paper • 2501.11858 • Published Jan 21 • 7
Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity Visual Descriptions Paper • 2412.08737 • Published Dec 11, 2024 • 55
ACDiT: Interpolating Autoregressive Conditional Modeling and Diffusion Transformer Paper • 2412.07720 • Published Dec 10, 2024 • 32
NVILA: Efficient Frontier Visual Language Models Paper • 2412.04468 • Published Dec 5, 2024 • 60
Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence Paper • 2407.07061 • Published Jul 9, 2024 • 28
Beyond the Turn-Based Game: Enabling Real-Time Conversations with Duplex Models Paper • 2406.15718 • Published Jun 22, 2024 • 14
Large Multilingual Models Pivot Zero-Shot Multimodal Learning across Languages Paper • 2308.12038 • Published Aug 23, 2023 • 2