✨ The multimodal wave 🌊
- GLM-4.1V-Thinking: Image+Text > Text
- Intern-S1: Image+Text > Text
- Wan 2.2: Text+Image > Video
- Skywork-R1V3: Image+Text > Text
- Skywork-UniPic: Text > Image / Image > Text
- Tar-7B: Any-to-Any
- Ming-Lite-Omni-1.5: Any-to-Any
- Step3: Image+Text > Text
- HunyuanWorld-1: Image > 3D
- ThinkSound: Video > Audio
- Neta-Lumina: Text > Image
✨ Big month not only for models, but for policy too 🏛️
- Announced the Global Action Plan for AI Governance
- Proposed setting up a World AI Cooperation Organization in Shanghai
- Released the International AI Open Source Collaboration Initiative
- Published Risk Assessment Guidelines for Endpoint AI Agents
✨ Big event - WAIC
- 355K offline visitors
- 108 new releases in 4 days
- 145 sessions across key domains
I’ve been tracking things closely, but July’s open-source wave still blew me away. Can’t wait to see what’s coming next! 🚀
✨ 321B total / 32B active params - Apache 2.0
✨ MFA + AFD: cutting decoding cost by up to 70% vs. DeepSeek-V3
✨ 4T image-text pretraining: strong vision–language grounding
✨ Modular, efficient, deployable: runs on just 8×48GB GPUs
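The "321B total params on just 8×48GB GPUs" claim is easy to sanity-check with back-of-the-envelope arithmetic. A minimal sketch, assuming 1 byte per parameter (FP8 weights; the serving precision is my assumption, not stated above):

```python
def weight_memory_gb(n_params_billion: float, bytes_per_param: float) -> float:
    """Approximate GPU memory needed just to hold the model weights.
    n_params_billion * 1e9 params * bytes_per_param / 1e9 bytes-per-GB."""
    return n_params_billion * bytes_per_param

# Assumed FP8 serving: 1 byte per parameter.
total_weights_gb = weight_memory_gb(321, 1.0)   # ~321 GB for all weights
cluster_vram_gb = 8 * 48                        # 8 x 48GB GPUs = 384 GB total

print(total_weights_gb)                  # 321.0
print(total_weights_gb <= cluster_vram_gb)  # True
```

At FP8 the full 321B weights (~321 GB) fit inside the 384 GB of aggregate VRAM, with ~63 GB left over for KV cache and activations; at BF16 (2 bytes/param) they would not, which is why low-precision serving matters for a claim like this. Note that only the 32B active parameters are exercised per token, which is what drives the decoding-cost savings, not the memory footprint.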