What Open-Source Developers Need to Know about the EU AI Act's Rules for GPAI Models By yjernite and 5 others • 5 days ago • 21
Introducing Command A Vision: Multimodal AI built for Business By CohereLabs and 3 others • 9 days ago • 61
LLM agent experiment with a purpose-built RPG and tool calls. (Work in progress) By neph1 • 3 days ago • 6
Why We Built the OpenMDW License: A Comprehensive License for ML Models By linuxfoundation • Jul 2 • 23
makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch By AviSoori1x • May 7, 2024 • 93
DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge By NormalUhr • Feb 7 • 199
Unsupervised Model Improvement via Internal Coherence Maximization: Outperforming Human-Supervised Methods Through Self-Elicitation By codelion • 6 days ago • 4
Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment By NormalUhr • Feb 11 • 58