- Bigger isn't always better: how to choose the most efficient model for context-specific tasks 🌱🧑🏼‍💻 — by sasha • 1 day ago • 12
- OpenEvolve: An Open Source Implementation of Google DeepMind's AlphaEvolve — by codelion • 9 days ago • 17
- Fine-Tuning Your First Large Language Model (LLM) with PyTorch and Hugging Face — by dvgodoy • Feb 11 • 37
- Mitigating False Negatives in Multiple Negatives Ranking Loss for Retriever Training — by dragonkue • 4 days ago • 6
- AgenticSeek: Running Manus AI Locally with Deepseek & Qwen (Open Source Tool) — by lynn-mikami • 5 days ago • 5
- Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment — by NormalUhr • Feb 11 • 40
- DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge — by NormalUhr • Feb 7 • 142
- Manus AI: The Best Autonomous AI Agent Redefining Automation and Productivity — by LLMhacker • Mar 6 • 171