DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
[Deep Dive Coming Soon]
Release Date: January 20, 2025
DeepSeek-R1 strengthens reasoning capability through large-scale reinforcement learning, achieving performance comparable to OpenAI's o1 on reasoning benchmarks.
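The RL recipe described in the paper relies on verifiable, rule-based rewards rather than a learned reward model. A minimal sketch of that kind of reward function follows; the accuracy/format split mirrors the paper's description, but the exact checks and the equal weighting here are illustrative assumptions:

```python
import re

def format_reward(completion: str) -> float:
    """1.0 if the completion wraps its reasoning in <think>...</think> and then
    gives a final answer; checks structure only, not correctness."""
    pattern = r"^<think>.+?</think>\s*\S+"
    return 1.0 if re.match(pattern, completion, flags=re.DOTALL) else 0.0

def accuracy_reward(completion: str, reference: str) -> float:
    """1.0 if the text after the closing </think> tag matches the reference answer."""
    answer = completion.split("</think>")[-1].strip()
    return 1.0 if answer == reference else 0.0

def total_reward(completion: str, reference: str) -> float:
    # Equal weighting is an illustrative choice, not DeepSeek's.
    return accuracy_reward(completion, reference) + format_reward(completion)

print(total_reward("<think>2 + 2 = 4</think> 4", "4"))  # 2.0
```

Because correctness of math and code answers can be checked programmatically, this kind of reward scales cheaply and, per the paper, avoids the reward-hacking risks of a neural reward model.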
DeepSeek-V3 Technical Report
[Deep Dive Coming Soon]
Release Date: December 2024
This report covers scaling a sparse Mixture-of-Experts network to 671 billion total parameters (37 billion activated per token), trained with FP8 mixed precision and HPC co-design strategies.
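The report's FP8 mixed-precision framework is co-designed with the training cluster and is not reproduced here; the sketch below only illustrates the general mixed-precision pattern (low-precision matrix multiplies with full-precision parameters and optimizer state) using standard PyTorch BF16 autocast, a generic stand-in rather than DeepSeek's implementation:

```python
import torch
from torch import nn

# Generic mixed-precision step: matmuls run in BF16 under autocast while the
# parameters and optimizer state stay in FP32. DeepSeek-V3's framework goes
# further with fine-grained FP8 quantization, which this sketch does not model.
model = nn.Sequential(nn.Linear(1024, 4096), nn.GELU(), nn.Linear(4096, 1024))
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

x, target = torch.randn(8, 1024), torch.randn(8, 1024)

with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    pred = model(x)                                  # linear layers run in BF16

loss = nn.functional.mse_loss(pred.float(), target)  # loss computed in FP32
loss.backward()                                      # gradients flow back into FP32 weights
optimizer.step()
optimizer.zero_grad()
```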
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
[Deep Dive Coming Soon]
Release Date: May 2024
This paper pairs Multi-head Latent Attention (MLA) with the DeepSeekMoE architecture, improving performance while cutting training costs by 42.5% relative to DeepSeek 67B.
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
[Deep Dive Coming Soon]
Release Date: February 2024
This paper presents methods to improve mathematical reasoning in LLMs, introducing the Group
Relative Policy Optimization (GRPO) algorithm.
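GRPO removes PPO's learned value network: for each prompt it samples a group of completions, scores them, and uses the group's mean and standard deviation as the baseline. A minimal sketch of that advantage computation (function name, tensor shapes, and the epsilon are illustrative assumptions):

```python
import torch

def grpo_advantages(rewards: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Group-relative advantages.

    rewards: (num_prompts, group_size) scores for the completions sampled per
             prompt; each completion's advantage is its reward normalized by
             the mean and std of its own group (no value function needed).
    """
    mean = rewards.mean(dim=-1, keepdim=True)
    std = rewards.std(dim=-1, keepdim=True)
    return (rewards - mean) / (std + eps)

# 2 prompts, 4 sampled completions each, rule-based rewards in {0, 1}
rewards = torch.tensor([[1.0, 0.0, 0.0, 1.0],
                        [0.0, 0.0, 1.0, 0.0]])
print(grpo_advantages(rewards))
```

The full objective then feeds these advantages into a PPO-style clipped importance-ratio loss with a KL penalty toward a reference policy.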
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
[Deep Dive Coming Soon]
Release Date: November 29, 2023
This foundational paper explores scaling laws and the trade-offs between data and model size,
establishing the groundwork for subsequent models.
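As a rough illustration of the compute-allocation question the paper studies, the sketch below splits a fixed FLOP budget between model size and training tokens using the common C ≈ 6ND approximation with a placeholder exponent; the paper fits its own scaling laws (and measures model scale in non-embedding FLOPs per token rather than raw parameter count), so nothing here reflects DeepSeek's fitted values:

```python
def optimal_allocation(compute_budget: float, a: float = 0.5) -> tuple[float, float]:
    """Split a FLOP budget C between parameters N and tokens D assuming
    C ≈ 6 * N * D and a compute-optimal relation N_opt ∝ C**a.
    The exponent a = 0.5 is a placeholder, not a fitted value."""
    n_opt = (compute_budget / 6.0) ** a
    d_opt = compute_budget / (6.0 * n_opt)
    return n_opt, d_opt

n, d = optimal_allocation(1e23)
print(f"params ~ {n:.2e}, tokens ~ {d:.2e}")
```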
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data
[Deep Dive Coming Soon]
This paper enhances theorem-proving capability in language models by generating large-scale synthetic Lean 4 proof data for training.
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
[Deep Dive Coming Soon]
This paper details an open-source Mixture-of-Experts code model, further pretrained from DeepSeek-V2 on additional code and math data, that narrows the gap to closed-source models on coding and math benchmarks.
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
[Deep Dive Coming Soon]
This paper introduces the DeepSeekMoE architecture, which combines fine-grained expert segmentation with shared experts to improve expert specialization over conventional MoE designs.
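A toy forward pass showing the two ideas together, with top-k routing over routed experts plus an always-on shared expert; the dimensions, gating details, and dense per-expert compute below are simplifications, not the paper's implementation:

```python
import torch
from torch import nn

class TinyMoELayer(nn.Module):
    """Toy MoE layer: every token passes through a shared expert plus its
    top-k routed experts. Experts are single Linear layers purely for brevity;
    this is a sketch of the idea, not the paper's architecture."""

    def __init__(self, dim: int = 64, n_routed: int = 8, n_shared: int = 1, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(dim, n_routed, bias=False)
        self.routed = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_routed))
        self.shared = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_shared))

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (tokens, dim)
        out = sum(expert(x) for expert in self.shared)    # shared experts see every token
        scores = self.gate(x).softmax(dim=-1)             # routing probabilities
        weights, idx = scores.topk(self.top_k, dim=-1)    # keep only the top-k experts
        mask = torch.zeros_like(scores).scatter(-1, idx, weights)
        for i, expert in enumerate(self.routed):
            # Dense compute for clarity; a real MoE dispatches only routed tokens.
            out = out + mask[:, i:i + 1] * expert(x)
        return out

layer = TinyMoELayer()
print(layer(torch.randn(4, 64)).shape)  # torch.Size([4, 64])
```

Shared experts capture common knowledge so the routed experts can specialize, which is the paper's argument for better expert specialization at the same activated-parameter budget.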