cedric (Cedric Chee)

upvoted a collection 2 months ago

DeepSeek-R1

Collection

10 items • Updated May 29 • 773

upvoted 3 papers 3 months ago

Scaling Reasoning can Improve Factuality in Large Language Models

Paper • 2505.11140 • Published May 16 • 7

Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models

Paper • 2505.10554 • Published May 15 • 120

Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures

Paper • 2505.09343 • Published May 14 • 68

upvoted 2 collections 3 months ago

Gemma 3 Release

Collection

28 items • Updated about 19 hours ago • 430

Qwen3

Collection

84 items • Updated 6 days ago • 1.06k

upvoted 3 collections 4 months ago

GLM-4-0414

Collection

Kimi-VL-A3B

Collection

OLMo 2

Collection

upvoted a paper 4 months ago

Hermes 3 Technical Report

Paper • 2408.11857 • Published Aug 15, 2024 • 53

upvoted 3 collections 4 months ago

Llama Nemotron

Collection

Open, Production-ready Enterprise Models • 11 items • Updated 11 days ago • 63

InternVL3

Collection

34 items • Updated Apr 20 • 79

Gemma 3 QAT

Collection

Quantization-Aware Trained (QAT) Gemma 3 checkpoints. The models preserve quality similar to half precision while using 3x less memory • 15 items • Updated Jul 10 • 208

upvoted a collection 11 months ago

Llama 3.2

Collection

This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 628

upvoted a paper 11 months ago

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19, 2024 • 141

upvoted 2 collections 12 months ago

Hermes 3

Collection

The Hermes 3 Series of Models • 11 items • Updated 20 days ago • 126

Minitron

Collection

A family of compressed models obtained via pruning and knowledge distillation • 12 items • Updated 21 days ago • 61

upvoted a paper 12 months ago

Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers

Paper • 2408.06195 • Published Aug 12, 2024 • 74

upvoted a collection about 1 year ago

Gemma 2 2B Release

Collection

The 2.6B parameter version of Gemma 2. • 6 items • Updated Jul 10 • 81

upvoted an article about 1 year ago

Llama 3.1 - 405B, 70B & 8B with multilinguality and long context

Article • Jul 23, 2024 • 237
