Rethinking OPD - a lllyx Collection

updated 5 days ago

This collection includes the models used in the paper "Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe".

lllyx/Qwen3-1.7B-SFT

Text Generation • 2B • Updated 5 days ago • 642 • 3
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published Apr 14 • 95
lllyx/Qwen3-4B-Base-GRPO

Text Generation • 4B • Updated 14 days ago • 182 • 2
lllyx/OpenThought3-Qwen3-4B

Viewer • Updated 5 days ago • 305k • 89 • 2
