Models: 41

Active filters: Reasoning-Course

nharshavardhana/SmolGRPO-135M

Text Generation • 0.1B • Updated Mar 4 • 2

Lingyue1/SmolGRPO-135M

Text Generation • 0.1B • Updated Mar 5 • 4

t2190/SmolGRPO-135M

Text Generation • 0.1B • Updated Mar 6 • 6

t2190/GRPO_1

Text Generation • 0.5B • Updated Mar 12 • 3

kaweizhenpi/SmolGRPO-135M

Text Generation • 0.1B • Updated Mar 7 • 3

Shumatsurontek/SmolGRPO-135M

Text Generation • 0.1B • Updated Mar 9 • 4

skyimple/SmolGRPO-135M

Text Generation • 0.1B • Updated Mar 12 • 2 • 1

abdulsamad/SmolGRPO-135M

Text Generation • 0.1B • Updated Apr 6 • 3

tobrun/SmolLM2-135M-GRPO

Text Generation • 0.1B • Updated Mar 15 • 4

TharunSivamani/SmolGRPO-135M

Text Generation • 0.1B • Updated Mar 16 • 5

frascuchon/SmolGRPO-135M

Text Generation • 0.1B • Updated Mar 17 • 2

bhaveshgoel07/SmolGRPO-135M

Arushhh/SmolGRPO-135M

Text Generation • 0.1B • Updated Mar 24 • 4

czuo03/SmolGRPO-135M

Text Generation • 0.1B • Updated Mar 28 • 3

opria123/SmolGRPO-135M

Text Generation • 0.1B • Updated Apr 6 • 3

alonsosilva/SmolGRPO-135M

Text Generation • 0.1B • Updated Apr 8 • 3

alfredcs/gemma-3-12b-grpo-firstaid

garethpaul/SmolGRPO-135M

Text Generation • 0.1B • Updated May 8 • 3

Thabet/SmolGRPO-135M-learning

Text Generation • 0.1B • Updated May 10 • 3

jcollado/SmolGRPO-135M

Text Generation • 0.1B • Updated May 14 • 3

Brianpuz/SmolGRPO-135M

Text Generation • 0.1B • Updated May 19 • 5

yigitkucuk/tint-interact-sft-grpo

Text Generation • 0.4B • Updated May 19 • 3

koochikoo25/SmolGRPO-135M

Text Generation • 0.1B • Updated May 20 • 3

jackle33/SmolGRPO-135M

Text Generation • 0.1B • Updated May 22 • 5

alfredcs/torchrun-gemma-3-12b-grpo-icd10pcs-merged

Text Generation • 8B • Updated Jun 4 • 15

alfredcs/gemma-3-27b-grpo-med-merged

Image-Text-to-Text • Updated Jun 16 • 8

alfredcs/gemma-3-27b-firstaid-icd10-merged

Image-Text-to-Text • Updated Jun 19 • 4

mradermacher/gemma-3-27b-firstaid-icd10-merged-GGUF

28B • Updated 15 days ago • 93

jinlovespho/SmolGRPO-135M

Text Generation • 0.1B • Updated Jun 25 • 4

tariktuna/SmolGRPO-135M

Text Generation • 0.1B • Updated Jul 1 • 3