Models: 1,150

Active filters: reinforcement-learning, transformers

MattBou00/llama-3-2-1b-detox_v1f_round4-checkpoint-epoch-20

Reinforcement Learning • 1B • Updated 5 days ago • 6

MattBou00/llama-3-2-1b-detox_v1f_round4-checkpoint-epoch-40

Reinforcement Learning • 1B • Updated 5 days ago • 6

MattBou00/llama-3-2-1b-detox_v1f_round4-checkpoint-epoch-60

Reinforcement Learning • 1B • Updated 5 days ago • 7

MattBou00/llama-3-2-1b-detox_v1f_round4-checkpoint-epoch-80

Reinforcement Learning • 1B • Updated 5 days ago • 6

MattBou00/llama-3-2-1b-detox_v1f_round4-checkpoint-epoch-100

Reinforcement Learning • 1B • Updated 5 days ago • 7

MattBou00/llama-3-2-1b-detox_v1f_round4

Reinforcement Learning • 1B • Updated 5 days ago • 6

MattBou00/llama-3-2-1b-detox_retry-checkpoint-epoch-20

Reinforcement Learning • 1B • Updated 2 days ago • 8

mradermacher/VeriReason-codeLlama-7b-RTLCoder-Verilog-GRPO-reasoning-tb-GGUF

Reinforcement Learning • 7B • Updated about 22 hours ago • 23

mradermacher/SLM-SQL-Base-1.3B-GGUF

Reinforcement Learning • 1B • Updated about 7 hours ago

mradermacher/SLM-SQL-Base-1B-GGUF

Reinforcement Learning • 1B • Updated about 5 hours ago