Open-Reasoner-Zero/Open-Reasoner-Zero-Critic-1.5B • Reinforcement Learning • 2B • Updated Apr 6 • 4 • 1
Open-Reasoner-Zero/Open-Reasoner-Zero-Critic-32B • Reinforcement Learning • 32B • Updated Apr 7 • 5 • 5
NousResearch/DeepHermes-Egregore-v1-RLAIF-8b-Atropos • Reinforcement Learning • 8B • Updated Apr 29 • 29 • 2
NousResearch/DeepHermes-Egregore-v2-RLAIF-8b-Atropos • Reinforcement Learning • 8B • Updated Apr 29 • 26 • 5
NousResearch/DeepHermes-AscensionMaze-RLAIF-8b-Atropos • Reinforcement Learning • 8B • Updated Apr 29 • 30 • 6
NousResearch/DeepHermes-ToolCalling-Specialist-Atropos • Reinforcement Learning • 8B • Updated Apr 28 • 580 • 14
malifnasrulloh/PPO-IndoNanoT5-base-Liputan6-Canonical • Reinforcement Learning • 0.2B • Updated Apr 15 • 2
NousResearch/DeepHermes-Egregore-v2-RLAIF-8b-Atropos-GGUF • Reinforcement Learning • 8B • Updated May 5 • 54 • 2
NousResearch/DeepHermes-Egregore-v1-RLAIF-8b-Atropos-GGUF • Reinforcement Learning • 8B • Updated May 5 • 36 • 3