Create README.md

Files changed (1) hide show

README.md ADDED Viewed

+---
+license: apache-2.0
+language:
+- en
+- zh
+base_model:
+- Qwen/Qwen2.5-7B-Instruct
+pipeline_tag: text-generation
+tags:
+- reasoning
+- reinforcement
+- learning
+- RLT
+- math
+- science
+- code
+---
+# Reinforcement-Learned Teacher Student 7B
+This repository contains a 7B parameter student model trained using the **Reinforcement-Learned Teachers (RLT)** pipeline introduced in our paper [Reinforcement Learning Teachers](https://arxiv.org/abs/2506.08388).
+## Model Description
+The 7B RLT student is distilled from a 7B Reinforcement-Learned Teacher, which has been explicitly trained to produce high-quality reasoning traces optimized for student distillation.
+## Usage
+The model was trained with supervised fine-tuning using the same hyperparameters, the system prompt, and the reasoning tags from [Li et al. 2025](https://arxiv.org/pdf/2502.07374).
+Evaluation was conducted using the [SkyThought](https://github.com/NovaSky-AI/SkyThought) library at commit `4bb8f3e`. Please refer to our [repository](https://github.com/SakanaAI/RLT) and [paper](https://arxiv.org/abs/2506.08388) for details and results.