metadata
license: mit
library_name: transformers
pipeline_tag: text-generation
This repository contains the R1-Code-Interpreter models described in R1-Code-Interpreter: Training LLMs to Reason with Code via Supervised and Reinforcement Learning.
Our code is based on Llama-factory/VeRL/Search-R1 for the SFT and RL training and SymBench/BIG-Bench-Hard/reasoning-gym for datasets/benchmarks of reasoning/planning tasks.
Project page: https://huggingface.co/yongchao98 Code: https://github.com/yongchao98/R1-Code-Interpreter