yongchao98
/

R1-Code-Interpreter-3B

Model card Files Files and versions Community

R1-Code-Interpreter-3B / README.md

nielsr's picture

nielsr HF Staff

Improve model card with metadata and links

9bf8706 verified about 1 month ago

|

887 Bytes

metadata

license: mit
library_name: transformers
pipeline_tag: text-generation

This repository contains the R1-Code-Interpreter models described in R1-Code-Interpreter: Training LLMs to Reason with Code via Supervised and Reinforcement Learning.

Our code is based on Llama-factory/VeRL/Search-R1 for the SFT and RL training and SymBench/BIG-Bench-Hard/reasoning-gym for datasets/benchmarks of reasoning/planning tasks.

Project page: https://huggingface.co/yongchao98 Code: https://github.com/yongchao98/R1-Code-Interpreter