Reasoning-Benchmarks
A collection of multiple benchmarks for large reasoning model evaluation
guanning/amc23 • Updated May 25, 2025 • 40 • 17
guanning/math • Updated Jun 12, 2025 • 12.5k • 23
guanning/aime24 • Updated May 25, 2025 • 30 • 19
guanning/aime25 • Updated May 25, 2025 • 30 • 14