SimulBench

community
https://simulbench.github.io/

AI & ML interests

None defined yet.

Recent Activity

yuchenlin authored a paper about 1 month ago
CrossWordBench: Evaluating the Reasoning Capabilities of LLMs and LVLMs with Controllable Puzzle Generation
yuchenlin authored a paper 3 months ago
Small Models Struggle to Learn from Strong Reasoners
yuchenlin authored a paper 3 months ago
ZebraLogic: On the Scaling Limits of LLMs for Logical Reasoning

Members: Qi Jia, Bill Yuchen Lin

spaces 1

SimulBench Leaderboard

pinned • Runtime error • Sep 13, 2024

models 0

None public yet

datasets 6

SimulBench/SimulBench-results

Viewer • Updated Jun 13, 2024 • 6.83k • 78

SimulBench/SimulBench

Viewer • Updated Jun 6, 2024 • 1.78k • 22

SimulBench/SimulBench_interactions

Viewer • Updated Jun 6, 2024 • 327 • 11

SimulBench/SimulBench-tasks

Viewer • Updated Jun 6, 2024 • 109 • 14 • 1

SimulBench/SimulBench-seed-tasks

Viewer • Updated Mar 15, 2024 • 547 • 49 • 1

SimulBench/SimulBench-results-old

Viewer • Updated Mar 15, 2024 • 2.74k • 19