Space: agent-evals/core_leaderboard
Duplicated from benediktstroebl/hal
01fb261
core_leaderboard/utils
3 contributors
History: 11 commits
Latest commit: 5a7e21a by benediktstroebl, "added failure report and two new swebench variants", 10 months ago
File | Scan status | Size | Last commit message | Last updated
data.py | Safe | 9.47 kB | format update and added monitor llm client backend | 10 months ago
pareto.py | Safe | 1.34 kB | big update with raw predictions section and dropdowns that dynamically parse agents of current leaderboard | 10 months ago
processing.py | Safe | 6.27 kB | added failure report and two new swebench variants | 10 months ago
viz.py | Safe | 10.3 kB | added failure report and two new swebench variants | 10 months ago