Space: agent-evals / core_leaderboard (Running)
Duplicated from benediktstroebl/hal

core_leaderboard / utils
  • 3 contributors
History: 11 commits
Latest commit: 5a7e21a by benediktstroebl, "added failure report and two new swebench variants", 10 months ago
  • data.py (9.47 kB): format update and added monitor llm client backend, 10 months ago
  • pareto.py (1.34 kB): big update with raw predictions section and dropdowns that dynamically parse agents of current leaderboard, 10 months ago
  • processing.py (6.27 kB): added failure report and two new swebench variants, 10 months ago
  • viz.py (10.3 kB): added failure report and two new swebench variants, 10 months ago