---
title: CP-Bench Leaderboard
emoji: 🚀📡
colorFrom: green
colorTo: indigo
sdk: docker
pinned: true
license: apache-2.0
---

# 🚀 CP-Bench Leaderboard

This repository hosts the leaderboard for the CP-Bench dataset.

πŸ“ Structure

  • app.py β€” Launches the Gradio interface.
  • src/ β€” Contains the main logic for fetching and displaying leaderboard data.'
    • config.py β€” Configuration for the leaderboard.
    • eval.py β€” Evaluation logic for model submissions.
    • hf_utils.py β€” Utilities file.
    • ui.py β€” UI components for displaying the leaderboard.
    • user_eval.py β€” The logic for the evaluation of submitted models, it can also be used to evaluate models locally.
  • README.md β€” (you are here)

## 🧠 How It Works

1. Users submit a `.jsonl` file with their generated models (see the sketch after this list).
2. The submission is uploaded to a storage repository on the Hugging Face Hub.
3. An evaluation script is triggered, which:
   - Loads the submission.
   - Evaluates the models against the benchmark dataset.
   - Computes metrics.
4. The results are stored and displayed on the leaderboard.
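
As a rough illustration of step 1, the snippet below assembles a submission file in Python. This is a minimal sketch only: the exact field names the evaluator expects are defined in `src/user_eval.py`, so `"problem"` and `"model"` here are assumptions, not the confirmed schema.

```python
import json

# Hypothetical submission entries: one generated model per benchmark problem.
# The field names "problem" and "model" are illustrative assumptions; check
# src/user_eval.py for the schema the evaluator actually expects.
entries = [
    {"problem": "knapsack", "model": "from cpmpy import *\n# ... constraint model ..."},
    {"problem": "sudoku", "model": "from cpmpy import *\n# ... constraint model ..."},
]

# Write one JSON object per line, which is all the .jsonl format requires.
with open("submission.jsonl", "w") as f:
    for entry in entries:
        f.write(json.dumps(entry) + "\n")
```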

πŸ› οΈ Development

To run locally:

pip install -r requirements.txt
python app.py

If you wish to contribute or modify the leaderboard, feel free to open a discussion or pull request. To add support for another modelling framework, modify `src/user_eval.py` to include the execution code for the new framework; a sketch of what such a hook might look like follows.
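
The sketch below shows one plausible shape for a framework-dispatch hook. The actual function names and signatures in `src/user_eval.py` may differ; `run_model`, the `"my_new_framework"` branch, and `my-solver-cli` are hypothetical placeholders used only to illustrate where execution code for a new framework would be added.

```python
import subprocess
import sys

# Hypothetical framework-dispatch hook; not the actual code in src/user_eval.py.
def run_model(framework: str, model_path: str, timeout: int = 60) -> str:
    """Execute a submitted model file and return its standard output."""
    if framework == "cpmpy":
        # Python-based frameworks can be run directly with the interpreter.
        cmd = [sys.executable, model_path]
    elif framework == "my_new_framework":
        # Add the execution command for the new framework here
        # ("my-solver-cli" is a placeholder, not a real tool).
        cmd = ["my-solver-cli", model_path]
    else:
        raise ValueError(f"Unsupported framework: {framework}")
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    return result.stdout
```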