evalita_llm_leaderboard / preprocess_models_output.py

Commit History

Added computation and display of the standard deviation across individual prompt accuracy values for each task
67324c2

rzanoli committed on

Small changes
5a8f6c4

rzanoli committed on

Small changes
dbd3b18

rzanoli committed on

Add new scripts for model processing and tasks management
ad489d5

rzanoli committed on