Model,Score,95% CI
GPT-4 Omni,3.18,+0.06/-0.06
GPT-4 Turbo,3.1,+0.06/-0.06
Gemini 1.5 Pro,3.06,+0.07/-0.07
Gemini 1.5 Flash,2.98,+0.07/-0.07
Llama 3 70B,2.9,+0.07/-0.07
Claude 3 Opus,2.86,+0.08/-0.08
Claude 3 Sonnet,2.79,+0.08/-0.08
Claude 3 Haiku,2.73,+0.08/-0.08
Gemini 1.0 Pro,2.56,+0.07/-0.07
Llama 3 8B,2.56,+0.07/-0.07
GPT-3.5 Turbo,2.52,+0.08/-0.08
Gemma 7B,2.14,+0.07/-0.07
Gemma 2B,1.83,+0.16/-0.16