lmarena-ai's Collections
SearchArena
Arena-Hard-Auto
Prompt-to-Leaderboard
Arena-Hard-Auto
Updated Apr 24
An automatic evaluation tool for LLMs.
Arena Hard Viewer ⚡ • Running • 3
Browse and evaluate model judgments from benchmarks
lmarena-ai/arena-hard-auto • Updated May 1 • 239 • 3