Update README.md
README.md
CHANGED
@@ -8,13 +8,13 @@ sdk_version: 1.42.2
 app_file: app.py
 pinned: false
 license: mit
-short_description:
+short_description: Collective-Model-As-Judge LLM Benchmark
 ---
 
 
 # AutoBench 1.0 Demo
 
-This Space runs a
+This Space runs a Collective-Model-As-Judge LLM benchmark to compare different language models using Hugging Face's Inference API. This is a simplified version of Autobench 1.0 which relies on multiple inference providers to manage request load and a wider range of models (Anthropic, Grok, Nebius, OpenAI, Together AI, Vertex AI). For more advanced use, please refer to the AutoBench 1.0 repository.
 
 ## Features
 