Update common.py
common.py
CHANGED
@@ -126,20 +126,16 @@ Judge Arena is specifically designed to assess AI models that function as evalua
 <br><br>
 # FAQ
 
-
-
+**Isn't this the same as Chatbot Arena?**
 We are big fans of what the LMSYS team have done with Chatbot Arena and fully credit them for the inspiration to develop this. We were looking for a dynamic leaderboard that graded on AI judge capabilities and didn't manage to find one, so we created Judge Arena. This UI is designed especially for evals; to match the format of the model-based eval prompts that you would use in your LLM evaluation / monitoring tool.
 
-
-
+**Why should I trust this leaderboard?**
 We have listed out our efforts to be fully transparent in the policies above. All of the code for this leaderboard is open-source and can be found on our [Github](https://github.com/atla-ai/judge-arena).
 
-
-
+**Who funds this effort?**
 Atla currently funds this out of our own pocket. We are looking for API credits (with no strings attached) to support this effort - please get in touch if you or someone you know might be able to help.
 
-
-
+**What is Atla working on?**
 We are training a general-purpose evaluator that you will soon be able to run in this Judge Arena. Our next step will be to open-source a powerful model that the community can use to run fast and accurate evaluations.
 <br><br>
 # Get in touch