Back to old gradio sdk
README.md
CHANGED
@@ -13,12 +13,12 @@ description: >
 It uses log-probabilities of "Yes"/"No" tokens from a language model acting as a judge.
 Based on the SPIQA benchmark: https://arxiv.org/pdf/2407.09413
 sdk: gradio
-sdk_version:
+sdk_version: 3.19.1
 app_file: app.py
 pinned: false
 ---
 
-# Metric Card: L3Score
+# 🦢 Metric Card: L3Score
 
 ## 📌 Description
 
@@ -37,7 +37,7 @@ Answer in one word - Yes or No.
 
 The model's **log-probabilities** for "Yes" and "No" tokens are used to compute the score.
 
-### 🧮
+### 🧮 Scoring Logic
 
 Let $ l_{\text{yes}}$ and $ l_{\text{no}}$ be the log-probabilities of "Yes" and "No", respectively.
 
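
Reviewer note: the hunk ends right after defining $l_{\text{yes}}$ and $l_{\text{no}}$, so the README's actual formula is not visible in this diff. As a rough illustration only, the sketch below assumes the score is the probability of "Yes" renormalized over the two candidate tokens, which is how the SPIQA paper describes the case where both tokens are available. The function name `l3score_from_logprobs` is hypothetical and is not taken from `app.py`; cases where one of the tokens is missing from the judge's output are not handled here.

```python
import math


def l3score_from_logprobs(l_yes: float, l_no: float) -> float:
    """Assumed scoring rule: exp(l_yes) / (exp(l_yes) + exp(l_no)).

    Hypothetical helper for illustration; not the Space's actual code.
    """
    # Shift by the larger log-probability so exp() cannot underflow to 0/0.
    shift = max(l_yes, l_no)
    p_yes = math.exp(l_yes - shift)
    p_no = math.exp(l_no - shift)
    return p_yes / (p_yes + p_no)


if __name__ == "__main__":
    # Judge puts 0.6 on "Yes" and 0.2 on "No"; the rest goes to other tokens.
    print(l3score_from_logprobs(math.log(0.6), math.log(0.2)))
```

Renormalizing over only the two tokens keeps the score in [0, 1] even when the judge spreads probability mass over other tokens.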