Update Space (evaluate main: 05209ece)
README.md
CHANGED
@@ -10,6 +10,17 @@ pinned: false
 tags:
 - evaluate
 - metric
+description: >-
+  This metric wraps the official scoring script for version 2 of the Stanford Question Answering Dataset (SQuAD).
+
+  Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset, consisting of questions posed by
+  crowdworkers on a set of Wikipedia articles, where the answer to every question is a segment of text, or span,
+  from the corresponding reading passage, or the question might be unanswerable.
+
+  SQuAD2.0 combines the 100,000 questions in SQuAD1.1 with over 50,000 unanswerable questions
+  written adversarially by crowdworkers to look similar to answerable ones.
+  To do well on SQuAD2.0, systems must not only answer questions when possible, but also
+  determine when no answer is supported by the paragraph and abstain from answering.
 ---
 
 # Metric Card for SQuAD v2