Update Space (evaluate main: 05209ece)
README.md CHANGED
@@ -10,6 +10,27 @@ pinned: false
 tags:
 - evaluate
 - metric
+description: >-
+  SARI is a metric used for evaluating automatic text simplification systems.
+  The metric compares the predicted simplified sentences against the reference
+  and the source sentences. It explicitly measures the goodness of words that are
+  added, deleted and kept by the system.
+  Sari = (F1_add + F1_keep + P_del) / 3
+  where
+  F1_add: n-gram F1 score for add operation
+  F1_keep: n-gram F1 score for keep operation
+  P_del: n-gram precision score for delete operation
+  n = 4, as in the original paper.
+
+  This implementation is adapted from Tensorflow's tensor2tensor implementation [3].
+  It has two differences with the original GitHub [1] implementation:
+  (1) Defines 0/0=1 instead of 0 to give higher scores for predictions that match
+  a target exactly.
+  (2) Fixes an alleged bug [2] in the keep score computation.
+  [1] https://github.com/cocoxu/simplification/blob/master/SARI.py
+  (commit 0210f15)
+  [2] https://github.com/cocoxu/simplification/issues/6
+  [3] https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/utils/sari_hook.py
 ---
 
 # Metric Card for SARI
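
The description added in this change gives the formula SARI = (F1_add + F1_keep + P_del) / 3 and notes that 0/0 is defined as 1. The following is a minimal Python sketch of those two points only, not the code of the metric itself. It assumes the n-gram counts (true positives, selected, relevant) have already been computed, and it skips the n-gram extraction and the averaging over n-gram orders. The function names `ratio_or_one`, `f1_from_counts`, and `combine_sari` are hypothetical and chosen for illustration.

```python
def ratio_or_one(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, treating a zero denominator as 1.

    Assumption: this mirrors the "0/0 = 1" convention from the description,
    so an operation that neither the system nor the references perform
    counts as a perfect match rather than a miss.
    """
    if denominator > 0:
        return numerator / denominator
    return 1.0


def f1_from_counts(true_positives: float, selected: float, relevant: float) -> float:
    """F1 score from raw counts, using the 0/0 = 1 convention.

    selected: n-grams the system chose for this operation (hypothetical input)
    relevant: n-grams the references chose for this operation (hypothetical input)
    """
    precision = ratio_or_one(true_positives, selected)
    recall = ratio_or_one(true_positives, relevant)
    if precision == 0 or recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def combine_sari(f1_add: float, f1_keep: float, p_del: float) -> float:
    """Arithmetic mean of the three component scores, as in the formula."""
    return (f1_add + f1_keep + p_del) / 3


if __name__ == "__main__":
    # Nothing added by the system or by the references: 0/0 is scored as 1.
    f1_add = f1_from_counts(true_positives=0, selected=0, relevant=0)
    # Half of the kept n-grams agree with the references.
    f1_keep = f1_from_counts(true_positives=1, selected=2, relevant=2)
    # Deletion uses precision only.
    p_del = ratio_or_one(3, 4)
    print(f1_add, f1_keep, p_del, combine_sari(f1_add, f1_keep, p_del))
```

Under the alternative convention (0/0 = 0), the first case would score 0 instead of 1, which is why a prediction that matches a target exactly receives a higher score with this convention.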