---
title: Table Markdown Metrics
emoji: π
colorFrom: blue
colorTo: red
sdk: gradio
sdk_version: 3.19.1
app_file: app.py
pinned: false
tags:
- evaluate
- metric
- table
- markdown
description: >-
  Table evaluation metric for assessing how closely predicted tables match
  reference tables. It calculates precision, recall, and F1 score for table
  data extraction or generation tasks.
---

# Metric Card for Table Markdown Metrics
## Metric Description
This metric evaluates the accuracy of table data extraction or generation by comparing predicted tables with reference tables. It calculates:
- Precision: The ratio of correctly predicted cells to the total number of cells in the predicted table
- Recall: The ratio of correctly predicted cells to the total number of cells in the reference table
- F1 Score: The harmonic mean of precision and recall
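
The exact cell-extraction rules (which header and label cells are skipped) are defined by the metric implementation itself; the scoring follows the standard cell-level definitions. A minimal sketch, assuming the data cells of both tables have already been extracted into lists and are compared position by position:

```python
def cell_level_scores(pred_cells, ref_cells):
    # A cell counts as correct when its content matches the reference cell
    # at the same position (the comparison is exact and case-sensitive).
    tp = sum(p == r for p, r in zip(pred_cells, ref_cells))
    fp = len(pred_cells) - tp   # predicted cells that do not match a reference cell
    fn = len(ref_cells) - tp    # reference cells that were not matched
    precision = tp / (tp + fp) if pred_cells else 0.0
    recall = tp / (tp + fn) if ref_cells else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1,
            "true_positives": tp, "false_positives": fp, "false_negatives": fn}
```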
## How to Use
This metric requires predictions and references as inputs in Markdown table format.
```python
>>> import evaluate
>>> table_metric = evaluate.load("table_markdown")
>>> results = table_metric.compute(
...     predictions="|A|B|\n|1|2|",
...     references="|A|B|\n|1|3|"
... )
>>> print(results)
{'precision': 0.5, 'recall': 0.5, 'f1': 0.5, 'true_positives': 1, 'false_positives': 1, 'false_negatives': 1}
```
### Inputs
- **predictions** (`str`): Predicted table in Markdown format.
- **references** (`str`): Reference table in Markdown format.
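
Both arguments are single Markdown strings, not lists. If your tables live in some structured form, one way to serialize them into the expected format (a hypothetical helper, not part of the metric) is:

```python
def to_markdown(rows):
    # rows[0] is the header row; the remaining rows are data rows.
    header, *body = rows
    lines = ["| " + " | ".join(header) + " |",
             "|" + "--|" * len(header)]
    lines += ["| " + " | ".join(r) + " |" for r in body]
    return "\n".join(lines)

prediction = to_markdown([["", "lobby", "search", "band"],
                          ["desire", "5", "8", "7"],
                          ["wage", "1", "5", "3"]])
```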
### Output Values
- **precision** (`float`): Precision score. Range: [0, 1]
- **recall** (`float`): Recall score. Range: [0, 1]
- **f1** (`float`): F1 score. Range: [0, 1]
- **true_positives** (`int`): Number of correctly predicted cells
- **false_positives** (`int`): Number of incorrectly predicted cells
- **false_negatives** (`int`): Number of reference cells that were not predicted
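
The scores are derived directly from the counts. Continuing from the quickstart result above (where the ratios are exact, so float comparison holds):

```python
>>> tp = results["true_positives"]
>>> fp = results["false_positives"]
>>> fn = results["false_negatives"]
>>> results["precision"] == tp / (tp + fp)
True
>>> results["recall"] == tp / (tp + fn)
True
```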
### Examples
Example 1 - Simple table comparison:
```python
>>> table_metric = evaluate.load("table_markdown")
>>> results = table_metric.compute(
...     predictions="| | lobby | search | band | charge | chain |\n|--|--|--|--|--|--|\n| desire | 5 | 8 | 7 | 5 | 9 |\n| wage | 1 | 5 | 3 | 8 | 5 |",
...     references="| | lobby | search | band | charge | chain |\n|--|--|--|--|--|--|\n| desire | 1 | 6 | 7 | 5 | 9 |\n| wage | 1 | 5 | 2 | 8 | 5 |"
... )
>>> print(results)
{'precision': 0.7, 'recall': 0.7, 'f1': 0.7, 'true_positives': 7, 'false_positives': 3, 'false_negatives': 3}
```
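
The counts here can be checked by hand: the two tables differ in exactly three cells, and the reported totals suggest that only the ten numeric data cells are compared (the header row and the row-label column do not appear in the counts). A quick position-wise comparison reproduces the result:

```python
pred_vals = ["5", "8", "7", "5", "9", "1", "5", "3", "8", "5"]  # predicted data cells, row by row
ref_vals  = ["1", "6", "7", "5", "9", "1", "5", "2", "8", "5"]  # reference data cells, row by row
tp = sum(p == r for p, r in zip(pred_vals, ref_vals))
print(tp, len(pred_vals) - tp, len(ref_vals) - tp)  # 7 3 3
```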
Example 2 - Multi-line table comparison:
```python
>>> table_metric = evaluate.load("table_markdown")
>>> results = table_metric.compute(
...     predictions="""
... | | lobby | search | band |
... |--|-------|--------|------|
... | desire | 5 | 8 | 7 |
... | wage | 1 | 5 | 3 |
... """,
...     references="""
... | | lobby | search | band |
... |--|-------|--------|------|
... | desire | 5 | 8 | 7 |
... | wage | 1 | 5 | 3 |
... """
... )
>>> print(results)
{'precision': 1.0, 'recall': 1.0, 'f1': 1.0, 'true_positives': 6, 'false_positives': 0, 'false_negatives': 0}
```
## Limitations and Bias
- The metric assumes that tables are well-formed in Markdown format
- The comparison is case-sensitive (a possible normalization workaround is sketched below)
- The metric does not handle merged cells or complex table structures
- The metric treats each cell as a separate unit and does not consider the semantic meaning of the content
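
Because the comparison is purely textual and case-sensitive, you may want to normalize both tables before scoring. A hypothetical pre-processing step (not part of the metric) that lowercases and strips each cell while preserving the table structure:

```python
def normalize_table(table: str) -> str:
    # Lowercase and strip whitespace in every cell, leaving the row/column
    # structure of the Markdown table unchanged.
    lines = []
    for line in table.splitlines():
        if not line.strip():
            lines.append(line)
            continue
        cells = [c.strip().lower() for c in line.strip().strip("|").split("|")]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)

results = table_metric.compute(
    predictions=normalize_table(predicted_markdown),   # hypothetical input strings
    references=normalize_table(reference_markdown),
)
```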
## Citation(s)
```bibtex
@article{scikit-learn,
  title={Scikit-learn: Machine Learning in {P}ython},
  author={Pedregosa, F. and Varoquaux, G. and Gramfort, A. and Michel, V.
          and Thirion, B. and Grisel, O. and Blondel, M. and Prettenhofer, P.
          and Weiss, R. and Dubourg, V. and Vanderplas, J. and Passos, A. and
          Cournapeau, D. and Brucher, M. and Perrot, M. and Duchesnay, E.},
  journal={Journal of Machine Learning Research},
  volume={12},
  pages={2825--2830},
  year={2011}
}
```