interface update
src/about.py  CHANGED  +1 -1
@@ -5,7 +5,7 @@ TITLE = """<h1 align="center" id="space-title">InstruSumEval Leaderboard</h1>"""
 
 # What does your leaderboard evaluate?
 INTRODUCTION_TEXT = """
-- This leaderboard evaluates the *evaluation* capabilities of language models on the [
+- This leaderboard evaluates the *evaluation* capabilities of language models on the [salesforce/instrusum](https://huggingface.co/datasets/Salesforce/InstruSum) benchmark from our paper ["Benchmarking Generation and Evaluation Capabilities of Large Language Models for Instruction Controllable Summarization"](https://arxiv.org/abs/2311.09184).
 - InstruSum is a benchmark for instruction-controllable summarization, where the goal is to generate summaries that satisfy user-provided instructions.
 - The benchmark contains human evaluations for the generated summaries, on which the models are evaluated as judges for *long-context* instruction-following.
 
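For context: in the usual Hugging Face leaderboard-template layout, the TITLE and INTRODUCTION_TEXT constants edited in this commit are imported by the Space's Gradio app and rendered at the top of the page. A minimal sketch of that wiring, assuming that template structure (the app file name and import path below are illustrative, not taken from this repo):

```python
# app.py (illustrative) -- renders the constants changed in this commit
import gradio as gr

from src.about import TITLE, INTRODUCTION_TEXT  # assumes the repo's src/ package layout

with gr.Blocks() as demo:
    gr.HTML(TITLE)                  # the h1 heading defined in src/about.py
    gr.Markdown(INTRODUCTION_TEXT)  # the markdown intro, including the completed dataset/paper links

if __name__ == "__main__":
    demo.launch()
```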