interface update
src/about.py  CHANGED  +1 -1
@@ -5,7 +5,7 @@ TITLE = """<h1 align="center" id="space-title">InstruSumEval Leaderboard</h1>"""
 
 # What does your leaderboard evaluate?
 INTRODUCTION_TEXT = """
-- This leaderboard evaluates the *evaluation* capabilities of language models on the [
+- This leaderboard evaluates the *evaluation* capabilities of language models on the [salesforce/instrusum](https://huggingface.co/datasets/Salesforce/InstruSum) benchmark from our paper ["Benchmarking Generation and Evaluation Capabilities of Large Language Models for Instruction Controllable Summarization"](https://arxiv.org/abs/2311.09184).
 - InstruSum is a benchmark for instruction-controllable summarization, where the goal is to generate summaries that satisfy user-provided instructions.
 - The benchmark contains human evaluations for the generated summaries, on which the models are evaluated as judges for *long-context* instruction-following.
 
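For context: in the usual Hugging Face leaderboard-template layout, the TITLE and INTRODUCTION_TEXT constants edited in this commit are imported by the Space's Gradio app and rendered at the top of the page. A minimal sketch of that wiring, assuming that template structure (the app file name and import path below are illustrative, not taken from this repo):

```python
# app.py (illustrative) -- renders the constants changed in this commit
import gradio as gr

from src.about import TITLE, INTRODUCTION_TEXT  # assumes the repo's src/ package layout

with gr.Blocks() as demo:
    gr.HTML(TITLE)                  # the h1 heading defined in src/about.py
    gr.Markdown(INTRODUCTION_TEXT)  # the markdown intro, including the completed dataset/paper links

if __name__ == "__main__":
    demo.launch()
```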