Add pipeline tag and transformers library (#2)
Add pipeline tag and transformers library (494c2c425aad2c656f116760d17ecfe1d36f3ddf)
Co-authored-by: Niels Rogge <[email protected]>
README.md CHANGED
```diff
@@ -1,20 +1,20 @@
 ---
-license: apache-2.0
-language:
-- en
 base_model:
 - sfairXC/FsfairX-LLaMA3-RM-v0.1
+language:
+- en
+license: apache-2.0
 tags:
 - reward model
 - fine-grained
+pipeline_tag: text-ranking
+library_name: transformers
 ---
 
 # MDCureRM
 
-
 [📄 Paper](https://arxiv.org/pdf/2410.23463) | [🤗 HF Collection](https://huggingface.co/collections/yale-nlp/mdcure-6724914875e87f41e5445395) | [⚙️ GitHub Repo](https://github.com/yale-nlp/MDCure)
 
-
 ## Introduction
 
 **MDCure** is an effective and scalable procedure for generating high-quality multi-document (MD) instruction tuning data to improve MD capabilities of LLMs. Using MDCure, we construct a suite of MD instruction datasets complementary to collections such as [FLAN](https://github.com/google-research/FLAN) and fine-tune a variety of already instruction-tuned LLMs from the FlanT5, Qwen2, and LLAMA3.1 model families, up to 70B parameters in size. We additionally introduce **MDCureRM**, an evaluator model specifically designed for the MD setting to filter and select high-quality MD instruction data in a cost-effective, RM-as-a-judge fashion. Extensive evaluations on a wide range of MD and long-context benchmarks spanning various tasks show MDCure consistently improves performance over pre-trained baselines and over corresponding base models by up to 75.5%.
```
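The two added keys are the point of this commit: `pipeline_tag: text-ranking` determines which task the model is listed under on the Hub, and `library_name: transformers` declares which library the checkpoint loads with. A minimal sketch of how the new metadata surfaces through the `huggingface_hub` client (the repo id `yale-nlp/MDCureRM` is an assumption, since the commit page does not spell it out):

```python
# Minimal sketch: read the metadata added by this commit via the huggingface_hub API.
# The repo id below is an assumption for illustration.
from huggingface_hub import model_info

info = model_info("yale-nlp/MDCureRM")  # assumed repo id
print(info.pipeline_tag)   # expected "text-ranking" after this commit
print(info.library_name)   # expected "transformers" after this commit
```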
```diff
@@ -113,10 +113,16 @@ reward_weights = torch.tensor([1/9, 1/9, 1/9, 2/9, 2/9, 2/9], device="cuda")
 source_text_1 = ...
 source_text_2 = ...
 source_text_3 = ...
-context = f"{source_text_1}
+context = f"{source_text_1}
+
+{source_text_2}
+
+{source_text_3}"
 instruction = "What happened in CHAMPAIGN regarding Lovie Smith and the 2019 defense improvements? Respond with 1-2 sentences."
 
-input_text = f"Instruction: {instruction}
+input_text = f"Instruction: {instruction}
+
+{context}"
 tokenized_input = tokenizer(
     input_text,
     return_tensors='pt',
```
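The hunk header above carries the README's `reward_weights` line as context: MDCureRM appears to return six fine-grained scores per (instruction, context) pair, which the usage snippet combines with weights of 1/9 and 2/9 that sum to 1. A minimal sketch of that aggregation step, where `scores` is only a stand-in for the model's six outputs and everything except `reward_weights` is illustrative:

```python
# Minimal sketch of the weighted aggregation implied by the reward_weights
# context line above. `scores` stands in for the six fine-grained scores the
# reward model would produce; only reward_weights comes from the README.
import torch

reward_weights = torch.tensor([1/9, 1/9, 1/9, 2/9, 2/9, 2/9])
scores = torch.rand(6)                            # placeholder model outputs
overall_reward = (reward_weights * scores).sum()  # weighted overall score
print(f"overall reward: {overall_reward.item():.4f}")
```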
```diff
@@ -141,7 +147,7 @@ Beyond MDCureRM, we open-source our best MDCure'd models at the following links:
 | **MDCure-Qwen2-1.5B-Instruct** | [🤗 HF Repo](https://huggingface.co/yale-nlp/MDCure-Qwen2-1.5B-Instruct) | **Qwen2-1.5B-Instruct** fine-tuned with MDCure-72k |
 | **MDCure-Qwen2-7B-Instruct** | [🤗 HF Repo](https://huggingface.co/yale-nlp/MDCure-Qwen2-7B-Instruct) | **Qwen2-7B-Instruct** fine-tuned with MDCure-72k |
 | **MDCure-LLAMA3.1-8B-Instruct** | [🤗 HF Repo](https://huggingface.co/yale-nlp/MDCure-LLAMA3.1-8B-Instruct) | **LLAMA3.1-8B-Instruct** fine-tuned with MDCure-72k |
-| **MDCure-LLAMA3.1-70B-Instruct** | [🤗 HF Repo](https://huggingface.co/yale-nlp/MDCure-LLAMA3.1-70B-Instruct) | **LLAMA3.1-70B-Instruct** fine-tuned with MDCure-
+| **MDCure-LLAMA3.1-70B-Instruct** | [🤗 HF Repo](https://huggingface.co/yale-nlp/MDCure-LLAMA3.1-70B-Instruct) | **LLAMA3.1-70B-Instruct** fine-tuned with MDCure-72k |
 
 ## Citation
 
```