NeuML/pubmedbert-base-embeddings

@@ -85,22 +85,22 @@ Performance of this model compared to the top base models on the [MTEB leaderboa
 The following datasets were used to evaluate model performance.
-- [PubMed QA](https://huggingface.co/datasets/pubmed_qa)
   - Subset: pqa_labeled, Split: train, Pair: (question, long_answer)
-- [PubMed Subset](https://huggingface.co/datasets/zxvix/pubmed_subset_new)
   - Split: test, Pair: (title, text)
-- [PubMed Summary](https://huggingface.co/datasets/scientific_papers)
   - Subset: pubmed, Split: validation, Pair: (article, abstract)
 Evaluation results are shown below. The [Pearson correlation coefficient](https://en.wikipedia.org/wiki/Pearson_correlation_coefficient) is used as the evaluation metric.
 | Model                                                                         | PubMed QA | PubMed Subset | PubMed Summary | Average   |
 | ----------------------------------------------------------------------------- | --------- | ------------- | -------------- | --------- |
-| [all-MiniLM-L6-v2](https://hf.co/sentence-transformers/all-MiniLM-L6-v2)      | 90.40     | 95.86         | 94.07          | 93.44     |
-| [bge-base-en-v1.5](https://hf.co/BAAI/bge-large-en-v1.5)                      | 91.02     | 95.60         | 94.49          | 93.70     |
-| [gte-base](https://hf.co/thenlper/gte-base)                                   | 92.97     | 96.83         | 96.24          | 95.35     |
-| [**pubmedbert-base-embeddings**](https://hf.co/neuml/pubmedbert-base-embeddings) | **93.27** | **97.07**     | **96.58**      | **95.64** |
-| [S-PubMedBert-MS-MARCO](https://hf.co/pritamdeka/S-PubMedBert-MS-MARCO)       | 90.86     | 93.33         | 93.54          | 92.58     |
 ## Training

 The following datasets were used to evaluate model performance.
+- [PubMed QA](https://huggingface.co/datasets/qiaojin/PubMedQA)
   - Subset: pqa_labeled, Split: train, Pair: (question, long_answer)
+- [PubMed Subset](https://huggingface.co/datasets/awinml/pubmed_abstract_3_1k)
   - Split: test, Pair: (title, text)
+- [PubMed Summary](https://huggingface.co/datasets/armanc/scientific_papers)
   - Subset: pubmed, Split: validation, Pair: (article, abstract)
 Evaluation results are shown below. The [Pearson correlation coefficient](https://en.wikipedia.org/wiki/Pearson_correlation_coefficient) is used as the evaluation metric.
 | Model                                                                         | PubMed QA | PubMed Subset | PubMed Summary | Average   |
 | ----------------------------------------------------------------------------- | --------- | ------------- | -------------- | --------- |
+| [all-MiniLM-L6-v2](https://hf.co/sentence-transformers/all-MiniLM-L6-v2)           | 90.40     | 95.92         | 94.07          | 93.46     |
+| [bge-base-en-v1.5](https://hf.co/BAAI/bge-base-en-v1.5)                            | 91.02     | 95.82         | 94.49          | 93.78     |
+| [gte-base](https://hf.co/thenlper/gte-base)                                        | 92.97     | 96.90         | 96.24          | 95.37     |
+| [**pubmedbert-base-embeddings**](https://hf.co/neuml/pubmedbert-base-embeddings) | **93.27** | **97.00**     | **96.58**      | **95.62** |
+| [S-PubMedBert-MS-MARCO](https://hf.co/pritamdeka/S-PubMedBert-MS-MARCO)            | 90.86     | 93.68         | 93.54          | 92.69     |
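
The card names the Pearson correlation coefficient as the metric but does not show the evaluation script. The sketch below is only an illustration under stated assumptions: `pearson` is a plain textbook implementation, not the author's code, and `average` assumes the Average column is the unweighted mean of the three dataset scores, rounded to two decimals. The scores in `updated_scores` are copied from the updated table rows.

```python
from math import sqrt


def pearson(x, y):
    """Textbook Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def average(row):
    """Assumed definition of the Average column: mean of the dataset scores."""
    return round(sum(row) / len(row), 2)


# (PubMed QA, PubMed Subset, PubMed Summary) from the updated table rows.
updated_scores = {
    "all-MiniLM-L6-v2": (90.40, 95.92, 94.07),
    "bge-base-en-v1.5": (91.02, 95.82, 94.49),
    "gte-base": (92.97, 96.90, 96.24),
    "pubmedbert-base-embeddings": (93.27, 97.00, 96.58),
    "S-PubMedBert-MS-MARCO": (90.86, 93.68, 93.54),
}

averages = {model: average(row) for model, row in updated_scores.items()}
```

Under that assumption, the recomputed means match the Average column in the updated rows, which is a quick way to sanity-check the table after changing any single score.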
 ## Training