Commit 0c85c74 · Parent(s): 6318c88

updated readme

README.md CHANGED
@@ -285,7 +285,7 @@ pipeline_tag: zero-shot-classification
 
 # Model Card for DeBERTa-v3-base-tasksource-nli
 
-This is [DeBERTa-v3-base](https://hf.co/microsoft/deberta-v3-base) fine-tuned with multi-task learning on 600 tasks
+This is [DeBERTa-v3-base](https://hf.co/microsoft/deberta-v3-base) fine-tuned with multi-task learning on 600 tasks.
 This checkpoint has strong zero-shot validation performance on many tasks (e.g. 70% on WNLI), and can be used for:
 - Zero-shot entailment-based classification pipeline (similar to bart-mnli), see [ZS].
 - Natural language inference, and many other tasks with tasksource-adapters, see [TA]
@@ -299,40 +299,3 @@ classifier = pipeline("zero-shot-classification",model="Azma-AI/deberta-base-mul
 text = "one day I will see the world"
 candidate_labels = ['travel', 'cooking', 'dancing']
 classifier(text, candidate_labels)
-```
-
-
-## Evaluation
-This model ranked 1st among all models with the microsoft/deberta-v3-base architecture according to the IBM model recycling evaluation.
-https://ibm.github.io/model-recycling/
-
-### Software and training details
-
-The model was trained on 600 tasks for 200k steps with a batch size of 384 and a peak learning rate of 2e-5. Training took 12 days on an Nvidia A30 24GB GPU.
-This is the shared model with the MNLI classifier on top. Each task had a specific CLS embedding, which is dropped 10% of the time to facilitate model use without it. All multiple-choice models used the same classification layers. For classification tasks, models shared weights if their labels matched.
-
-
-https://github.com/sileod/tasksource/ \
-https://github.com/sileod/tasknet/ \
-Training code: https://colab.research.google.com/drive/1iB4Oxl9_B5W3ZDzXoWJN-olUbqLBxgQS?usp=sharing
-
-# Citation
-
-More details in this [article](https://arxiv.org/abs/2301.05948):
-```
-@article{sileo2023tasksource,
-  title={tasksource: Structured Dataset Preprocessing Annotations for Frictionless Extreme Multi-Task Learning and Evaluation},
-  author={Sileo, Damien},
-  url={https://arxiv.org/abs/2301.05948},
-  journal={arXiv preprint arXiv:2301.05948},
-  year={2023}
-}
-```
-
-
-# Model Card Contact
-
-
-
-
-
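The hunk header above truncates the pipeline setup line, so for reference here is a minimal sketch of the zero-shot usage the removed snippet belongs to. The repository ID is cut off in the diff ("Azma-AI/deberta-base-mul..."), so the ID below is a placeholder; the call itself follows the standard `transformers` zero-shot classification pipeline API.

```python
# Minimal sketch of the zero-shot classification usage shown in the diff.
# NOTE: the model repository ID is truncated in the hunk header
# ("Azma-AI/deberta-base-mul..."); "Azma-AI/<model-id>" is a placeholder.
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="Azma-AI/<model-id>",  # replace with the full repository ID
)

text = "one day I will see the world"
candidate_labels = ["travel", "cooking", "dancing"]

# Returns the input sequence plus the candidate labels ranked by entailment-based score.
print(classifier(text, candidate_labels))
```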
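The training setup removed by this commit (batch size 384, peak learning rate 2e-5, 200k steps) maps onto a standard `TrainingArguments` configuration; a minimal sketch, assuming a Hugging Face `Trainer`-style run. The actual training used the tasksource/tasknet multi-task pipeline, which is not reproduced here, and the scheduler and warmup values below are assumptions.

```python
# Illustrative only: hyperparameters quoted in the (removed) model card text.
# The real run used the tasksource/tasknet multi-task setup, not this script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="deberta-v3-base-tasksource",  # hypothetical output path
    per_device_train_batch_size=384,          # batch size reported in the card
    learning_rate=2e-5,                       # peak learning rate reported in the card
    max_steps=200_000,                        # 200k training steps reported in the card
    lr_scheduler_type="linear",               # assumption: scheduler not stated in the card
    warmup_ratio=0.06,                        # assumption: warmup not stated in the card
)
```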