Aswanth-Azma committed
Commit 0c85c74 · 1 Parent(s): 6318c88

updated readme

Files changed (1):
README.md (+1 -38)
README.md CHANGED
@@ -285,7 +285,7 @@ pipeline_tag: zero-shot-classification
 
 # Model Card for DeBERTa-v3-base-tasksource-nli
 
-This is [DeBERTa-v3-base](https://hf.co/microsoft/deberta-v3-base) fine-tuned with multi-task learning on 600 tasks of the [tasksource collection](https://github.com/sileod/tasksource/).
+This is [DeBERTa-v3-base](https://hf.co/microsoft/deberta-v3-base) fine-tuned with multi-task learning on 600 tasks.
 This checkpoint has strong zero-shot validation performance on many tasks (e.g. 70% on WNLI), and can be used for:
 - Zero-shot entailment-based classification pipeline (similar to bart-mnli), see [ZS].
 - Natural language inference, and many other tasks with tasksource-adapters, see [TA]
@@ -299,40 +299,3 @@ classifier = pipeline("zero-shot-classification",model="Azma-AI/deberta-base-mul
 text = "one day I will see the world"
 candidate_labels = ['travel', 'cooking', 'dancing']
 classifier(text, candidate_labels)
-```
-
-
-## Evaluation
-This model ranked 1st among all models with the microsoft/deberta-v3-base architecture according to the IBM model recycling evaluation:
-https://ibm.github.io/model-recycling/
-
-### Software and training details
-
-The model was trained on 600 tasks for 200k steps with a batch size of 384 and a peak learning rate of 2e-5. Training took 12 days on an Nvidia A30 24GB GPU.
-This is the shared model with the MNLI classifier on top. Each task had a specific CLS embedding, which was dropped 10% of the time so that the model can also be used without it. All multiple-choice tasks used the same classification layers. For classification tasks, models shared weights if their labels matched.
-
-
-https://github.com/sileod/tasksource/ \
-https://github.com/sileod/tasknet/ \
-Training code: https://colab.research.google.com/drive/1iB4Oxl9_B5W3ZDzXoWJN-olUbqLBxgQS?usp=sharing
-
-# Citation
-
-More details in this [article](https://arxiv.org/abs/2301.05948):
-```
-@article{sileo2023tasksource,
-  title={tasksource: Structured Dataset Preprocessing Annotations for Frictionless Extreme Multi-Task Learning and Evaluation},
-  author={Sileo, Damien},
-  url={https://arxiv.org/abs/2301.05948},
-  journal={arXiv preprint arXiv:2301.05948},
-  year={2023}
-}
-```
-
-
-# Model Card Contact
-
-
-</details>
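
The README's usage snippet is split across the two hunks above: the `classifier = pipeline(...)` line survives only in the second hunk header, with the model id truncated. A minimal, self-contained sketch of that zero-shot usage follows; the model id is left truncated exactly as it appears in the diff, so substitute the repository's full id before running:

```python
# Minimal sketch of the zero-shot usage shown in the diff above.
from transformers import pipeline

# The model id below is truncated in the hunk header ("Azma-AI/deberta-base-multi...");
# replace it with the repository's full id.
classifier = pipeline("zero-shot-classification", model="Azma-AI/deberta-base-multi...")

text = "one day I will see the world"
candidate_labels = ['travel', 'cooking', 'dancing']

result = classifier(text, candidate_labels)
print(result["labels"], result["scores"])  # candidate labels ranked by descending score
```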
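
The card also lists natural language inference among the supported uses. As a hedged sketch (not confirmed by this diff), entailment-style checkpoints can typically be queried through the generic `text-classification` pipeline with a premise/hypothesis pair:

```python
# Hedged sketch: NLI via the generic text-classification pipeline.
# Assumes the checkpoint exposes MNLI-style labels (entailment/neutral/contradiction);
# the model id is a placeholder, truncated as in the diff.
from transformers import pipeline

nli = pipeline("text-classification", model="Azma-AI/deberta-base-multi...")

premise = "one day I will see the world"
hypothesis = "The speaker wants to travel."

# The pipeline accepts a {"text", "text_pair"} dict for sentence-pair inputs.
print(nli({"text": premise, "text_pair": hypothesis}))
```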