# LASER: application to cross-lingual natural language inference

This code shows how to use the multilingual sentence embeddings for
cross-lingual NLI, using the XNLI corpus.

We train an NLI classifier on the English MultiNLI corpus, optimizing
the meta-parameters on the English XNLI development corpus.
We then apply that classifier to the test set of each of the 14 transfer languages.
The development sets of the other languages are not used.
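
The evaluation protocol described above can be sketched in a few lines of plain Python. This is only an illustration, not the code that `xnli.sh` runs: the names `pair_features`, `accuracy`, `zero_shot_eval` and `classify` are hypothetical, and the feature combination (premise, hypothesis, element-wise product, absolute difference) is an assumed, commonly used way to combine two sentence embeddings. The point it tries to make is that a classifier fitted on English embeddings can be scored on any other language unchanged, because all languages share one embedding space.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]
# (premise embedding, hypothesis embedding, gold label)
Example = Tuple[Vector, Vector, int]


def pair_features(p: Vector, h: Vector) -> List[float]:
    """Combine two sentence embeddings into one feature vector.

    Assumed combination: [p, h, p * h, |p - h|]. The README does not
    state which features the actual classifier uses.
    """
    if len(p) != len(h):
        raise ValueError("premise and hypothesis embeddings differ in size")
    product = [a * b for a, b in zip(p, h)]
    abs_diff = [abs(a - b) for a, b in zip(p, h)]
    return list(p) + list(h) + product + abs_diff


def accuracy(classify: Callable[[List[float]], int],
             examples: List[Example]) -> float:
    """Percentage of examples whose predicted label matches the gold label."""
    if not examples:
        raise ValueError("no examples to evaluate")
    correct = sum(1 for p, h, y in examples
                  if classify(pair_features(p, h)) == y)
    return 100.0 * correct / len(examples)


def zero_shot_eval(classify: Callable[[List[float]], int],
                   test_sets: Dict[str, List[Example]]) -> Dict[str, float]:
    """Score one fixed (English-trained) classifier on every language's test set."""
    return {lang: accuracy(classify, examples)
            for lang, examples in test_sets.items()}
```

In this sketch `classify` stands for whatever model was trained on MultiNLI and tuned on the English XNLI development set; no target-language data is touched before `zero_shot_eval` is called.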
## Installation

Just run `bash ./xnli.sh`,
which installs the XNLI and MultiNLI corpora,
calculates the multilingual sentence embeddings,
trains the classifier and displays the results.

The XNLI corpus is available [here](https://www.nyu.edu/projects/bowman/xnli/).
## Results

You should get the following results for zero-shot cross-lingual transfer.
They differ slightly from those published in the initial version of the paper [2]
due to the change to PyTorch 1.0, variations in random number generation, a new optimization of the meta-parameters, etc.

| en    | fr    | es    | de    | el    | bg    | ru    | tr    | ar    | vi    | th    | zh    | hi    | sw    | ur    |
|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|
| 74.65 | 72.26 | 73.15 | 72.48 | 72.73 | 73.35 | 71.08 | 69.84 | 70.48 | 71.94 | 69.20 | 71.38 | 65.95 | 62.14 | 61.82 |

All numbers are accuracies (in %) on the test set.
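
To compare your own run against the table, it can help to reduce the per-language accuracies to a few summary numbers. The snippet below is a small sketch that hard-codes the values from the table above and derives the average over all languages, the average over the 14 transfer languages, and each language's gap to English. The variable names are illustrative only and are not part of `xnli.sh`.

```python
# Test accuracies (%) copied from the results table above.
results = {
    "en": 74.65, "fr": 72.26, "es": 73.15, "de": 72.48, "el": 72.73,
    "bg": 73.35, "ru": 71.08, "tr": 69.84, "ar": 70.48, "vi": 71.94,
    "th": 69.20, "zh": 71.38, "hi": 65.95, "sw": 62.14, "ur": 61.82,
}

# English is the training language; every other language is zero-shot transfer.
transfer = {lang: acc for lang, acc in results.items() if lang != "en"}

mean_all = sum(results.values()) / len(results)
mean_transfer = sum(transfer.values()) / len(transfer)

# Accuracy lost relative to English, per transfer language.
gap_to_english = {lang: round(results["en"] - acc, 2)
                  for lang, acc in transfer.items()}

best_transfer = max(transfer, key=transfer.get)
worst_transfer = min(transfer, key=transfer.get)

print(f"mean over all languages:      {mean_all:.2f}")
print(f"mean over transfer languages: {mean_transfer:.2f}")
print(f"best transfer language:  {best_transfer} (gap {gap_to_english[best_transfer]:.2f})")
print(f"worst transfer language: {worst_transfer} (gap {gap_to_english[worst_transfer]:.2f})")
```

Replacing the values in `results` with the numbers printed by your own run gives a quick check of how far it deviates from the reference figures.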
## References

Details on the corpus are described in this paper:

[1] Alexis Conneau, Guillaume Lample, Ruty Rinott, Adina Williams, Samuel R. Bowman, Holger Schwenk and Veselin Stoyanov,
    [*XNLI: Cross-lingual Sentence Understanding through Inference*](https://aclweb.org/anthology/D18-1269),
    EMNLP, 2018.

Detailed system description:

[2] Mikel Artetxe and Holger Schwenk,
    [*Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond*](https://arxiv.org/pdf/1812.10464),
    arXiv, Dec 26 2018.