	# LASER: application to cross-lingual natural language inference

	This code shows how to use the multilingual sentence embeddings for
	cross-lingual NLI, using the XNLI corpus.

	We train an NLI classifier on the English MultiNLI corpus, optimizing
	the meta-parameters on the English XNLI development corpus.
	We then apply that classifier, unchanged, to the test sets of all 14 transfer languages.
	The development sets of the other languages are not used.
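
The protocol above can be sketched as follows. This is a minimal illustration, not the code run by `xnli.sh`: the `Classifier` interface, the `zero_shot_eval` helper, and the data layout are all hypothetical, and it assumes the sentence-pair features have already been computed from the multilingual embeddings.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# One example: (sentence-pair feature vector, gold label id).
Example = Tuple[Sequence[float], int]
# Hypothetical classifier interface: maps a feature vector to a label id.
Classifier = Callable[[Sequence[float]], int]


def accuracy(clf: Classifier, data: List[Example]) -> float:
    """Percentage of examples whose predicted label matches the gold label."""
    correct = sum(1 for features, label in data if clf(features) == label)
    return 100.0 * correct / len(data)


def zero_shot_eval(
    candidates: Dict[str, Classifier],
    en_dev: List[Example],
    test_sets: Dict[str, List[Example]],
) -> Tuple[str, Dict[str, float]]:
    """Pick the best candidate on English dev, then score it on every test set."""
    # Model selection uses the English development set only.
    best_name = max(candidates, key=lambda name: accuracy(candidates[name], en_dev))
    best = candidates[best_name]
    # The selected classifier is applied unchanged to every test language.
    scores = {lang: accuracy(best, data) for lang, data in test_sets.items()}
    return best_name, scores
```

Here `candidates` would hold classifiers trained on English MultiNLI with different meta-parameter settings. Because no development data from the transfer languages enters the selection step, the transfer is zero-shot.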

	## Installation

	Just run `bash ./xnli.sh`,
	which installs the XNLI and MultiNLI corpora,
	calculates the multilingual sentence embeddings,
	trains the classifier and displays the results.

	The XNLI corpus is available [here](https://www.nyu.edu/projects/bowman/xnli/).

	## Results

	You should get the following results for zero-shot cross-lingual transfer.
	They differ slightly from those published in the initial version of the paper [2]
	due to the change to PyTorch 1.0, variations in random number generation, a new optimization of the meta-parameters, etc.

	| en    | fr    | es    | de    | el    | bg    | ru    | tr    | ar    | vi    | th    | zh    | hi    | sw    | ur    |
	|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|
	| 74.65 | 72.26 | 73.15 | 72.48 | 72.73 | 73.35 | 71.08 | 69.84 | 70.48 | 71.94 | 69.20 | 71.38 | 65.95 | 62.14 | 61.82 |

	All numbers are accuracies (in %) on the test set.
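
To read the table as a transfer gap, the short sketch below averages the accuracy over the 14 transfer languages and lists how far each one falls below English. The accuracies are copied from the table above; the `transfer_summary` helper is a hypothetical name introduced for illustration and is not part of LASER.

```python
# Test accuracies (%) copied from the results table above.
ACCURACY = {
    "en": 74.65, "fr": 72.26, "es": 73.15, "de": 72.48, "el": 72.73,
    "bg": 73.35, "ru": 71.08, "tr": 69.84, "ar": 70.48, "vi": 71.94,
    "th": 69.20, "zh": 71.38, "hi": 65.95, "sw": 62.14, "ur": 61.82,
}


def transfer_summary(acc, source="en"):
    """Return the mean accuracy over non-source languages and each gap to the source."""
    transfer = {lang: a for lang, a in acc.items() if lang != source}
    mean_transfer = sum(transfer.values()) / len(transfer)
    # Gap in accuracy points between the source language and each transfer language.
    gaps = {lang: round(acc[source] - a, 2) for lang, a in transfer.items()}
    return mean_transfer, gaps


mean_transfer, gaps = transfer_summary(ACCURACY)
print(f"mean over {len(gaps)} transfer languages: {mean_transfer:.2f}")
# Languages sorted from the smallest to the largest gap.
for lang, gap in sorted(gaps.items(), key=lambda kv: kv[1]):
    print(f"{lang}: {gap:.2f} points below English")
```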

	## References

	Details on the corpus are described in this paper:

	[1] Alexis Conneau, Guillaume Lample, Ruty Rinott, Adina Williams, Samuel R. Bowman, Holger Schwenk and Veselin Stoyanov,
	[XNLI: Cross-lingual Sentence Understanding through Inference](https://aclweb.org/anthology/D18-1269),
	EMNLP, 2018.

	Detailed system description:

	[2] Mikel Artetxe and Holger Schwenk,
	[Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond](https://arxiv.org/pdf/1812.10464),
	arXiv, Dec 26 2018.