Commit 9b48adf (verified), committed by ltgoslo
1 parent: 506a498

Prominent links for the datasets

Files changed (1)
  1. README.md +9 -2
README.md CHANGED
@@ -6,9 +6,16 @@ colorTo: blue
  sdk: static
  pinned: false
  ---
+ https://hplt-project.org/

  Our project name, HPLT, is an acronym for High Performance Language Technologies. We are aiming high at combining large quantities of data, a number of languages and high-performance computing to build powerful and efficient language and translation models. Another goal of HPLT is to publish the results of this project in a shared space with open licenses.

- https://hplt-project.org/
+ - [HPLT datasets paper](https://arxiv.org/abs/2503.10267)
+ - Version 2 of the HPLT datasets (193 languages):
+ - https://hplt-project.org/datasets/v2.0
+ - https://hf.co/datasets/HPLT/HPLT2.0_cleaned
+ - Version 1.2 of the HPLT datasets (75 languages):
+ - https://hplt-project.org/datasets/v1.2
+ - https://huggingface.co/datasets/HPLT/hplt_monolingual_v1_2

- *This project has received funding from the European Union’s Horizon Europe research and innovation programme under grant agreement No 101070350 and from UK Research and Innovation (UKRI) under the UK government’s Horizon Europe funding guarantee [grant number 10052546]*
+ *This project has received funding from the European Union’s Horizon Europe research and innovation programme under grant agreement No 101070350 and from UK Research and Innovation (UKRI) under the UK government’s Horizon Europe funding guarantee [grant number 10052546]*