  - en
---

# Mistral-7B-v0.1-Adapted-Italian-Continual

The **Mistral-7B-v0.1-Adapted** collection of large language models (LLMs) is a set of 7B generative models (text in/text out) adapted from **Mistral-7B-Base-v0.1**.

*full_mistral_average_continual* is an adapted model obtained through continual training.

**Model developer:** SapienzaNLP, ISTI-CNR, ILC-CNR

**Model Architecture:** Mistral-7B-v0.1-Adapted is an auto-regressive language model that uses an optimized transformer architecture.

## Data used for the adaptation

The **Mistral-7B-v0.1-Adapted** models are trained on a collection of Italian and English data extracted from [CulturaX](https://huggingface.co/datasets/uonlp/CulturaX).
The data are skewed toward Italian, with English making up one quarter of the total: the first 9B tokens are taken from the Italian portion of CulturaX and the first 3B tokens from the English portion.
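
As a quick sanity check on the mix described above, the stated token budgets (9B Italian, 3B English) can be turned into corpus shares with a few lines; the numbers come from this paragraph, not from a released training config:

```python
# Token budgets stated above: the first 9B Italian and the first 3B
# English tokens extracted from CulturaX.
italian_tokens = 9_000_000_000
english_tokens = 3_000_000_000
total_tokens = italian_tokens + english_tokens

# Shares of the 12B-token adaptation corpus.
italian_share = italian_tokens / total_tokens  # 0.75
english_share = english_tokens / total_tokens  # 0.25

print(f"Italian: {italian_share:.0%}, English: {english_share:.0%}")
```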

## Use with Transformers

You can run conversational inference using the Transformers pipeline abstraction or by leveraging the Auto classes with the `generate()` function.

Make sure to update your Transformers installation via `pip install --upgrade transformers`.

```python
import transformers
import torch

model_id = "SemanticAlignment/full_mistral_average_continual"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

pipeline("Cosa si può fare in una bella giornata di sole?")
```
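
Equivalently, the Auto classes mentioned above can be used with `generate()` directly. This is a minimal sketch of the standard `AutoTokenizer`/`AutoModelForCausalLM` loading pattern; the generation parameters are illustrative defaults, not recommended settings for this model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SemanticAlignment/full_mistral_average_continual"

# Load the tokenizer and the model weights in bfloat16.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Tokenize the Italian prompt and generate a continuation.
prompt = "Cosa si può fare in una bella giornata di sole?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```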

## Citation

If you use any part of this work, please consider citing the paper as follows:

```bibtex
@misc{moroni2025optimizingllmsitalianreducing,
  title={Optimizing LLMs for Italian: Reducing Token Fertility and Enhancing Efficiency Through Vocabulary Adaptation},
  author={Luca Moroni and Giovanni Puccetti and Pere-Lluis Huguet Cabot and Andrei Stefan Bejgu and Edoardo Barba and Alessio Miaschi and Felice Dell'Orletta and Andrea Esuli and Roberto Navigli},
  year={2025},
  eprint={2504.17025},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2504.17025},
}
```