Lorenzob committed on
Commit
d2f3049
·
verified ·
1 Parent(s): d924c46

Update README.md

  1. README.md +31 -25
README.md CHANGED
@@ -1,45 +1,51 @@
  ---
  license: apache-2.0
  language:
- - it
- - en
- - pl
- - de
- - fr
  base_model:
- - nari-labs/Dia-1.6B
  pipeline_tag: text-to-speech
  tags:
- - speech
- - dia
- - text-to-speech
- - vocal
- - voice
  ---
- # Multilingual Emotion TTS Model

- A fine-tuned version of Dia-1.6B trained on multilingual datasets.

  ## Features

- - Support for multiple languages
- - Emotion control in speech generation
- - Singing capabilities

  ## Usage

  ```python
  from dia.model import Dia
- import torch

- # Load the custom model
  model = Dia.from_pretrained("Lorenzob/aurora-1.6b")

- # Generate audio with emotion
- text = "[S1] Say 'Hello world' with a happy emotion"
- output = model.generate(text)

- # Save the audio
- import soundfile as sf
- sf.write("output.mp3", output, 44100)
- ```
 
  ---
  license: apache-2.0
  language:
+ - it
+ - en
+ - pl
+ - de
+ - fr
  base_model:
+ - nari-labs/Dia-1.6B
  pipeline_tag: text-to-speech
  tags:
+ - speech
+ - dia
+ - text-to-speech
+ - vocal
+ - voice
  ---

+ # Aurora-1.6B: Multilingual Emotion and Singing TTS Model
+
+ A fine-tuned version of Dia-1.6B trained on multilingual and singing datasets, with full emotion control and zero-shot voice cloning.

  ## Features

+ - **Multilingual Support**
+ Natural speech in Italian, English, Polish, German, French, and more.
+ - **Emotion Control**
+ Use speaker tags or emotion tokens (e.g. `[S1]`, `[happy]`, `[sad]`) to modulate expressiveness.
+ - **Singing Capabilities**
+ Generate melodic vocals by providing singing prompts or style references.
+ - **Zero-Shot Voice Cloning**
+ Clone any speaker’s voice from a short audio sample.
+ - **Nonverbal Vocalizations**
+ Embed realistic effects like `(laughs)`, `(coughs)`, or `(sighs)` inline.

  ## Usage

  ```python
  from dia.model import Dia
+ import soundfile as sf

+ # Load the Aurora-1.6B model
  model = Dia.from_pretrained("Lorenzob/aurora-1.6b")

+ # Generate a happy spoken line followed by singing
+ text = "[S1][happy] Hello world! Now sing 'Happy Birthday to You'"
+ audio = model.generate(text)

+ # Save output at 44.1 kHz
+ sf.write("output.wav", audio, 44100)