	---
	license: apache-2.0
	language:
	- it
	- en
	- pl
	- de
	- fr
	base_model:
	- nari-labs/Dia-1.6B
	pipeline_tag: text-to-speech
	tags:
	- speech
	- dia
	- text-to-speech
	- vocal
	- voice
	---

	# Aurora-1.6B: Multilingual Emotion and Singing TTS Model

	A fine-tuned version of Dia-1.6B trained on multilingual and singing datasets, with full emotion control and zero-shot voice cloning.

	## Features

	- Multilingual Support
	Natural speech in Italian, English, Polish, German, French, and more.
	- Emotion Control
	Combine speaker tags (e.g. `[S1]`) with emotion tokens (e.g. `[happy]`, `[sad]`) to modulate expressiveness.
	- Singing Capabilities
	Generate melodic vocals by providing singing prompts or style references.
	- Zero-Shot Voice Cloning
	Clone any speaker’s voice from a short audio sample.
	- Nonverbal Vocalizations
	Embed realistic effects like `(laughs)`, `(coughs)`, or `(sighs)` inline.
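
The tag conventions above can be combined in a single prompt string. The following is a minimal sketch, assuming the speaker tag comes first, the emotion token follows it, and nonverbal cues are appended inline in parentheses. `build_prompt` is a hypothetical helper written for illustration, not part of the `dia` package, and the exact token order the model expects is an assumption.

```python
def build_prompt(text, speaker="S1", emotion=None, vocalizations=()):
    """Compose a prompt from a speaker tag, an optional emotion token, and text.

    Hypothetical helper for illustration; not part of the dia package.
    """
    # Speaker tag first, e.g. "[S1]"
    prefix = f"[{speaker}]"
    # Optional emotion token directly after the speaker tag, e.g. "[happy]"
    if emotion:
        prefix += f"[{emotion}]"
    # Nonverbal cues are appended inline in parentheses, e.g. " (laughs)"
    suffix = "".join(f" ({v})" for v in vocalizations)
    return f"{prefix} {text}{suffix}"


prompt = build_prompt("Hello world!", emotion="happy", vocalizations=["laughs"])
print(prompt)  # [S1][happy] Hello world! (laughs)
```

The resulting string can be passed to the model as the text to synthesize. Keeping the tag assembly in one place makes it easier to vary the emotion or speaker without retyping brackets by hand.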

	## Usage

	```python
	from dia.model import Dia
	import soundfile as sf

	# Load the Aurora-1.6B model
	model = Dia.from_pretrained("Lorenzob/aurora-1.6b")

	# Generate a happy spoken line followed by singing
	text = "[S1][happy] Hello world! Now sing 'Happy Birthday to You'"
	audio = model.generate(text)

	# Save output at 44.1 kHz
	sf.write("output.wav", audio, 44100)