Commit b725758 by miya3333 (verified) · 1 parent: c074eb9

Upload README.md

Files changed (1)
  1. README.md +2 -135
README.md CHANGED
@@ -1,5 +1,5 @@
  ---
- title: Test Space of SpeechBrain TTS API
+ title: SpeechBrain TTS API Test
  emoji: 🗣️
  colorFrom: blue
  colorTo: purple
@@ -10,137 +10,4 @@ pinned: false
  license: mit
  ---

- # Test Space of SpeechBrain TTS API
-
- This is test of A Text-to-Speech API built with FastAPI and SpeechBrain, running on Hugging Face Spaces.
-
- ## Features
-
- - 🎯 **Fast TTS synthesis** using SpeechBrain's Tacotron2 + HiFiGAN
- - 🔄 **Multiple output formats** (WAV stream and Base64)
- - 📝 **Simple REST API** with JSON requests
- - 🚀 **Ready for deployment** on Hugging Face Spaces
-
- ## API Endpoints
-
- ### `GET /`
- Returns API status message.
-
- ### `GET /health`
- Health check endpoint.
-
- ### `POST /synthesize`
- Synthesizes speech and returns audio as WAV stream.
-
- **Request:**
- ```json
- {
-   "text": "Hello, this is a test message",
-   "sample_rate": 22050
- }
- ```
-
- **Response:** WAV audio file stream
-
- ### `POST /synthesize_base64`
- Synthesizes speech and returns audio as Base64 encoded string.
-
- **Request:**
- ```json
- {
-   "text": "Hello, this is a test message",
-   "sample_rate": 22050
- }
- ```
-
- **Response:**
- ```json
- {
-   "audio_base64": "UklGRkq...",
-   "sample_rate": 22050,
-   "text": "Hello, this is a test message"
- }
- ```
-
- ## Usage Examples
-
- ### Python Client
- ```python
- import requests
- import base64
- from io import BytesIO
-
- # Text to synthesize
- text = "Hello world, this is a speech synthesis test."
-
- # Request to Base64 endpoint
- response = requests.post(
-     "https://your-space-url.hf.space/synthesize_base64",
-     json={"text": text, "sample_rate": 22050}
- )
-
- if response.status_code == 200:
-     result = response.json()
-
-     # Decode Base64 audio
-     audio_data = base64.b64decode(result["audio_base64"])
-
-     # Save as WAV file
-     with open("output.wav", "wb") as f:
-         f.write(audio_data)
-
-     print("Audio saved as output.wav")
- ```
-
- ### JavaScript Client
- ```javascript
- async function synthesizeSpeech(text) {
-     const response = await fetch('/synthesize_base64', {
-         method: 'POST',
-         headers: {
-             'Content-Type': 'application/json',
-         },
-         body: JSON.stringify({
-             text: text,
-             sample_rate: 22050
-         })
-     });
-
-     if (response.ok) {
-         const result = await response.json();
-
-         // Create audio element
-         const audio = new Audio();
-         audio.src = `data:audio/wav;base64,${result.audio_base64}`;
-         audio.play();
-     }
- }
-
- // Usage
- synthesizeSpeech("Hello from JavaScript!");
- ```
-
- ## Local Development
-
- 1. Install dependencies:
- ```bash
- pip install -r requirements.txt
- ```
-
- 2. Run the server:
- ```bash
- python app.py
- ```
-
- 3. The API will be available at `http://localhost:7860`
-
- ## Model Information
-
- - **TTS Model:** SpeechBrain Tacotron2 (LJSpeech)
- - **Vocoder:** SpeechBrain HiFiGAN (LJSpeech)
- - **Default Sample Rate:** 22,050 Hz
- - **Text Limit:** 500 characters per request
-
- ## License
-
- MIT License
+ This is a test of A Text-to-Speech API built with FastAPI and SpeechBrain.
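
Since this commit removes the endpoint documentation from the README, the request and response shapes it described for `POST /synthesize_base64` are sketched below for reference. This is a minimal sketch based only on the removed lines, not the Space's actual code: the helper names (`build_payload`, `decode_audio`, `wav_sample_rate`) are hypothetical, and checking the 500-character limit on the client side is an assumption about how a caller might use the documented limit. No network call is made.

```python
import base64
import io
import wave

# Values taken from the removed "Model Information" section.
MAX_TEXT_CHARS = 500
DEFAULT_SAMPLE_RATE = 22050


def build_payload(text, sample_rate=DEFAULT_SAMPLE_RATE):
    """Build the JSON body described for /synthesize and /synthesize_base64.

    Rejecting long text here is an assumption; the removed README only
    states that the limit is 500 characters per request.
    """
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text exceeds {MAX_TEXT_CHARS} characters")
    return {"text": text, "sample_rate": sample_rate}


def decode_audio(response_json):
    """Decode the `audio_base64` field of a response into raw WAV bytes."""
    return base64.b64decode(response_json["audio_base64"])


def wav_sample_rate(wav_bytes):
    """Read the sample rate from the WAV header using the standard library."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav_file:
        return wav_file.getframerate()
```

A caller would pass `build_payload(...)` as the `json=` argument of `requests.post`, as in the removed Python client example, and hand the parsed response to `decode_audio` before writing the bytes to a `.wav` file.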