Update README.md
README.md
@@ -11,4 +11,58 @@ license: apache-2.0
short_description: Audio Translator
---

# 🗣️ Audio Translator
[Hugging Face Space](https://huggingface.co/spaces/<YOUR-USERNAME>/audio-translator)
[License](LICENSE)

---

## 🚀 Overview
Combine **ASR**, **machine translation**, and **neural TTS** into one **audio pipeline** that runs entirely on the **CPU**-only free tier of HF Spaces.
Upload speech, let the source language be detected automatically, translate into English or Spanish, then hear the translation spoken back.

> **AI buzzwords:**
> • Automatic Speech Recognition (ASR) • Whisper Tiny • Neural Machine Translation • GoogleTranslator • Text-to-Speech • gTTS • Multi-modal AI • End-to-End Inference • Real-Time • Edge Deployment

---

## ✨ Features

| 🔑 Feature                | 🔍 Description                                                  |
|---------------------------|----------------------------------------------------------------|
| **🎙️ ASR: Whisper-Tiny**  | Fast multilingual speech transcription, run locally on the CPU |
| **🌐 Translation**        | Bidirectional English ↔ Spanish via `deep-translator`          |
| **🗣️ Neural TTS**         | Audio playback via the free Google Translate TTS (`gTTS`)      |
| **⚡ Zero-infra CPU**     | Runs on 2 vCPU / 16 GB RAM; no GPU or paid APIs needed         |
| **🎨 Elegant UI**         | Gradio Blocks interface: upload, buttons, transcripts, audio   |
| **🔧 Fully Modular**      | Swap models or add logging/analytics with minimal edits        |
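
The "Fully Modular" row can be illustrated with a small sketch. It is not taken from the project's `app.py`: the names `Stages` and `with_logging`, and the stage signatures, are hypothetical and only show one way stages might be swapped or wrapped with logging.

```python
from dataclasses import dataclass, replace
from typing import Callable, List

# Hypothetical stage signatures (assumptions, not the project's actual API).
Transcribe = Callable[[str], str]       # audio path -> transcript
Translate = Callable[[str, str], str]   # (text, target language code) -> translated text
Synthesize = Callable[[str, str], str]  # (text, language code) -> path to audio file


@dataclass(frozen=True)
class Stages:
    """Bundle of the three pipeline stages; each can be replaced independently."""
    transcribe: Transcribe
    translate: Translate
    synthesize: Synthesize


def with_logging(name: str, stage: Callable[..., str],
                 log: Callable[[str], None] = print) -> Callable[..., str]:
    """Return a wrapper that logs the stage name and arguments, then calls the stage."""
    def wrapped(*args: str) -> str:
        log(f"{name} called with {args!r}")
        return stage(*args)
    return wrapped


# Placeholder stages so the sketch runs without models or network access.
stub = Stages(
    transcribe=lambda path: "hola mundo",
    translate=lambda text, target: f"[{target}] {text}",
    synthesize=lambda text, lang: f"out_{lang}.mp3",
)

# Swap a single stage and add logging to it; the other two stay untouched.
messages: List[str] = []
logged = replace(stub, translate=with_logging("translate", stub.translate, messages.append))
print(logged.translate("hola mundo", "en"))
print(messages)
```

Keeping each stage behind a plain callable means a different ASR model or TTS backend only has to match the signature, and cross-cutting concerns such as logging stay out of the stage code.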

---

## 🏗️ Architecture & Workflow

1. **Audio Upload**
   The user uploads a `.wav` or `.mp3` clip.
2. **ASR**
   OpenAI’s `whisper-tiny` transcribes the speech into text.
3. **MT**
   `GoogleTranslator` from `deep-translator` translates the text into the chosen language.
4. **TTS**
   `gTTS` synthesizes the translated text into an `.mp3`.
5. **UI Rendering**
   Gradio presents the original transcript, the translation, and an audio player.
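
The five steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual `app.py`: loading Whisper through the `transformers` pipeline, the function names, the `TARGETS` mapping, and the output filename are all assumptions. Heavy imports are deferred into the functions so the module loads without the models installed.

```python
from typing import Callable, Tuple

# Target languages named in the overview, mapped to language codes (assumed mapping).
TARGETS = {"English": "en", "Spanish": "es"}


def transcribe(audio_path: str) -> str:
    """Step 2 (ASR): Whisper Tiny via the transformers pipeline (assumed backend)."""
    from transformers import pipeline  # lazy import: heavy dependency
    # A real app would build this once at startup instead of on every call.
    asr = pipeline("automatic-speech-recognition", model="openai/whisper-tiny")
    return asr(audio_path)["text"].strip()


def translate(text: str, target: str) -> str:
    """Step 3 (MT): GoogleTranslator with the source language auto-detected."""
    from deep_translator import GoogleTranslator
    return GoogleTranslator(source="auto", target=target).translate(text)


def synthesize(text: str, lang: str, out_path: str = "translation.mp3") -> str:
    """Step 4 (TTS): gTTS writes the spoken translation to an .mp3 file."""
    from gtts import gTTS
    gTTS(text=text, lang=lang).save(out_path)
    return out_path


def run_pipeline(
    audio_path: str,
    target_name: str,
    asr: Callable[[str], str] = transcribe,
    mt: Callable[[str, str], str] = translate,
    tts: Callable[[str, str], str] = synthesize,
) -> Tuple[str, str, str]:
    """Steps 2 to 4 chained: returns (transcript, translation, audio file path)."""
    if target_name not in TARGETS:
        raise ValueError(f"Unsupported target language: {target_name!r}")
    lang = TARGETS[target_name]
    transcript = asr(audio_path)
    translation = mt(transcript, lang)
    return transcript, translation, tts(translation, lang)


def build_ui():
    """Steps 1 and 5: upload widget in; transcript, translation, and player out."""
    import gradio as gr
    with gr.Blocks() as demo:
        audio_in = gr.Audio(type="filepath", label="Upload speech")
        target = gr.Radio(choices=list(TARGETS), value="English", label="Translate to")
        button = gr.Button("Translate")
        transcript = gr.Textbox(label="Transcript")
        translation = gr.Textbox(label="Translation")
        audio_out = gr.Audio(label="Spoken translation")
        button.click(run_pipeline, inputs=[audio_in, target],
                     outputs=[transcript, translation, audio_out])
    return demo
```

Under these assumptions, `build_ui().launch()` would start the interface locally. The translation and TTS steps call Google's web services, so they need network access even though no GPU is involved.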

---

## 🛠️ Quick Start (Local Dev)

```bash
# Fetch the source
git clone https://github.com/<YOUR-USERNAME>/audio-translator.git
cd audio-translator
# Create and activate an isolated virtual environment
python3 -m venv venv && source venv/bin/activate
# Install dependencies and launch the Gradio app
pip install -r requirements.txt
python app.py