Update README.md
README.md
CHANGED
@@ -11,4 +11,58 @@ license: apache-2.0
short_description: Audio Translator
---

# 🗣️ Audio Translator

[Hugging Face Space](https://huggingface.co/spaces/<YOUR-USERNAME>/audio-translator) · [License](LICENSE)

---

## 🚀 Overview
Combine **ASR**, **machine translation**, and **neural TTS** in one **audio pipeline** that runs entirely on **CPU** on free-tier HF Spaces.
Upload speech, let the app auto-detect the language, translate into English or Spanish, then hear the result spoken back.

> **AI buzzwords:**
> • Automatic Speech Recognition (ASR) • Whisper Tiny • Neural Machine Translation • GoogleTranslator • Text-to-Speech • gTTS • Multi-modal AI • End-to-End Inference • Real-Time • Edge Deployment

---

## ✨ Features

| 🔑 Feature                | 🔍 Description                                                      |
|---------------------------|---------------------------------------------------------------------|
| **🎙️ ASR: Whisper-Tiny**  | Fast, local speech transcription (languages supported by Whisper)  |
| **🌐 Translation**        | Bidirectional English ↔ Spanish via Deep-Translator                 |
| **🗣️ Neural TTS**         | Spoken playback via the free Google Translate TTS (gTTS)            |
| **⚡ Zero-infra CPU**     | Runs on 2 vCPU / 16 GB RAM; no GPU or paid APIs needed              |
| **🎨 Elegant UI**         | Gradio Blocks layout: upload, buttons, transcripts, audio           |
| **🔧 Fully Modular**      | Swap models or add logging/analytics with minimal edits             |
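
The "Elegant UI" row could be wired roughly as follows. This is a minimal sketch, not the Space's actual `app.py`: `translate_audio`, `make_handler`, `build_ui`, the widget labels, and the single radio-plus-button layout are all assumptions made for illustration. `gradio` is imported inside `build_ui` so the callback logic can be exercised without it; calling `build_ui(my_pipeline).launch()` would start the app, where `my_pipeline` is any function with the assumed signature.

```python
from typing import Callable, Tuple

# Radio-button labels mapped to the language codes passed to the pipeline.
TARGETS = {"English": "en", "Spanish": "es"}


def make_handler(translate_audio: Callable[[str, str], Tuple[str, str, str]]):
    """Wrap a pipeline function as a Gradio callback.

    translate_audio(audio_path, lang_code) is a hypothetical function that
    returns (transcript, translation, speech_path).
    """
    def handler(audio_path, target_label):
        # Gradio passes None when no file has been uploaded yet.
        if not audio_path:
            return "", "Please upload an audio clip first.", None
        return translate_audio(audio_path, TARGETS[target_label])

    return handler


def build_ui(translate_audio):
    """Assemble a Blocks layout: upload, language choice, button, outputs."""
    import gradio as gr  # imported lazily; assumed to be in requirements.txt

    with gr.Blocks(title="Audio Translator") as demo:
        audio_in = gr.Audio(type="filepath", label="Upload speech")
        target = gr.Radio(choices=list(TARGETS), value="English",
                          label="Translate to")
        run = gr.Button("Translate")
        transcript = gr.Textbox(label="Transcript")
        translation = gr.Textbox(label="Translation")
        audio_out = gr.Audio(label="Spoken translation")
        run.click(make_handler(translate_audio),
                  inputs=[audio_in, target],
                  outputs=[transcript, translation, audio_out])
    return demo
```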

---

## 🏗️ Architecture & Workflow

1. **Audio Upload**
   The user uploads a `.wav` or `.mp3` clip.
2. **ASR**
   OpenAI’s `whisper-tiny` transcribes the speech into text.
3. **MT**
   `deep-translator`’s GoogleTranslator translates the text into the chosen language.
4. **TTS**
   `gTTS` synthesizes the translated text into an `.mp3`.
5. **UI Rendering**
   Gradio presents the original transcript, the translation, and an audio player.
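
Steps 2 to 4 can be sketched as a chain of three injected stages. This is an illustrative sketch under stated assumptions, not the project's actual code: `TranslationResult`, `run_pipeline`, `default_stages`, and the output file name are hypothetical, and loading Whisper through the `transformers` ASR pipeline with the `openai/whisper-tiny` checkpoint is an assumption (the README only names the model). A call such as `run_pipeline("clip.wav", "es", *default_stages())` would run the real stages; the translation and TTS stages call Google's services, so that call needs network access, and decoding audio from a file path typically needs `ffmpeg`.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TranslationResult:
    transcript: str    # text recognized from the uploaded audio
    translation: str   # transcript rendered in the target language
    audio_path: str    # path of the synthesized .mp3


def run_pipeline(
    audio_path: str,
    target_lang: str,
    transcribe: Callable[[str], str],
    translate: Callable[[str, str], str],
    synthesize: Callable[[str, str], str],
) -> TranslationResult:
    """Chain ASR -> MT -> TTS; each stage is an injected callable."""
    if target_lang not in ("en", "es"):
        raise ValueError(f"unsupported target language: {target_lang!r}")
    transcript = transcribe(audio_path).strip()
    translation = translate(transcript, target_lang)
    speech_path = synthesize(translation, target_lang)
    return TranslationResult(transcript, translation, speech_path)


def default_stages(out_path: str = "translation.mp3"):
    """Build the three stages from the libraries named above (lazy imports)."""
    from deep_translator import GoogleTranslator
    from gtts import gTTS
    from transformers import pipeline

    # Assumption: Whisper Tiny is loaded via the transformers ASR pipeline.
    asr = pipeline("automatic-speech-recognition", model="openai/whisper-tiny")

    def transcribe(path: str) -> str:
        return asr(path)["text"]

    def translate(text: str, lang: str) -> str:
        # source="auto" lets the translator detect the input language.
        return GoogleTranslator(source="auto", target=lang).translate(text)

    def synthesize(text: str, lang: str) -> str:
        gTTS(text=text, lang=lang).save(out_path)
        return out_path

    return transcribe, translate, synthesize
```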

---

## 🛠️ Quick Start (Local Dev)

```bash
# Clone the repository and enter it
git clone https://github.com/<YOUR-USERNAME>/audio-translator.git
cd audio-translator
# Create and activate a virtual environment, then install dependencies
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt
# Start the app
python app.py