---
title: Audio Translator
emoji: 🔥
colorFrom: pink
colorTo: purple
sdk: gradio
sdk_version: 5.31.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: Audio Translator
---

# 🗣️ Audio Translator  
[![Hugging Face Space](https://img.shields.io/badge/HuggingFace-Spaces-blue?logo=huggingface)](https://huggingface.co/spaces/<YOUR-USERNAME>/audio-translator)  
![Gradio UI](https://img.shields.io/badge/Gradio-5.31.0-brightgreen?logo=gradio)  
![Model: Whisper Tiny](https://img.shields.io/badge/ASR-Whisper--tiny-orange)  
![Translator: Deep-Translator](https://img.shields.io/badge/Translator-GoogleTranslator-blue)  
![TTS: gTTS](https://img.shields.io/badge/TTS-gTTS-yellow)  
[![License](https://img.shields.io/badge/License-Apache--2.0-lightgrey)](LICENSE)

---

## 🚀 Overview  
Combine **ASR**, **machine translation**, and **neural TTS** into one **seamless audio pipeline** that runs 100% on **CPU** on free-tier HF Spaces.  
Upload speech, let the app auto-detect the language, translate it into English or Spanish, then hear the translation spoken back.

> **AI buzzwords:**  
> • Automatic Speech Recognition (ASR) • Whisper Tiny • Neural Machine Translation • GoogleTranslator • Text-to-Speech • gTTS • Multi-modal AI • End-to-End Inference • Real-Time • Edge Deployment

---

## ✨ Features

| 🔑 Feature                | 🔍 Description                                                 |
|---------------------------|---------------------------------------------------------------|
| **🎙️ ASR: Whisper-Tiny**    | Lightning-fast, on-device multilingual speech transcription    |
| **🌐 Translation**          | Bidirectional English ↔ Spanish via Deep-Translator            |
| **🗣️ Neural TTS**           | High-quality audio playback via the free Google Translate TTS |
| **⚡ Zero-infra CPU**       | Runs on 2 vCPU / 16 GB RAM—no GPU or paid APIs needed         |
| **🎨 Elegant UI**          | Intuitive Gradio Blocks—upload, buttons, transcripts, audio   |
| **🔧 Fully Modular**        | Swap models or add logging/analytics with minimal edits       |
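
The UI row above (upload, buttons, transcripts, audio) could be wired roughly as follows. This is a minimal sketch, not the code in `app.py`: `make_handler`, `build_ui`, and the `PipelineFn` signature are hypothetical names introduced for illustration, and the layout assumes one button per target language.

```python
from typing import Callable, Optional, Tuple

# Hypothetical signature: (audio_path, target_lang) -> (transcript, translation, mp3_path)
PipelineFn = Callable[[str, str], Tuple[str, str, str]]


def make_handler(run: PipelineFn, target_lang: str):
    """Return a click handler that translates the uploaded clip into target_lang."""
    def handler(audio_path: Optional[str]):
        if not audio_path:
            # Nothing uploaded yet: show a hint and leave the audio player empty.
            return "Please upload an audio file first.", "", None
        return run(audio_path, target_lang)
    return handler


def build_ui(run: PipelineFn):
    """Build a Blocks layout: upload, one button per language, transcripts, audio."""
    import gradio as gr  # imported lazily so the handler logic works without Gradio

    with gr.Blocks(title="Audio Translator") as demo:
        audio_in = gr.Audio(sources=["upload"], type="filepath", label="Upload speech")
        with gr.Row():
            to_en = gr.Button("Translate to English")
            to_es = gr.Button("Translate to Spanish")
        transcript = gr.Textbox(label="Transcript")
        translation = gr.Textbox(label="Translation")
        audio_out = gr.Audio(label="Spoken translation")
        outputs = [transcript, translation, audio_out]
        # Each button reuses the same pipeline with a different target language.
        to_en.click(make_handler(run, "en"), inputs=audio_in, outputs=outputs)
        to_es.click(make_handler(run, "es"), inputs=audio_in, outputs=outputs)
    return demo
```

Calling `build_ui(my_pipeline).launch()` would start the app, where `my_pipeline` is any function matching `PipelineFn`. Keeping the handler separate from the layout is one way to make models swappable without touching the UI.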

---

## 🏗️ Architecture & Workflow

1. **Audio Upload**  
   User uploads any `.wav` or `.mp3` clip.  
2. **ASR**  
   OpenAI’s `whisper-tiny` decodes speech into text.  
3. **MT**  
   `deep-translator`’s GoogleTranslator converts the text to the chosen language (English or Spanish).  
4. **TTS**  
   `gTTS` synthesizes the translated text into an `.mp3`.  
5. **UI Rendering**  
   Gradio presents the original transcript, the translation, and an audio player.
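
Steps 2 to 4 above can be sketched as three small functions chained by one driver. This is an illustrative sketch under stated assumptions, not the contents of `app.py`: it assumes Whisper Tiny is loaded through the `transformers` ASR pipeline with the `openai/whisper-tiny` checkpoint, and the function names (`whisper_transcriber`, `google_translate`, `gtts_synthesize`, `run_pipeline`) are hypothetical.

```python
import os
import tempfile
from typing import Callable


def whisper_transcriber(model: str = "openai/whisper-tiny") -> Callable[[str], str]:
    """Step 2: build a transcribe(audio_path) function backed by Whisper Tiny."""
    from transformers import pipeline  # lazy import: heavy dependency

    asr = pipeline("automatic-speech-recognition", model=model)
    return lambda audio_path: asr(audio_path)["text"].strip()


def google_translate(text: str, target_lang: str) -> str:
    """Step 3: translate via deep-translator, auto-detecting the source language."""
    from deep_translator import GoogleTranslator

    return GoogleTranslator(source="auto", target=target_lang).translate(text)


def gtts_synthesize(text: str, lang: str) -> str:
    """Step 4: write the spoken translation to a temporary .mp3 and return its path."""
    from gtts import gTTS

    fd, path = tempfile.mkstemp(suffix=".mp3")
    os.close(fd)
    gTTS(text=text, lang=lang).save(path)
    return path


def run_pipeline(audio_path, target_lang, transcribe,
                 translate=google_translate, synthesize=gtts_synthesize):
    """Chain ASR -> MT -> TTS and return (transcript, translation, mp3_path)."""
    transcript = transcribe(audio_path)
    if not transcript:
        # No speech recognized: skip translation and synthesis.
        return "", "", None
    translation = translate(transcript, target_lang)
    return transcript, translation, synthesize(translation, target_lang)
```

A call such as `run_pipeline("clip.wav", "es", transcribe=whisper_transcriber())` would exercise the full chain. Translation and synthesis both call Google's web services, so they need network access. Passing each stage as a parameter means a different model or service can be substituted for a stage without changing the driver.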

---

## 🛠️ Quick Start (Local Dev)

```bash
git clone https://github.com/<YOUR-USERNAME>/audio-translator.git
cd audio-translator
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt
python app.py
```

## Latest Update

- May 29, 2025: Upgraded the Whisper-Tiny model for faster ASR. 📝
- May 30, 2025: Improved translation accuracy for Spanish. ⚡ 📝

**Website**: https://ghostainews.com/  
**Discord**: https://discord.gg/BfA23aYz