ghostai1 committed on May 28

Commit 16959a9 (verified), 1 parent: e41b97b

Update README.md

README.md +55 -1

@@ -11,4 +11,58 @@ license: apache-2.0
 short_description: Audio Translator
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

+# 🗣️ Audio Translator
+[![Hugging Face Space](https://img.shields.io/badge/HuggingFace-Spaces-blue?logo=huggingface)](https://huggingface.co/spaces/<YOUR-USERNAME>/audio-translator)
+![Gradio UI](https://img.shields.io/badge/Gradio-5.31.0-brightgreen?logo=gradio)
+![Model: Whisper Tiny](https://img.shields.io/badge/ASR-Whisper--tiny-orange)
+![Translator: Deep-Translator](https://img.shields.io/badge/Translator-GoogleTranslator-blue)
+![TTS: gTTS](https://img.shields.io/badge/TTS-gTTS-yellow)
+[![License](https://img.shields.io/badge/License-Apache--2.0-lightgrey)](LICENSE)
+---
+## 🚀 Overview
+Combine **ASR**, **machine translation**, and **neural TTS** into one **audio pipeline** that runs entirely on **CPU** on free-tier HF Spaces.
+Upload speech, let the app auto-detect the language, translate it into English or Spanish, then hear the translation spoken back.
+> **AI buzzwords:**
+> • Automatic Speech Recognition (ASR) • Whisper Tiny • Neural Machine Translation • GoogleTranslator • Text-to-Speech • gTTS • Multi-modal AI • End-to-End Inference • Real-Time • Edge Deployment
+---
+## ✨ Features
+| 🔑 Feature                | 🔍 Description                                                 |
+|---------------------------|---------------------------------------------------------------|
+| **🎙️ ASR: Whisper-Tiny**    | Fast, multilingual speech transcription on CPU                 |
+| **🌐 Translation**          | Bidirectional English ↔ Spanish via Deep-Translator            |
+| **🗣️ Neural TTS**           | High-quality audio playback via the free Google Translate TTS |
+| **⚡ Zero-infra CPU**       | Runs on 2 vCPU / 16 GB RAM—no GPU or paid APIs needed         |
+| **🎨 Elegant UI**          | Intuitive Gradio Blocks—upload, buttons, transcripts, audio   |
+| **🔧 Fully Modular**        | Swap models or add logging/analytics with minimal edits       |
+---
+## 🏗️ Architecture & Workflow
+1. **Audio Upload**
+   User uploads any `.wav` or `.mp3` clip.
+2. **ASR**
+   OpenAI’s `whisper-tiny` decodes speech into text.
+3. **MT**
+   `deep-translator`’s GoogleTranslator translates the text into the chosen target language.
+4. **TTS**
+   `gTTS` synthesizes the translated text into an `.mp3`.
+5. **UI Rendering**
+   Gradio presents the original transcript, the translation, and an audio player.
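
The five steps above can be sketched as one function that chains three stages. This is a minimal illustration, not the Space's actual `app.py` (which this commit does not show): `run_pipeline`, `PipelineResult`, and the three callable parameters are hypothetical names. In a real app each callable would presumably wrap the corresponding library (Whisper for `transcribe`, `deep-translator` for `translate`, `gTTS` for `synthesize`).

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PipelineResult:
    transcript: str   # text recognized from the uploaded audio
    translation: str  # transcript rendered in the target language
    audio_path: str   # path of the synthesized speech file


def run_pipeline(
    audio_path: str,
    target_lang: str,
    transcribe: Callable[[str], str],        # hypothetical ASR wrapper: path -> text
    translate: Callable[[str, str], str],    # hypothetical MT wrapper: (text, lang) -> text
    synthesize: Callable[[str, str], str],   # hypothetical TTS wrapper: (text, lang) -> file path
) -> PipelineResult:
    """Chain ASR -> MT -> TTS and return everything the UI would display."""
    transcript = transcribe(audio_path)
    translation = translate(transcript, target_lang)
    speech_path = synthesize(translation, target_lang)
    return PipelineResult(transcript, translation, speech_path)


if __name__ == "__main__":
    # Stub stages stand in for the real models so the sketch runs offline.
    result = run_pipeline(
        "clip.wav",
        "es",
        transcribe=lambda path: "hello world",
        translate=lambda text, lang: f"[{lang}] {text}",
        synthesize=lambda text, lang: f"out_{lang}.mp3",
    )
    print(result)
```

Passing the stages in as callables keeps the flow testable without downloading models or calling external services, and it matches the "Fully Modular" feature: swapping a model only means passing a different function.
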
+---
+## 🛠️ Quick Start (Local Dev)
+```bash
+git clone https://github.com/<YOUR-USERNAME>/audio-translator.git
+cd audio-translator
+python3 -m venv venv && source venv/bin/activate
+pip install -r requirements.txt
+python app.py