title: Speechtranslate
emoji: ๐Ÿ†
colorFrom: red
colorTo: yellow
sdk: gradio
sdk_version: 5.31.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: text2speech+translate

๐ŸŒ๐Ÿ’ฌ Translate & Speak + Session Log



🚀 Overview

This Space combines language detection, on-the-fly translation, and neural TTS in a single CPU-only pipeline. English or Spanish input is auto-detected, translated, and read back as audio, and each request is added to a live session log for later review.

Key AI buzzwords:

Natural Language Processing (NLP) • Neural Text-to-Speech • Zero-shot language detection • Real-time inference • Session state management • Cloud-native deployment • User-centric design • Cost-efficient CPU runtime


✨ Features

| 🔑 Feature | Description |
| --- | --- |
| 🔄 Bidirectional Translation | English ↔ Spanish via deep-translator’s GoogleTranslator (auto-detect source language) |
| 🗣️ Neural TTS | High-fidelity speech generation with gTTS (Google Translate TTS) |
| 🕒 Real-Time Processing | Sub-second response on free CPU tier: no GPUs, no paid APIs |
| 📊 Session Logging | Data-driven UX: every input, translation, and audio event recorded in an interactive DataFrame |
| 🎨 Interactive UI | Sleek Gradio Blocks interface with controls for text input, language selector, and playback |
| 🔧 Zero-Config Dev | Drop-in app.py + requirements.txt; Spaces auto-builds and deploys |
| 💡 Extensible Architecture | Modular pipelines: swap translators, TTS engines, or add analytics with minimal code changes |
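
As a rough illustration of the translation, TTS, and swappable-engine features, here is a minimal sketch. It is not taken from this Space's app.py: the names `LANG_CODES`, `google_translate`, `gtts_speak`, and `translate_and_speak` are hypothetical, and the sketch assumes the language selector offers "English" and "Spanish". Only `GoogleTranslator(source="auto", target=...).translate(...)` and `gTTS(text=..., lang=...).save(...)` are real library calls, and both need network access.

```python
import tempfile
from typing import Callable, Tuple

# Hypothetical mapping from UI labels to the language codes both libraries accept.
LANG_CODES = {"English": "en", "Spanish": "es"}

Translator = Callable[[str, str], str]   # (text, target_code) -> translated text
Synthesizer = Callable[[str, str], str]  # (text, lang_code) -> path to an MP3 file


def google_translate(text: str, target: str) -> str:
    """Translate with deep-translator; the source language is auto-detected."""
    from deep_translator import GoogleTranslator  # lazy import, needs network access
    return GoogleTranslator(source="auto", target=target).translate(text)


def gtts_speak(text: str, lang: str) -> str:
    """Synthesize speech with gTTS and return the path of the saved MP3."""
    from gtts import gTTS  # lazy import, needs network access
    # Reserve a temp file name, then let gTTS write the audio to it.
    with tempfile.NamedTemporaryFile(suffix=".mp3", delete=False) as tmp:
        path = tmp.name
    gTTS(text=text, lang=lang).save(path)
    return path


def translate_and_speak(
    text: str,
    target_label: str,
    translator: Translator = google_translate,
    synthesizer: Synthesizer = gtts_speak,
) -> Tuple[str, str]:
    """Translate text into the selected language, then voice the translation."""
    code = LANG_CODES[target_label]
    translated = translator(text.strip(), code)
    return translated, synthesizer(translated, code)
```

Because the translator and synthesizer are plain callables, swapping in another engine means passing a different function, for example `translate_and_speak("Hello", "Spanish", translator=my_translator)` where `my_translator` is any function with the same signature.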

๐Ÿ—๏ธ Architecture & Workflow

  1. User Input
    • Free-form English or Spanish text; the source language is auto-detected.
  2. Translation
    • deep-translator → Google Translate API wrapper → high-accuracy text conversion.
  3. Text-to-Speech
    • gTTS → neural waveform synthesis → MP3 output.
  4. Session Log
    • Maintains a rolling table of [Input, Target Language, Translated Text] for audit trails and usage analytics.
  5. UI Rendering
    • Gradio Blocks orchestrates inputs, buttons, outputs, and state, delivering a seamless end-to-end experience.
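
The session-log and UI steps above could be wired together roughly as follows. This is a sketch under stated assumptions, not the Space's actual code: `append_to_log`, `make_handler`, `build_ui`, and the `MAX_ROWS` cap of 50 are hypothetical (the README does not say how long the rolling table is), and `process` stands for any function that maps `(text, target)` to `(translated_text, audio_path)`.

```python
from typing import Callable, List, Tuple

LOG_HEADERS = ["Input", "Target Language", "Translated Text"]
MAX_ROWS = 50  # assumption: the size of the rolling table is not documented

Log = List[List[str]]
Process = Callable[[str, str], Tuple[str, str]]  # (text, target) -> (translation, mp3 path)


def append_to_log(log: Log, text: str, target: str, translated: str,
                  max_rows: int = MAX_ROWS) -> Log:
    """Return a new log with one row appended, keeping only the newest rows."""
    return (log + [[text, target, translated]])[-max_rows:]


def make_handler(process: Process):
    """Wrap `process` so that each call also updates the session log."""
    def handler(text: str, target: str, log: Log):
        translated, audio_path = process(text, target)
        log = append_to_log(log, text, target, translated)
        # The log is returned twice: once for the table, once for the state.
        return translated, audio_path, log, log
    return handler


def build_ui(process: Process):
    """Assemble a Gradio Blocks app around the handler."""
    import gradio as gr  # lazy import so the helpers above work without Gradio

    handler = make_handler(process)
    with gr.Blocks(title="Translate & Speak") as demo:
        log_state = gr.State([])  # per-session log, starts empty
        text_in = gr.Textbox(label="Text")
        target = gr.Dropdown(["English", "Spanish"], value="Spanish",
                             label="Target language")
        run = gr.Button("Translate & Speak")
        translated_out = gr.Textbox(label="Translation")
        audio_out = gr.Audio(label="Speech", type="filepath")
        log_table = gr.Dataframe(headers=LOG_HEADERS, label="Session log")
        run.click(handler,
                  inputs=[text_in, target, log_state],
                  outputs=[translated_out, audio_out, log_table, log_state])
    return demo
```

Keeping the log in `gr.State` rather than a module-level list gives each browser session its own table, which matches the "session log" wording. Calling `build_ui(process).launch()` would start the app locally.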

๐Ÿ› ๏ธ Quick Start (Local Development)

```bash
git clone https://github.com/your-username/translate-speak-log.git
cd translate-speak-log
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt
python app.py
```