Edge-TTS-Text-to-Speech

EmRa228

Update README.md

2d20b22 verified 2 months ago

	---
	sdk: gradio
	python_version: "3.10"
	app_file: app.py
	---


	# Farsi Audio Chatbot

	This is a Gradio-based application that lets users speak in Farsi, receive a response from a chatbot, and hear that response as Farsi speech.

	## Prerequisites
	- Python 3.8 or higher (the Space metadata above pins `python_version` to 3.10)

	## Installation
	1. Clone this repository.
	2. Create and activate a virtual environment.
	3. Install dependencies: `pip install -r requirements.txt`
	4. Run the application: `python app.py`

	## How It Works
	- Speech-to-Text (STT): Uses [Whisper small](https://huggingface.co/openai/whisper-small) for converting Farsi speech to text.
	- Natural Language Processing (NLP): Uses [HooshvareLab/gpt2-fa](https://huggingface.co/HooshvareLab/gpt2-fa) to generate Farsi text responses.
	- Text-to-Speech (TTS): Uses [edge-tts](https://github.com/rany2/edge-tts) with the `fa-IR-FaridNeural` voice for Farsi audio output.
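The three stages above can be chained roughly as follows. This is a minimal sketch, not the repository's `app.py`: the function names (`transcribe`, `generate_reply`, `synthesize`, `respond`), the `max_new_tokens` default, and the `reply.mp3` output path are assumptions made for illustration. Only the model IDs and the voice name come from this README.

```python
import asyncio

WHISPER_MODEL = "openai/whisper-small"
GPT2_MODEL = "HooshvareLab/gpt2-fa"
TTS_VOICE = "fa-IR-FaridNeural"


def transcribe(audio_path: str) -> str:
    """Speech-to-text: run Whisper small on an audio file."""
    from transformers import pipeline  # heavy dependency, imported lazily

    asr = pipeline("automatic-speech-recognition", model=WHISPER_MODEL)
    return asr(audio_path)["text"].strip()


def generate_reply(prompt: str, max_new_tokens: int = 50) -> str:
    """NLP: let gpt2-fa continue the user's text (token limit is an assumption)."""
    from transformers import pipeline

    generator = pipeline("text-generation", model=GPT2_MODEL)
    full_text = generator(prompt, max_new_tokens=max_new_tokens)[0]["generated_text"]
    # The generated text usually starts with the prompt; keep only the continuation.
    if full_text.startswith(prompt):
        full_text = full_text[len(prompt):]
    return full_text.strip()


def synthesize(text: str, out_path: str = "reply.mp3") -> str:
    """Text-to-speech: save Farsi audio with edge-tts and return the file path."""
    import edge_tts

    # edge-tts is async; asyncio.run works here only when no event loop is running.
    asyncio.run(edge_tts.Communicate(text, TTS_VOICE).save(out_path))
    return out_path


def respond(audio_path, stt=transcribe, nlp=generate_reply, tts=synthesize) -> str:
    """Chain STT -> NLP -> TTS; stages are injectable so they can be swapped or stubbed."""
    user_text = stt(audio_path)
    reply_text = nlp(user_text)
    return tts(reply_text)
```

For brevity the sketch builds each `transformers` pipeline on every call; a long-running app would more likely load the models once at startup and reuse them.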

	## Deployment on Hugging Face Spaces
	To deploy on [Hugging Face Spaces](https://huggingface.co/spaces):
	1. Create a new Space.
	2. Upload this repository, including `requirements.txt`, `app.py`, and `README.md`.
	3. Ensure the Space has sufficient resources (at least 2 GB of RAM; a GPU is optional).
	4. The app will automatically build and run.

	Note: The current version processes each audio input as a discrete request, triggered by a button click. Continuous streaming would require additional work, such as processing audio chunks in real time.
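The discrete, button-triggered flow described in the note might be wired in Gradio along these lines. This is an illustrative sketch, not the actual `app.py`: `make_handler`, `build_demo`, the component labels, and the `respond` callable (any function that maps an input audio path to an output audio path) are all hypothetical names.

```python
from typing import Callable, Optional


def make_handler(respond: Callable[[str], str]) -> Callable[[Optional[str]], Optional[str]]:
    """Wrap a respond(audio_path) -> audio_path function for use as a click handler."""

    def on_click(audio_path: Optional[str]) -> Optional[str]:
        # Gradio passes None when nothing was recorded or uploaded.
        if not audio_path:
            return None
        return respond(audio_path)

    return on_click


def build_demo(respond: Callable[[str], str]):
    """Build a Blocks UI in which one button click processes one recording."""
    import gradio as gr  # imported lazily so the handler can be tested without Gradio

    with gr.Blocks() as demo:
        mic = gr.Audio(type="filepath", label="Speak in Farsi")
        reply = gr.Audio(label="Reply")
        send = gr.Button("Send")
        # Each click is a separate request: no audio is streamed between clicks.
        send.click(fn=make_handler(respond), inputs=mic, outputs=reply)
    return demo
```

Calling `build_demo(my_respond).launch()` would start the app, where `my_respond` stands in for whatever speech-to-speech function the application provides.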

	## Limitations
	- Whisper-small may have reduced accuracy in noisy environments.
	- GPT2-fa is suitable for short responses but may struggle with complex conversations.
	- Continuous audio streaming is not yet implemented.
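As a starting point for the streaming limitation, incoming audio could be split into fixed-length, slightly overlapping chunks that are transcribed one at a time. The sketch below shows only that chunking step on an in-memory sequence of samples. The helper name `iter_chunks` and the 5-second and 0.5-second defaults are assumptions, not values taken from this project.

```python
from typing import Iterator, Sequence


def iter_chunks(
    samples: Sequence[float],
    sample_rate: int,
    chunk_seconds: float = 5.0,     # assumed chunk length
    overlap_seconds: float = 0.5,   # assumed overlap to avoid cutting words at boundaries
) -> Iterator[Sequence[float]]:
    """Yield consecutive windows of `samples`, each overlapping the previous one."""
    chunk = int(chunk_seconds * sample_rate)
    overlap = int(overlap_seconds * sample_rate)
    if chunk <= 0 or not 0 <= overlap < chunk:
        raise ValueError("chunk must be positive and overlap must be smaller than chunk")
    step = chunk - overlap
    for start in range(0, len(samples), step):
        yield samples[start:start + chunk]
        # Stop once a window has reached the end of the signal.
        if start + chunk >= len(samples):
            break
```

The final chunk may be shorter than the others. A real streaming setup would also need to merge the overlapping transcripts, which this sketch does not attempt.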

	## Citations
	- Whisper (Speech-to-Text): [openai/whisper-small](https://huggingface.co/openai/whisper-small)
	- Chatbot (NLP): [HooshvareLab/gpt2-fa](https://huggingface.co/HooshvareLab/gpt2-fa)
	- edge-tts (Text-to-Speech): [edge-tts](https://github.com/rany2/edge-tts)