---
sdk: gradio
python_version: "3.10"
app_file: app.py
---
# Farsi Audio Chatbot
This is a Gradio-based application that lets users speak in Farsi, get a reply from a chatbot, and hear that reply as Farsi audio.
## Prerequisites
- Python 3.8 or higher (the Space configuration above pins Python 3.10)
## Installation
1. Clone this repository.
2. Create and activate a virtual environment.
3. Install dependencies: `pip install -r requirements.txt`
4. Run the application: `python app.py`
## How It Works
- **Speech-to-Text (STT)**: Uses [Whisper small](https://huggingface.co/openai/whisper-small) for converting Farsi speech to text.
- **Natural Language Processing (NLP)**: Uses [HooshvareLab/gpt2-fa](https://huggingface.co/HooshvareLab/gpt2-fa) to generate Farsi text responses.
- **Text-to-Speech (TTS)**: Uses [edge-tts](https://github.com/rany2/edge-tts) with the `fa-IR-FaridNeural` voice for Farsi audio output.
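
The three stages above can be chained roughly as follows. This is a minimal sketch, not the code in `app.py`: the function names `respond` and `make_default_stages`, the output path `reply.mp3`, and the generation settings are assumptions made for illustration. It also assumes `transformers`, `torch`, and `edge-tts` are installed, that `ffmpeg` is available for decoding audio files, and that the installed `transformers` version accepts the Whisper `language` option.

```python
import asyncio
from typing import Callable, Tuple


def respond(
    audio_path: str,
    transcribe: Callable[[str], str],
    generate: Callable[[str], str],
    synthesize: Callable[[str, str], None],
    out_path: str = "reply.mp3",  # hypothetical output file name
) -> Tuple[str, str, str]:
    """Run one STT -> NLP -> TTS round trip and return (user text, reply text, audio path)."""
    user_text = transcribe(audio_path)
    reply_text = generate(user_text)
    synthesize(reply_text, out_path)
    return user_text, reply_text, out_path


def make_default_stages():
    """Build the three stages from the models named in this README.

    Heavy imports live inside the function so that importing this module
    does not download any model.
    """
    import edge_tts
    from transformers import pipeline

    stt = pipeline("automatic-speech-recognition", model="openai/whisper-small")
    nlp = pipeline("text-generation", model="HooshvareLab/gpt2-fa")

    def transcribe(path: str) -> str:
        # Assumption: forcing the language avoids Whisper auto-detecting another one.
        result = stt(path, generate_kwargs={"language": "persian", "task": "transcribe"})
        return result["text"].strip()

    def generate(prompt: str) -> str:
        # Assumption: 50 new tokens is an arbitrary cap for a short reply.
        outputs = nlp(prompt, max_new_tokens=50, do_sample=True, return_full_text=False)
        return outputs[0]["generated_text"].strip()

    def synthesize(text: str, out_path: str) -> None:
        # edge-tts is async; asyncio.run works when no event loop is running in this thread.
        asyncio.run(edge_tts.Communicate(text, "fa-IR-FaridNeural").save(out_path))

    return transcribe, generate, synthesize
```

Calling `respond("question.wav", *make_default_stages())` would run the full pipeline on a recorded file. Passing the stages in as arguments keeps each model replaceable, for example with a larger Whisper checkpoint.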
## Deployment on Hugging Face Spaces
To deploy on [Hugging Face Spaces](https://huggingface.co/spaces):
1. Create a new Space.
2. Upload this repository, including `requirements.txt`, `app.py`, and `README.md`.
3. Ensure the Space has sufficient resources (at least 2 GB of RAM; a GPU is optional).
4. The app will automatically build and run.
**Note**: The current version processes each recording as a single unit, triggered by a button click. Continuous streaming would need further work, such as processing the audio in real-time chunks.
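
As a rough illustration of what chunked processing could look like, the sketch below splits a buffer of audio samples into fixed-length, slightly overlapping windows that could each be sent to the speech-to-text stage. It is not part of the current app: the helper name `iter_chunks` and the default chunk and overlap durations are assumptions chosen for the example.

```python
from typing import Iterator, Sequence


def iter_chunks(
    samples: Sequence[float],
    sample_rate: int,
    chunk_seconds: float = 5.0,    # assumed window length
    overlap_seconds: float = 0.5,  # assumed overlap so words at a boundary are not cut
) -> Iterator[Sequence[float]]:
    """Yield consecutive windows of `samples`, each overlapping the previous one."""
    size = int(chunk_seconds * sample_rate)
    step = size - int(overlap_seconds * sample_rate)
    if size <= 0 or step <= 0:
        raise ValueError("chunk_seconds must be positive and larger than overlap_seconds")
    for start in range(0, len(samples), step):
        yield samples[start:start + size]
        # Stop once a window has reached the end of the buffer.
        if start + size >= len(samples):
            break
```

In a streaming setup, each yielded window would be transcribed as it arrives instead of waiting for the full recording. Merging the overlapping transcripts is a separate problem that this sketch does not address.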
## Limitations
- Whisper-small may have reduced accuracy in noisy environments.
- GPT2-fa is suitable for short responses but may struggle with complex conversations.
- Continuous audio streaming is not yet implemented.
## Citations
- Whisper (Speech-to-Text): [openai/whisper-small](https://huggingface.co/openai/whisper-small)
- Chatbot (NLP): [HooshvareLab/gpt2-fa](https://huggingface.co/HooshvareLab/gpt2-fa)
- edge-tts (Text-to-Speech): [edge-tts](https://github.com/rany2/edge-tts)