---
sdk: gradio
python_version: "3.10"
app_file: app.py
---
# Farsi Audio Chatbot
This Gradio application lets users speak in Farsi, get a reply from a chatbot, and hear that reply as Farsi audio.
## Prerequisites
- Python 3.8 or higher
## Installation
1. Clone this repository.
2. Create and activate a virtual environment.
3. Install dependencies: `pip install -r requirements.txt`
4. Run the application: `python app.py`
## How It Works
- **Speech-to-Text (STT)**: Uses [Whisper small](https://huggingface.co/openai/whisper-small) to convert Farsi speech to text.
- **Natural Language Processing (NLP)**: Uses [HooshvareLab/gpt2-fa](https://huggingface.co/HooshvareLab/gpt2-fa) to generate Farsi text responses.
- **Text-to-Speech (TTS)**: Uses [edge-tts](https://github.com/rany2/edge-tts) with the `fa-IR-FaridNeural` voice for Farsi audio output.
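
The three stages above can be chained roughly as follows. This is a minimal sketch, not the code in `app.py`: the function names (`load_stt`, `load_chatbot`, `load_tts`, `respond`) and the `reply.mp3` output path are hypothetical, and the `language` generation argument passed to Whisper is an assumption that depends on the installed `transformers` version. Decoding an audio file path in the speech-recognition pipeline also assumes `ffmpeg` is available.

```python
import asyncio
from typing import Callable, Optional, Tuple

WHISPER_MODEL = "openai/whisper-small"
GPT2_MODEL = "HooshvareLab/gpt2-fa"
TTS_VOICE = "fa-IR-FaridNeural"


def load_stt() -> Callable[[str], str]:
    """Return a function that maps an audio file path to transcribed text."""
    from transformers import pipeline  # imported lazily: heavy dependency

    asr = pipeline("automatic-speech-recognition", model=WHISPER_MODEL)

    def transcribe(audio_path: str) -> str:
        # Assumption: forcing Farsi this way works on recent transformers releases.
        result = asr(audio_path, generate_kwargs={"language": "fa", "task": "transcribe"})
        return result["text"].strip()

    return transcribe


def load_chatbot(max_new_tokens: int = 50) -> Callable[[str], str]:
    """Return a function that maps a Farsi prompt to a generated continuation."""
    from transformers import pipeline

    generator = pipeline("text-generation", model=GPT2_MODEL)

    def reply(prompt: str) -> str:
        out = generator(prompt, max_new_tokens=max_new_tokens, return_full_text=False)
        return out[0]["generated_text"].strip()

    return reply


def load_tts() -> Callable[[str, str], str]:
    """Return a function that synthesizes text to an audio file and returns its path."""
    import edge_tts

    def speak(text: str, out_path: str) -> str:
        # edge-tts is async; asyncio.run assumes no event loop is running in this thread.
        asyncio.run(edge_tts.Communicate(text, TTS_VOICE).save(out_path))
        return out_path

    return speak


def respond(
    audio_path: str,
    stt: Callable[[str], str],
    chatbot: Callable[[str], str],
    tts: Callable[[str, str], str],
    out_path: str = "reply.mp3",
) -> Tuple[str, str, Optional[str]]:
    """Run STT, then the chatbot, then TTS; return (transcript, answer, audio path)."""
    text = stt(audio_path)
    if not text:
        # Nothing was recognized, so skip generation and synthesis.
        return "", "", None
    answer = chatbot(text)
    return text, answer, tts(answer, out_path)
```

Passing the three stages in as callables keeps `respond` independent of the model libraries, so it can be exercised with stand-in functions before the real models are downloaded.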
## Deployment on Hugging Face Spaces
To deploy on [Hugging Face Spaces](https://huggingface.co/spaces):
1. Create a new Space.
2. Upload this repository, including `requirements.txt`, `app.py`, and `README.md`.
3. Ensure the Space has sufficient resources (at least 2 GB RAM; a GPU is optional).
4. The app will build and run automatically.
**Note**: The current version processes each recording as a discrete request, triggered by a button click. Continuous streaming would need further work, such as processing audio chunks in real time.
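
To illustrate what chunk processing could involve, here is a small sketch of a buffer that collects incoming audio samples and hands out fixed-length windows for transcription. It is not part of the current app: the `ChunkBuffer` class, the 5-second window, and the plain-list sample format are all assumptions made for illustration. The 16 kHz default matches the sampling rate Whisper models expect.

```python
from typing import List


class ChunkBuffer:
    """Collects audio samples and emits fixed-length windows (hypothetical helper)."""

    def __init__(self, sample_rate: int = 16000, window_seconds: float = 5.0):
        self.window_size = int(sample_rate * window_seconds)
        self._samples: List[float] = []

    def push(self, chunk: List[float]) -> List[List[float]]:
        """Add a chunk and return every complete window that is now available."""
        self._samples.extend(chunk)
        windows = []
        while len(self._samples) >= self.window_size:
            windows.append(self._samples[: self.window_size])
            self._samples = self._samples[self.window_size:]
        return windows

    def flush(self) -> List[float]:
        """Return whatever is left (shorter than one window) and reset the buffer."""
        rest, self._samples = self._samples, []
        return rest
```

Each window returned by `push` could then be passed to the speech-to-text stage, with `flush` called once the user stops speaking. Fixed windows can cut words in half, so a real streaming version would likely need overlap or silence detection.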
## Limitations
- Whisper-small may have reduced accuracy in noisy environments.
- GPT2-fa is suitable for short responses but may struggle with complex conversations.
- Continuous audio streaming is not yet implemented.
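
Because GPT2-fa works best with short responses, one possible mitigation is to trim the generated text before it is spoken. The sketch below is an illustration only and is not taken from `app.py`: the `trim_reply` helper and its default limits are hypothetical.

```python
import re

# Sentence-ending marks: period, exclamation mark, question mark,
# and the Arabic-script question mark (U+061F) used in Farsi.
_SENTENCE_END = re.compile(r"[.!?\u061F]")


def trim_reply(text: str, max_sentences: int = 2, max_chars: int = 200) -> str:
    """Keep at most `max_sentences` complete sentences and `max_chars` characters."""
    text = " ".join(text.split())  # collapse newlines and repeated spaces
    end = 0
    count = 0
    for match in _SENTENCE_END.finditer(text):
        end = match.end()
        count += 1
        if count == max_sentences:
            break
    # If a sentence end was found, drop any unfinished trailing fragment;
    # otherwise keep the text as it is.
    trimmed = text[:end] if end else text
    return trimmed[:max_chars].strip()
```

For example, `trim_reply("سلام. خوبی؟ امروز هوا خوب است.")` keeps only the first two sentences.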
## Citations
- Whisper (Speech-to-Text): [openai/whisper-small](https://huggingface.co/openai/whisper-small)
- Chatbot (NLP): [HooshvareLab/gpt2-fa](https://huggingface.co/HooshvareLab/gpt2-fa)
- edge-tts (Text-to-Speech): [edge-tts](https://github.com/rany2/edge-tts)