EmRa228 committed on
Commit
99a84dd
·
verified ·
1 Parent(s): 6d598f3

Update README.md

Files changed (1)
  1. README.md +36 -14
README.md CHANGED
@@ -1,14 +1,36 @@
- ---
- title: Edge TTS Text To Speech
- emoji: 👁
- colorFrom: pink
- colorTo: yellow
- sdk: gradio
- app_port: 7860
- sdk_version: 5.29.0
- app_file: app.py
- pinned: false
- license: gpl-2.0
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # Farsi Audio Chatbot
+
+ A Gradio-based application that lets users speak in Farsi, receive a chatbot response, and hear that response as Farsi audio.
+
+ ## Prerequisites
+ - Python 3.8 or higher
+
+ ## Installation
+ 1. Clone this repository.
+ 2. Create and activate a virtual environment.
+ 3. Install dependencies: `pip install -r requirements.txt`
+ 4. Run the application: `python app.py`
+
+ ## How It Works
+ - **Speech-to-Text (STT)**: Uses [Whisper small](https://huggingface.co/openai/whisper-small) for converting Farsi speech to text.
+ - **Natural Language Processing (NLP)**: Uses [HooshvareLab/gpt2-fa](https://huggingface.co/HooshvareLab/gpt2-fa) to generate Farsi text responses.
+ - **Text-to-Speech (TTS)**: Uses [edge-tts](https://github.com/rany2/edge-tts) with the `fa-IR-FaridNeural` voice for Farsi audio output.
+
+ ## Deployment on Hugging Face Spaces
+ To deploy on [Hugging Face Spaces](https://huggingface.co/spaces):
+ 1. Create a new Space.
+ 2. Upload this repository, including `requirements.txt`, `app.py`, and `README.md`.
+ 3. Ensure the Space has sufficient resources (at least 2 GB of RAM; a GPU is optional).
+ 4. The Space then builds and runs the app automatically.
+
+ **Note**: The current version processes one recording at a time, triggered by a button click. Continuous streaming would require further work, such as processing audio chunks in real time.
+
+ ## Limitations
+ - Whisper small may be less accurate in noisy environments.
+ - gpt2-fa is suitable for short responses but may struggle with complex conversations.
+ - Continuous audio streaming is not yet implemented.
+
+ ## Citations
+ - Whisper (Speech-to-Text): [openai/whisper-small](https://huggingface.co/openai/whisper-small)
+ - Chatbot (NLP): [HooshvareLab/gpt2-fa](https://huggingface.co/HooshvareLab/gpt2-fa)
+ - edge-tts (Text-to-Speech): [edge-tts](https://github.com/rany2/edge-tts)
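
As a rough illustration of the three stages listed under "How It Works" in the new README, the pipeline could be wired together roughly as follows. This is a sketch under assumptions, not the contents of `app.py`, which this diff does not show. The function names (`make_transcriber`, `make_generator`, `make_synthesizer`, `respond`), the `max_new_tokens` value, and the `reply.mp3` output path are all hypothetical. Only the model names and the voice come from the README.

```python
import asyncio
from typing import Callable, Tuple


def make_transcriber(model_name: str = "openai/whisper-small") -> Callable[[str], str]:
    """Build a speech-to-text function: audio file path -> transcript."""
    from transformers import pipeline  # lazy import: heavy dependency

    asr = pipeline("automatic-speech-recognition", model=model_name)
    # No language is forced here, so Whisper auto-detects it (an assumption).
    return lambda audio_path: asr(audio_path)["text"].strip()


def make_generator(model_name: str = "HooshvareLab/gpt2-fa",
                   max_new_tokens: int = 50) -> Callable[[str], str]:
    """Build a text-generation function: prompt -> generated text."""
    from transformers import pipeline  # lazy import: heavy dependency

    generator = pipeline("text-generation", model=model_name)

    def generate(prompt: str) -> str:
        outputs = generator(prompt, max_new_tokens=max_new_tokens)
        # By default the generated text includes the prompt itself.
        return outputs[0]["generated_text"]

    return generate


def make_synthesizer(voice: str = "fa-IR-FaridNeural") -> Callable[[str, str], str]:
    """Build a text-to-speech function: (text, output path) -> output path."""
    import edge_tts  # lazy import: needs network access at call time

    def synthesize(text: str, out_path: str) -> str:
        # Communicate.save is a coroutine; asyncio.run assumes no event
        # loop is already running in the calling thread.
        asyncio.run(edge_tts.Communicate(text, voice).save(out_path))
        return out_path

    return synthesize


def respond(audio_path: str,
            transcribe: Callable[[str], str],
            generate: Callable[[str], str],
            synthesize: Callable[[str, str], str],
            out_path: str = "reply.mp3") -> Tuple[str, str, str]:
    """Run one discrete turn: speech -> text -> reply text -> reply audio."""
    user_text = transcribe(audio_path)
    reply_text = generate(user_text)
    reply_audio = synthesize(reply_text, out_path)
    return user_text, reply_text, reply_audio
```

With the dependencies installed, a single turn would look like `respond("question.wav", make_transcriber(), make_generator(), make_synthesizer())`. Passing the stages in as plain callables keeps `respond` independent of the models, so any one stage can be swapped or stubbed without touching the others.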