Update README.md
--- a/README.md
+++ b/README.md
@@ -14,7 +14,7 @@ license: mit
 
 # Accent Analyzer
 
-This is a Streamlit-based web application that analyzes the English accent in spoken videos. Users can provide a public video URL (MP4), receive a transcription of the speech, and ask follow-up questions based on the transcript using Gemma3.
+This is a Streamlit-based web application that analyzes the English accent in spoken videos. Users can provide a public video URL (MP4), receive a transcription of the speech, and ask follow-up questions based on the transcript using Gemma3:1b.
 
 ## What It Does
 
@@ -31,7 +31,7 @@ This is a Streamlit-based web application that analyzes the English accent in sp
 - **Streamlit** - UI
 - **OpenAI Whisper (medium)**: For speech-to-text transcription.
 - **Jzuluaga/accent-id-commonaccent_xlsr-en-english**: For English accent classification.
-- **Gemma3 via Ollama**: For generating answers to follow-up questions using context from the transcript.
+- **Gemma3:1b via Ollama**: For generating answers to follow-up questions using context from the transcript.
 - **Docker** - containerized for deployment
 - **Hugging Face Spaces** - for hosting with CPU
 
@@ -42,6 +42,8 @@ This is a Streamlit-based web application that analyzes the English accent in sp
 ```
 accent-analyzer/
 ├── Dockerfile # Container setup
+├── start.sh # Serving Ollama and app setup
+├── README.md # Instruction about the app
 ├── requirements.txt # Python dependencies
 ├── streamlit_app.py # Main UI app
 ├── src/
@@ -109,7 +111,7 @@ langgraph>=0.0.20
 
 ## Notes
 
-- Gemma3 is accessed via **Ollama** inside Docker - ensure it pulls on build.
+- Gemma3:1b is accessed via **Ollama** inside Docker - ensure it pulls on build.
 - `custom_interface.py` is required by the accent model - it's automatically downloaded in Dockerfile.
 - Video URLs must be **direct links** to `.mp4` files.
 
@@ -138,7 +140,7 @@ This project uses the following models, frameworks, and tools:
 - [SpeechBrain](https://speechbrain.readthedocs.io/): Toolkit used for building and fine-tuning speech processing models.
 - [Accent-ID CommonAccent](https://huggingface.co/Jzuluaga/accent-id-commonaccent_xlsr-en-english): Fine-tuned wav2vec2 model hosted on Hugging Face for English accent classification.
 - [CustomEncoderWav2vec2Classifier](https://huggingface.co/Jzuluaga/accent-id-commonaccent_xlsr-en-english/blob/main/custom_interface.py): Custom interface used to load and run the accent model.
-- [Gemma3](https://ollama.com/library/gemma3) via [Ollama](https://ollama.com): Large language model used for natural language follow-up based on transcripts.
+- [Gemma3:1b](https://ollama.com/library/gemma3:1b) via [Ollama](https://ollama.com): Large language model used for natural language follow-up based on transcripts.
 - [Streamlit](https://streamlit.io): Python framework for building web applications.
 - [Hugging Face Spaces](https://huggingface.co/spaces): Platform used for deploying this application on GPU infrastructure.
 
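
The README describes answering follow-up questions with `gemma3:1b` served by Ollama, using the transcript as context. A minimal sketch of what such a call could look like is given here, assuming Ollama is listening on its default local port (11434) and using its `/api/generate` endpoint. The helper names `build_payload` and `ask_followup`, the prompt wording, and the timeout are hypothetical and are not taken from the repository, which may well use a different client (its requirements list `langgraph`).

```python
import json
import urllib.request

# Assumption: Ollama's default local endpoint; the app may configure this differently.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "gemma3:1b"  # model tag named in the README


def build_payload(transcript: str, question: str, model: str = MODEL) -> dict:
    """Build a non-streaming /api/generate request body (hypothetical helper)."""
    prompt = (
        "Answer the question using only the transcript below.\n\n"
        f"Transcript:\n{transcript}\n\n"
        f"Question: {question}\nAnswer:"
    )
    # "stream": False asks Ollama to return one JSON object instead of chunks.
    return {"model": model, "prompt": prompt, "stream": False}


def ask_followup(transcript: str, question: str, url: str = OLLAMA_URL) -> str:
    """Send the request to a running Ollama server and return the generated text."""
    data = json.dumps(build_payload(transcript, question)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        body = json.loads(response.read().decode("utf-8"))
    # For non-streaming requests the generated text is in the "response" field.
    return body["response"]
```

Calling `ask_followup(transcript, "What is the speaker talking about?")` only works once the Ollama server is running and the `gemma3:1b` model has been pulled. Keeping `build_payload` separate from the network call makes the prompt construction easy to check without a server.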