ash-171 committed on
Commit 7474813 · verified · 1 Parent(s): 21019df

Update README.md

Files changed (1)
  1. README.md +6 -4
README.md CHANGED
@@ -14,7 +14,7 @@ license: mit
 
  # Accent Analyzer
 
- This is a Streamlit-based web application that analyzes the English accent in spoken videos. Users can provide a public video URL (MP4), receive a transcription of the speech, and ask follow-up questions based on the transcript using Gemma3.
 
  ## What It Does
 
@@ -31,7 +31,7 @@ This is a Streamlit-based web application that analyzes the English accent in sp
  - **Streamlit** - UI
  - **OpenAI Whisper (medium)**: For speech-to-text transcription.
  - **Jzuluaga/accent-id-commonaccent_xlsr-en-english**: For English accent classification.
- - **Gemma3 via Ollama**: For generating answers to follow-up questions using context from the transcript.
  - **Docker** - containerized for deployment
  - **Hugging Face Spaces** - for hosting with CPU
 
@@ -42,6 +42,8 @@ This is a Streamlit-based web application that analyzes the English accent in sp
  ```
  accent-analyzer/
  ├── Dockerfile # Container setup
  ├── requirements.txt # Python dependencies
  ├── streamlit_app.py # Main UI app
  └── src/
@@ -109,7 +111,7 @@ langgraph>=0.0.20
 
  ## Notes
 
- - Gemma3 is accessed via **Ollama** inside Docker - ensure it pulls on build.
  - `custome_interface.py` is required by the accent model - it's automatically downloaded in Dockerfile.
  - Video URLs must be **direct links** to `.mp4` files.
 
@@ -138,7 +140,7 @@ This project uses the following models, frameworks, and tools:
  - [SpeechBrain](https://speechbrain.readthedocs.io/): Toolkit used for building and fine-tuning speech processing models.
  - [Accent-ID CommonAccent](https://huggingface.co/Jzuluaga/accent-id-commonaccent_xlsr-en-english): Fine-tuned wav2vec2 model hosted on Hugging Face for English accent classification.
  - [CustomEncoderWav2vec2Classifier](https://huggingface.co/Jzuluaga/accent-id-commonaccent_xlsr-en-english/blob/main/custom_interface.py): Custom interface used to load and run the accent model.
- - [Gemma3](https://ollama.com/library/gemma3) via [Ollama](https://ollama.com): Large language model used for natural language follow-up based on transcripts.
  - [Streamlit](https://streamlit.io): Python framework for building web applications.
  - [Hugging Face Spaces](https://huggingface.co/spaces): Platform used for deploying this application on GPU infrastructure.
 
 
 
  # Accent Analyzer
 
+ This is a Streamlit-based web application that analyzes the English accent in spoken videos. Users can provide a public video URL (MP4), receive a transcription of the speech, and ask follow-up questions based on the transcript using Gemma3:1b.
 
  ## What It Does
 
 
  - **Streamlit** - UI
  - **OpenAI Whisper (medium)**: For speech-to-text transcription.
  - **Jzuluaga/accent-id-commonaccent_xlsr-en-english**: For English accent classification.
+ - **Gemma3:1b via Ollama**: For generating answers to follow-up questions using context from the transcript.
  - **Docker** - containerized for deployment
  - **Hugging Face Spaces** - for hosting with CPU
 
 
  ```
  accent-analyzer/
  ├── Dockerfile # Container setup
+ ├── start.sh # Serving Ollama and app setup
+ ├── README.md # Instruction about the app
  ├── requirements.txt # Python dependencies
  ├── streamlit_app.py # Main UI app
  └── src/
 
 
  ## Notes
 
+ - Gemma3:1b is accessed via **Ollama** inside Docker - ensure it pulls on build.
  - `custome_interface.py` is required by the accent model - it's automatically downloaded in Dockerfile.
  - Video URLs must be **direct links** to `.mp4` files.
 
 
  - [SpeechBrain](https://speechbrain.readthedocs.io/): Toolkit used for building and fine-tuning speech processing models.
  - [Accent-ID CommonAccent](https://huggingface.co/Jzuluaga/accent-id-commonaccent_xlsr-en-english): Fine-tuned wav2vec2 model hosted on Hugging Face for English accent classification.
  - [CustomEncoderWav2vec2Classifier](https://huggingface.co/Jzuluaga/accent-id-commonaccent_xlsr-en-english/blob/main/custom_interface.py): Custom interface used to load and run the accent model.
+ - [Gemma3:1b](https://ollama.com/library/gemma3:1b) via [Ollama](https://ollama.com): Large language model used for natural language follow-up based on transcripts.
  - [Streamlit](https://streamlit.io): Python framework for building web applications.
  - [Hugging Face Spaces](https://huggingface.co/spaces): Platform used for deploying this application on GPU infrastructure.
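
For readers wondering what "Gemma3:1b via Ollama" might look like in practice, here is a minimal sketch of asking a follow-up question against a transcript through Ollama's local HTTP API. It is not taken from this repository: the function names `build_payload` and `ask`, the prompt wording, and the assumption that Ollama listens on its default port 11434 inside the container are all illustrative, and the actual app may go through a different client (the requirements mention `langgraph`).

```python
import json
import urllib.request

# Assumption: Ollama is serving on its default local port inside the container.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "gemma3:1b"  # model tag named in the updated README


def build_payload(transcript: str, question: str, model: str = MODEL) -> dict:
    """Build a non-streaming /api/generate request body (hypothetical helper)."""
    prompt = (
        "Answer the question using only the transcript below.\n\n"
        f"Transcript:\n{transcript.strip()}\n\n"
        f"Question: {question.strip()}"
    )
    return {"model": model, "prompt": prompt, "stream": False}


def ask(transcript: str, question: str, url: str = OLLAMA_URL, timeout: float = 120.0) -> str:
    """Send the payload to Ollama and return the generated answer text."""
    data = json.dumps(build_payload(transcript, question)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    # With "stream": False, Ollama returns one JSON object with a "response" field.
    return body["response"]
```

Calling `ask(transcript, "Which accent was detected?")` would only succeed once the model has been pulled and the Ollama server is running, which is presumably what `start.sh` takes care of.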