---
title: Accent Classifier + Transcriber
emoji: 🎙️
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: "4.20.0"
app_file: app.py
pinned: false
---
# Accent Classifier + Speech Transcriber
This Gradio app allows you to:
- Upload or link to audio/video files
- Automatically transcribe the speech (via OpenAI Whisper)
- Detect the speaker's accent (28-class Wav2Vec2 model)
- View a top-5 ranked list of likely accents with confidence scores
---
## How to Use
Option 1: Upload an audio file
- Supported formats: .mp3, .wav
Option 2: Upload a video file
- Supported format: .mp4 (audio will be extracted automatically)
Option 3: Paste a direct .mp4 video URL
- Must be a direct video file URL (not a webpage)
- Example: a file hosted on archive.org or a CDN
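
A quick way to tell a direct file URL from a webpage link is to look at the URL path. The sketch below is only a cheap pre-check under that assumption: `looks_like_direct_mp4` is a hypothetical helper, not a function from `app.py`, and a URL that passes can still fail to download.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def looks_like_direct_mp4(url: str) -> bool:
    """Return True if the URL is http(s) and its path ends in .mp4.

    Hypothetical helper for illustration; it does not contact the server.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    # Only the path is inspected, so query strings such as ?download=1 are ignored.
    return PurePosixPath(parsed.path).suffix.lower() == ".mp4"


print(looks_like_direct_mp4("https://example.com/media/talk.mp4"))      # True
print(looks_like_direct_mp4("https://www.youtube.com/watch?v=abc123"))  # False
```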
---
## Not Supported
- Loom, YouTube, Dropbox, and other webpage links (these URLs return a web page, not the video file itself)
- Workaround: download the video manually, then upload it
---
## Models Used
Transcription:
- openai/whisper-tiny: https://huggingface.co/openai/whisper-tiny
Accent Classification:
- ylacombe/accent-classifier: https://huggingface.co/ylacombe/accent-classifier
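
For reference, both models can be loaded through the `transformers` `pipeline` API. This is a minimal sketch, not the code in `app.py`: the function names `load_pipelines` and `analyze` are hypothetical, and it assumes the accent model is compatible with the `audio-classification` pipeline. Weights are downloaded on first use, and reading a file path relies on ffmpeg being installed.

```python
# Model IDs as listed above.
MODEL_IDS = {
    "transcription": "openai/whisper-tiny",
    "accent": "ylacombe/accent-classifier",
}


def load_pipelines():
    """Build the transcription and accent pipelines (hypothetical helper)."""
    # Imported lazily because transformers is a heavy dependency.
    from transformers import pipeline

    transcriber = pipeline(
        "automatic-speech-recognition", model=MODEL_IDS["transcription"]
    )
    classifier = pipeline("audio-classification", model=MODEL_IDS["accent"])
    return transcriber, classifier


def analyze(wav_path: str):
    """Return (transcript, top-5 accent predictions) for one audio file."""
    transcriber, classifier = load_pipelines()
    # Long recordings may need chunking, e.g. chunk_length_s=30 on the pipeline.
    text = transcriber(wav_path)["text"]
    # Each prediction is a dict with "label" and "score" keys.
    accents = classifier(wav_path, top_k=5)
    return text, accents
```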
---
## Dependencies
Handled automatically in Hugging Face Spaces.
For local testing, install the Python packages:

`pip install gradio transformers torch moviepy requests safetensors soundfile scipy`

You must also install ffmpeg:
- macOS: `brew install ffmpeg`
- Ubuntu: `sudo apt install ffmpeg`
- Windows: Download from https://ffmpeg.org/
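
Before running the app locally, it can help to confirm that everything above is actually installed. The following is a small preflight sketch using only the standard library. The helper names are hypothetical, and the check assumes each package's import name matches its pip name, which holds for the packages listed here.

```python
import importlib.util
import shutil

# Packages from the pip install line above.
REQUIRED_PACKAGES = [
    "gradio", "transformers", "torch", "moviepy",
    "requests", "safetensors", "soundfile", "scipy",
]


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def ffmpeg_available() -> bool:
    """Return True if an ffmpeg executable is on PATH."""
    return shutil.which("ffmpeg") is not None


print("Missing packages:", missing_packages() or "none")
print("ffmpeg found:", ffmpeg_available())
```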
---
## How It Works
1. Audio is extracted (if input is a video)
2. Audio is converted to .wav and resampled to 16 kHz
3. Speech is transcribed using Whisper
4. Accent is classified using a Wav2Vec2 model
5. Output includes:
   - Top accent prediction
   - Confidence score
   - Top-5 accent list
   - Full transcription
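
Steps 1, 2, and 5 above can be sketched without loading any model. This is an illustration under stated assumptions, not the app's implementation: the app may use moviepy rather than calling ffmpeg directly, the mono downmix (`-ac 1`) is an assumption, and the confidence scores are assumed to come from a softmax over the classifier's logits. The function names and the placeholder labels are hypothetical.

```python
import math


def build_wav_command(src: str, dst_wav: str, sample_rate: int = 16000) -> list:
    """Build an ffmpeg command that drops video and writes resampled mono audio."""
    return [
        "ffmpeg", "-y",           # overwrite the output if it exists
        "-i", src,                # input audio or video file
        "-vn",                    # ignore any video stream
        "-ac", "1",               # downmix to mono (assumption)
        "-ar", str(sample_rate),  # resample, 16 kHz by default
        dst_wav,
    ]


def top_k_accents(logits, labels, k=5):
    """Softmax the logits and return the k most likely (label, probability) pairs."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Placeholder labels and logits, not real model output.
labels = ["accent_a", "accent_b", "accent_c"]
ranked = top_k_accents([2.0, 0.5, 1.0], labels, k=2)
top_label, top_confidence = ranked[0]
print(build_wav_command("input.mp4", "audio.wav"))
print(top_label, round(top_confidence, 3))
```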
---