	---
	title: Accent Classifier + Transcriber
	emoji: 🎙️
	colorFrom: indigo
	colorTo: purple
	sdk: gradio
	sdk_version: "4.20.0"
	app_file: app.py
	pinned: false
	---

	# Accent Classifier + Speech Transcriber

	This Gradio app allows you to:

	- Upload an audio or video file, or paste a direct link to an .mp4 video
	- Automatically transcribe the speech (via OpenAI Whisper)
	- Detect the speaker's accent (28-class Wav2Vec2 model)
	- View a top-5 ranked list of likely accents with confidence scores

	---

	## How to Use

	Option 1: Upload an audio file
	- Supported formats: .mp3, .wav

	Option 2: Upload a video file
	- Supported format: .mp4 (audio will be extracted automatically)

	Option 3: Paste a direct .mp4 video URL
	- Must be a direct video file URL (not a webpage)
	- Example: a file hosted on archive.org or a CDN
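
The three options above amount to a dispatch on file extension and URL scheme. The following is a minimal sketch of that dispatch, not the app's actual code; the function name `input_kind` and the constant names are hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions taken from the options listed above.
AUDIO_EXTS = {".mp3", ".wav"}
VIDEO_EXTS = {".mp4"}


def input_kind(source: str) -> str:
    """Return 'audio', 'video', or 'url' for a supported input; raise ValueError otherwise."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        # Only the URL path is inspected, so query strings are ignored.
        if PurePosixPath(parsed.path).suffix.lower() in VIDEO_EXTS:
            return "url"
        raise ValueError(f"not a direct .mp4 URL: {source}")
    ext = PurePosixPath(source).suffix.lower()
    if ext in AUDIO_EXTS:
        return "audio"
    if ext in VIDEO_EXTS:
        return "video"
    raise ValueError(f"unsupported file type: {ext or '(none)'}")
```

For example, `input_kind("talk.mp4")` returns `"video"`, while a YouTube watch-page URL raises `ValueError` because its path has no `.mp4` suffix.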

	---

	## Not Supported

	- Loom, YouTube, Dropbox, and other webpage links: these URLs return an HTML page, not the video file itself
	- Workaround: download the video manually, then upload it
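
One way to tell a webpage link from a direct video link before downloading is to look at the `Content-Type` header the server returns. This is a sketch under that assumption, not something the app is known to do; both function names are hypothetical, and `requests` is imported lazily so the header check can be used on its own.

```python
def is_video_content_type(content_type: str) -> bool:
    """True if a Content-Type header value names a video media type."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type.startswith("video/")


def url_serves_video(url: str, timeout: float = 10.0) -> bool:
    """Send a HEAD request and check whether the response is a video file."""
    import requests  # third-party; included in the install command in this README

    response = requests.head(url, allow_redirects=True, timeout=timeout)
    return is_video_content_type(response.headers.get("Content-Type", ""))
```

A page on a video-sharing site typically answers with `text/html`, which this check rejects. Some hosts serve video files as `application/octet-stream`; this sketch would reject those too, so treat a negative result as a hint rather than proof.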

	---

	## Models Used

	Transcription:
	- [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny)

	Accent Classification:
	- [ylacombe/accent-classifier](https://huggingface.co/ylacombe/accent-classifier)
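
Both models can in principle be loaded through the `transformers` `pipeline` helper. The sketch below assumes each checkpoint works with the standard pipeline task shown; the README does not say how `app.py` loads them, and `MODEL_IDS` and `load_pipelines` are hypothetical names.

```python
# Pipeline task name -> model ID from the list above.
MODEL_IDS = {
    "automatic-speech-recognition": "openai/whisper-tiny",
    "audio-classification": "ylacombe/accent-classifier",
}


def load_pipelines():
    """Build one transformers pipeline per task (downloads weights on first call)."""
    from transformers import pipeline  # imported lazily because it is slow to load

    return {task: pipeline(task, model=model_id) for task, model_id in MODEL_IDS.items()}
```

With the pipelines loaded, `pipes["automatic-speech-recognition"]("sample.wav")` returns a dict with a `"text"` key, and `pipes["audio-classification"]("sample.wav", top_k=5)` returns a list of `{"label", "score"}` dicts. The file name `sample.wav` is a placeholder.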

	---

	## Running Locally

	To run the app locally:

	1. Clone the repository
	```bash
	git clone https://huggingface.co/spaces/usamaijaz-ai/accent-classifier
	cd accent-classifier
	```

	2. Create a virtual environment (optional but recommended)
	```bash
	python -m venv venv
	source venv/bin/activate # On Windows: venv\Scripts\activate
	```

	3. Install the dependencies
	```bash
	pip install -r requirements.txt
	```

	If the repository has no `requirements.txt`, install the packages directly:
	```bash
	pip install gradio==4.20.0 transformers torch moviepy==1.0.3 requests safetensors soundfile scipy
	```

	4. Install ffmpeg
	- macOS: `brew install ffmpeg`
	- Ubuntu: `sudo apt install ffmpeg`
	- Windows: [Download here](https://ffmpeg.org/download.html) and add to PATH

	5. Run the app
	```bash
	python app.py
	```

	6. Open the app in your browser

	   Visit `http://localhost:7860` to use the app locally.
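
If the app fails to start, a quick preflight check can show which dependency is missing. The sketch below is an optional helper, not part of the repository; the module list mirrors the `pip install` command above, and the function name `missing_requirements` is hypothetical.

```python
import importlib.util
import shutil

# Import names for the packages in the pip command above.
REQUIRED_MODULES = [
    "gradio", "transformers", "torch", "moviepy",
    "requests", "safetensors", "soundfile", "scipy",
]


def missing_requirements(modules=REQUIRED_MODULES, executables=("ffmpeg",)):
    """Return the names of modules that cannot be found and executables not on PATH."""
    missing = [name for name in modules if importlib.util.find_spec(name) is None]
    missing += [exe for exe in executables if shutil.which(exe) is None]
    return missing


if __name__ == "__main__":
    problems = missing_requirements()
    print("Missing:", ", ".join(problems) if problems else "nothing")
```

The check only confirms that each package and `ffmpeg` can be located; it does not verify versions.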

	---

	## How It Works

	1. Audio is extracted (if input is a video)
	2. Audio is converted to `.wav` and resampled to 16 kHz
	3. Speech is transcribed using Whisper
	4. Accent is classified using a Wav2Vec2 model
	5. Output includes:
	   - Top accent prediction
	   - Confidence score
	   - Top-5 accent list
	   - Full transcription
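
Two of the steps above can be sketched without loading any model: the audio conversion (steps 1 and 2) and the top-5 ranking (step 5). The code below is an illustration under stated assumptions, not the app's implementation. It assumes conversion is done by calling `ffmpeg` directly and that the output is mono, neither of which the README confirms, and it assumes the classifier yields one raw score (logit) per accent label. Both function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def ffmpeg_to_wav_cmd(src: str, dst: str, sample_rate: int = 16000) -> List[str]:
    """Build an ffmpeg command that drops video and writes resampled WAV audio."""
    # -y overwrite output, -vn no video, -ac 1 mono (assumption), -ar sample rate in Hz
    return ["ffmpeg", "-y", "-i", src, "-vn", "-ac", "1", "-ar", str(sample_rate), dst]


def top_k_accents(
    logits: Sequence[float], labels: Sequence[str], k: int = 5
) -> List[Tuple[str, float]]:
    """Turn logits into softmax probabilities and return the k most likely labels."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    scored = zip(labels, (e / total for e in exps))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```

The command list can be passed to `subprocess.run(cmd, check=True)`. In the ranking, the first pair is the top accent prediction and its probability is the confidence score.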

	---