	---
	title: Accent Classifier + Transcriber
	emoji: 🎙️
	colorFrom: indigo
	colorTo: purple
	sdk: gradio
	sdk_version: "4.20.0"
	app_file: app.py
	pinned: false
	---


	# Accent Classifier + Speech Transcriber

	This Gradio app allows you to:

	- Upload or link to audio/video files
	- Automatically transcribe the speech (via OpenAI Whisper)
	- Detect the speaker's accent (28-class Wav2Vec2 model)
	- View a top-5 ranked list of likely accents with confidence scores

	---

	## How to Use

	Option 1: Upload an audio file
	- Supported formats: .mp3, .wav

	Option 2: Upload a video file
	- Supported format: .mp4 (audio will be extracted automatically)

	Option 3: Paste a direct .mp4 video URL
	- Must be a direct video file URL (not a webpage)
	- Example: a file hosted on archive.org or a CDN
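A quick way to tell a direct file URL from a webpage link is to look at the URL path. The sketch below is a heuristic only, and the function name `looks_like_direct_mp4` is hypothetical (it is not taken from `app.py`). It checks the scheme and the file extension without making any network request, so a URL can pass this check and still fail to download.

```python
from urllib.parse import urlparse


def looks_like_direct_mp4(url: str) -> bool:
    """Heuristic: True if `url` is an http(s) URL whose path ends in .mp4.

    Hypothetical helper for illustration; it does not contact the server.
    """
    parsed = urlparse(url)
    # Reject anything that is not a normal web URL with a host.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return False
    # Only the path is inspected, so query strings such as ?token=... are ignored.
    return parsed.path.lower().endswith(".mp4")


if __name__ == "__main__":
    for candidate in (
        "https://archive.org/download/some-item/clip.mp4",
        "https://www.youtube.com/watch?v=abc123",
    ):
        print(candidate, looks_like_direct_mp4(candidate))
```

A stricter check could also request the headers and confirm the `Content-Type` is a video type, at the cost of a network round trip.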

	---


	## Not Supported

	- Loom, YouTube, Dropbox, or other webpage links (these URLs return a web page, not the video file itself)
	- Workaround: download the video manually, then upload it as a file

	---

	## Models Used

	Transcription:
	- openai/whisper-tiny: https://huggingface.co/openai/whisper-tiny

	Accent Classification:
	- ylacombe/accent-classifier: https://huggingface.co/ylacombe/accent-classifier
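For local experiments, both models can be loaded through the `transformers` `pipeline` helper. This is a minimal sketch, assuming the two model IDs listed above and the standard `automatic-speech-recognition` and `audio-classification` pipeline tasks. The function name `load_pipelines` is hypothetical, and the code in `app.py` may load the models differently.

```python
ASR_MODEL_ID = "openai/whisper-tiny"
ACCENT_MODEL_ID = "ylacombe/accent-classifier"


def load_pipelines():
    """Return (transcriber, accent_classifier) pipelines.

    Hypothetical helper. The import is done lazily so this module can be
    imported without transformers installed; the models are downloaded
    from the Hugging Face Hub on first call.
    """
    from transformers import pipeline

    transcriber = pipeline("automatic-speech-recognition", model=ASR_MODEL_ID)
    accent_classifier = pipeline("audio-classification", model=ACCENT_MODEL_ID)
    return transcriber, accent_classifier


if __name__ == "__main__":
    transcriber, accent_classifier = load_pipelines()
    # "sample.wav" is a placeholder path to a local audio file.
    print(transcriber("sample.wav")["text"])
    print(accent_classifier("sample.wav", top_k=5))
```

Both pipelines decode audio files with ffmpeg, which is why it has to be installed separately.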

	---

	## Dependencies

	Handled automatically in Hugging Face Spaces.
	For local testing:

	pip install gradio transformers torch moviepy requests safetensors soundfile scipy

	You must also install ffmpeg:

	- macOS: brew install ffmpeg
	- Ubuntu: sudo apt install ffmpeg
	- Windows: Download from https://ffmpeg.org/
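Before running the app locally, it can help to confirm that ffmpeg is actually on the `PATH`. The sketch below uses only the standard library. The function name `ffmpeg_available` is hypothetical and is not part of this project.

```python
import shutil


def ffmpeg_available(executable: str = "ffmpeg") -> bool:
    """Return True if `executable` can be found on the current PATH."""
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found at", shutil.which("ffmpeg"))
    else:
        print("ffmpeg not found; install it before running the app")
```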

	---

	## How It Works

	1. Audio is extracted (if input is a video)
	2. Audio is converted to .wav and resampled to 16 kHz
	3. Speech is transcribed using Whisper
	4. Accent is classified using a Wav2Vec2 model
	5. Output includes:
	   - Top accent prediction
	   - Confidence score
	   - Top-5 accent list
	   - Full transcription
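The ranking in step 5 can be illustrated with a small sketch: raw classifier scores (logits) are turned into confidence scores with a softmax, then sorted to keep the top entries. This is an illustration under stated assumptions, not the code used in `app.py`. The function name `top_k_accents` and the example labels and logits are made up.

```python
import math


def top_k_accents(logits, labels, k=5):
    """Return the k most likely (label, probability) pairs, highest first.

    Hypothetical helper: applies a numerically stable softmax to `logits`
    and ranks the labels by the resulting probabilities.
    """
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    # Subtract the maximum before exponentiating to avoid overflow.
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    probs = [value / total for value in exps]
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Made-up labels and scores, for illustration only.
    example_labels = ["accent_a", "accent_b", "accent_c"]
    example_logits = [2.0, 0.5, 1.0]
    for label, prob in top_k_accents(example_logits, example_labels, k=2):
        print(f"{label}: {prob:.1%}")
```

The first entry of the returned list corresponds to the top accent prediction and its confidence score.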

	---