metadata
title: Accent Classifier
emoji: 🎙️
colorFrom: teal
colorTo: cyan
sdk: gradio
sdk_version: 3.38.1
app_file: app.py
pinned: false

Accent Classifier 🎙️

This app downloads a public YouTube or Vimeo video, extracts its audio, and runs a Hugging Face speaker-identification model on it. The predicted speaker ID is used as a proxy for the speaker's accent; the model does not label accents directly.

How it works

  1. You provide a video URL.
  2. The app downloads the audio using yt-dlp.
  3. It converts the audio to a format suitable for the model (WAV, 16 kHz, mono).
  4. It runs the superb/wav2vec2-base-superb-sid model to classify the speaker.
  5. It displays the predicted speaker ID and confidence.
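
The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of app.py: the function names (`download_audio`, `ffmpeg_command`, `to_wav`, `classify`) and the output file names are hypothetical, and doing the conversion as a separate ffmpeg call is an assumption about how step 3 is carried out. The third-party imports are deferred into the functions that need them.

```python
import subprocess
from pathlib import Path

# Model named in step 4 of this README.
MODEL_ID = "superb/wav2vec2-base-superb-sid"


def download_audio(url: str, out_dir: str) -> str:
    """Step 2: fetch the best available audio stream with yt-dlp."""
    import yt_dlp  # imported lazily so the helpers below work without it

    opts = {
        "format": "bestaudio/best",
        "outtmpl": str(Path(out_dir) / "source.%(ext)s"),  # hypothetical name
        "quiet": True,
    }
    with yt_dlp.YoutubeDL(opts) as ydl:
        info = ydl.extract_info(url, download=True)
        return ydl.prepare_filename(info)  # path of the downloaded file


def ffmpeg_command(src: str, dst: str) -> list:
    """Step 3: ffmpeg arguments for 16 kHz (-ar), mono (-ac 1) output."""
    return ["ffmpeg", "-y", "-i", src, "-ar", "16000", "-ac", "1", dst]


def to_wav(src: str, out_dir: str) -> str:
    """Run ffmpeg and return the path of the converted WAV file."""
    dst = str(Path(out_dir) / "audio.wav")  # hypothetical name
    subprocess.run(ffmpeg_command(src, dst), check=True)
    return dst


def classify(wav_path: str, top_k: int = 1) -> list:
    """Steps 4 and 5: return a list of {'label': ..., 'score': ...} dicts."""
    from transformers import pipeline

    clf = pipeline("audio-classification", model=MODEL_ID)
    return clf(wav_path, top_k=top_k)
```

With these helpers, a full run would look like `classify(to_wav(download_audio(url, tmp_dir), tmp_dir))`, where `tmp_dir` is any writable directory.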

Requirements

  • Python 3.8+
  • yt-dlp
  • ffmpeg installed on your system and accessible from the command line.
  • gradio for the UI.
  • transformers from Hugging Face.
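
Before launching, it can help to confirm that everything in this list is actually available. The helper below is a hypothetical sketch (it is not part of the app): it checks the Python version, looks for the Python packages by their import names, and looks for ffmpeg on the PATH.

```python
import importlib.util
import shutil
import sys


def missing_requirements(
    modules=("yt_dlp", "gradio", "transformers"),
    executables=("ffmpeg",),
):
    """Return the names of requirements that could not be found."""
    missing = []
    if sys.version_info < (3, 8):
        missing.append("python>=3.8")
    for name in modules:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    for exe in executables:
        # shutil.which returns None when the executable is not on the PATH.
        if shutil.which(exe) is None:
            missing.append(exe)
    return missing


if __name__ == "__main__":
    problems = missing_requirements()
    print("Missing:", ", ".join(problems) if problems else "nothing")
```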

Usage

Run the app:

python app.py
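
For orientation, here is one way the Gradio side of such an app could be wired up. This is an assumption about the UI, not a copy of app.py: `to_label_scores` and `build_ui` are hypothetical names, and `predict_fn` stands for whatever function takes a video URL and returns a label-to-confidence mapping.

```python
def to_label_scores(results):
    """Convert [{'label': ..., 'score': ...}, ...] into {label: score}."""
    return {str(r["label"]): float(r["score"]) for r in results}


def build_ui(predict_fn):
    """Build a one-input, one-output Gradio interface around predict_fn."""
    import gradio as gr  # imported lazily; only needed when the UI is built

    return gr.Interface(
        fn=predict_fn,
        inputs=gr.Textbox(label="Video URL"),
        # gr.Label renders a {label: confidence} dict as ranked bars.
        outputs=gr.Label(num_top_classes=3),
        title="Accent Classifier",
    )
```

Calling `build_ui(predict_fn).launch()` starts a local web server and prints the address to open in a browser.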