tusker123 committed on
Commit 9c84d33 · verified · 1 Parent(s): 190f6a1

Update README.md

Files changed (1)
  1. README.md +92 -10
README.md CHANGED
@@ -1,13 +1,95 @@
1
  ---
2
- title: Accent Classifier
3
- emoji: 🌖
4
- colorFrom: pink
5
- colorTo: indigo
6
- sdk: gradio
7
- sdk_version: 5.31.0
8
- app_file: app.py
9
- pinned: false
10
- short_description: This Gradio app analyses English access from video file
11
  ---
12
 
13
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
1
  ---
2
+ language: en
3
+ tags:
4
+ - audio-classification
5
+ - accent-classification
6
+ - english-accents
7
+ - video-analysis
8
+ - gradio
9
+ - transformers
10
+ license: apache-2.0
11
+ model-index:
12
+ - name: english-accent-classifier
13
+ results:
14
+ - task: audio-classification
15
+ dataset: custom
16
+ metric: accuracy # Or other relevant metrics if available
17
+ value: N/A
18
  ---
19
 
20
+ # English Accent Classifier with Video Analysis
21
+
22
+ This Gradio application analyzes English accents from audio extracted from video files. You can provide a video either via a direct URL or by uploading a file from your local machine.
23
+
24
+ ## How it Works
25
+
26
+ 1. **Input Video:** Provide a video URL (MP4, Loom, Dropbox, Google Drive direct links) or upload a video file.
27
+ 2. **Video Processing:** The application downloads the video from the URL or reads the uploaded file.
28
+ 3. **Audio Extraction:** The full audio and a short segment (15 seconds) are extracted.
29
+ 4. **Language Detection:** The short audio is transcribed, and the language is detected.
30
+ 5. **Accent Classification (if English):** A longer audio segment (adjustable duration) is analyzed to predict the English accent.
31
+ 6. **Results:** The detected language, predicted accent, confidence scores, and an audio player for the full extracted audio are displayed.
32
+
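The steps above can be sketched as follows. This is a minimal illustration, not the application's actual code: the function name `analyze_audio`, the two injected callables, the `"en"` language label, and the 30-second default for the longer segment are assumptions introduced here. Only the 15-second short segment comes from the README.

```python
from typing import Callable, Dict, Optional, Sequence


def analyze_audio(
    samples: Sequence[float],
    sample_rate: int,
    detect_language: Callable[[Sequence[float]], str],
    classify_accent: Callable[[Sequence[float]], Dict[str, float]],
    short_seconds: int = 15,   # short segment length stated in the README
    accent_seconds: int = 30,  # hypothetical default; the README only says "adjustable"
) -> Dict[str, Optional[object]]:
    """Run language detection on a short segment, then accent
    classification on a longer one, but only if the language is English."""
    # Step 3: take the short segment from the start of the audio.
    short_segment = samples[: short_seconds * sample_rate]

    # Step 4: detect the language from the short segment.
    language = detect_language(short_segment)
    result: Dict[str, Optional[object]] = {
        "language": language,
        "accent": None,
        "scores": None,
    }

    # Step 5: skip accent classification for non-English audio.
    if language != "en":  # assumed label for English
        return result

    longer_segment = samples[: accent_seconds * sample_rate]
    scores = classify_accent(longer_segment)

    # Step 6: report the highest-scoring accent with all scores.
    result["accent"] = max(scores, key=scores.get)
    result["scores"] = scores
    return result
```

Passing the detector and classifier in as callables keeps the gating logic separate from model loading, so it can be exercised with simple stand-in functions.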
33
+ ## Features
34
+
35
+ * **English Accent Classification:** Predicts the accent in English audio.
36
+ * **Language Detection:** Ensures the audio is English before accent analysis.
37
+ * **Flexible Video Input:** Supports URLs and file uploads.
38
+ * **Adjustable Analysis Duration:** Users can set the audio analysis length.
39
+ * **Audio Playback:** Allows users to listen to the extracted audio.
40
+
41
+ ## Tech Stack
42
+
43
+ * [Gradio](https://gradio.app/): Interactive web UI.
44
+ * [Hugging Face Transformers](https://huggingface.co/transformers/): Pre-trained models and pipelines.
45
+ * [Requests](https://requests.readthedocs.io/en/latest/): Downloading video files.
46
+ * [MoviePy](https://zulko.github.io/moviepy/): Video editing for audio extraction.
47
+ * [PyTorch](https://pytorch.org/): Underlying deep learning framework.
48
+ * [Soundfile](https://pysoundfile.readthedocs.io/en/latest/): Audio file handling.
49
+
50
+ ## Models Used
51
+
52
+ * **Accent Classification:** `dima806/english_accents_classification`
53
+ * **Language Detection:** `alexneakameni/language_detection`
54
+ * **Automatic Speech Recognition:** `openai/whisper-tiny.en`
55
+
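As a rough sketch, the three models listed above could be loaded through the Transformers `pipeline` helper. The task names paired with each model here are assumptions (the README does not state them), and `MODEL_IDS`, `TASKS`, and `load_pipelines` are hypothetical names. The first call downloads the model weights from the Hub.

```python
# Model IDs as listed in the README.
MODEL_IDS = {
    "accent": "dima806/english_accents_classification",
    "language": "alexneakameni/language_detection",
    "asr": "openai/whisper-tiny.en",
}

# Assumed pipeline task for each model; not confirmed by the README.
TASKS = {
    "accent": "audio-classification",
    "language": "text-classification",
    "asr": "automatic-speech-recognition",
}


def load_pipelines():
    """Build one pipeline per model. Requires `transformers` and a backend
    such as PyTorch; the import is deferred so this module loads without them."""
    from transformers import pipeline

    return {name: pipeline(TASKS[name], model=MODEL_IDS[name]) for name in MODEL_IDS}
```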
56
+ ## Usage
57
+
58
+ You can interact with the application directly in your browser. Provide a video URL or upload a file, adjust the analysis duration, and click "Analyze Video". The results will be displayed below.
59
+
60
+ ### Input Formats
61
+
62
+ * **Uploaded Video Files:** `.mp4`
63
+ * **Video URLs:**
64
+     * Direct MP4 links (ending in `.mp4`)
65
+     * Loom video share links (`https://www.loom.com/share/...`)
66
+     * Dropbox direct download links (MP4 links ending in `?dl=1`)
67
+     * Google Drive direct download links (`https://drive.google.com/uc?id=...&export=download`)
68
+
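One way to check a URL against the formats listed above is sketched below. The function `classify_video_url` and its return labels are hypothetical and are not taken from the application; the rules only mirror the README's descriptions, so the real validation may differ.

```python
from urllib.parse import parse_qs, urlparse


def _host_matches(host: str, domain: str) -> bool:
    """True if host is the domain itself or one of its subdomains."""
    return host == domain or host.endswith("." + domain)


def classify_video_url(url: str) -> str:
    """Return 'direct_mp4', 'loom', 'dropbox', 'google_drive', or 'unsupported'."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    if parsed.scheme not in ("http", "https") or not host:
        return "unsupported"

    path = parsed.path.lower()
    query = parse_qs(parsed.query)

    # Loom share links: https://www.loom.com/share/...
    if _host_matches(host, "loom.com"):
        return "loom" if path.startswith("/share/") else "unsupported"

    # Dropbox: only direct MP4 downloads with ?dl=1 (shared folders are rejected).
    if _host_matches(host, "dropbox.com"):
        if path.endswith(".mp4") and query.get("dl") == ["1"]:
            return "dropbox"
        return "unsupported"

    # Google Drive: https://drive.google.com/uc?id=...&export=download
    if host == "drive.google.com":
        if path == "/uc" and "id" in query and query.get("export") == ["download"]:
            return "google_drive"
        return "unsupported"

    # Any other host: accept only paths ending in .mp4.
    return "direct_mp4" if path.endswith(".mp4") else "unsupported"
```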
69
+ ### Unsupported Formats
70
+
71
+ * Webpages embedding videos (e.g., YouTube, news articles).
72
+ * Dropbox shared folder links.
73
+
74
+ ## FFmpeg Requirement
75
+
76
+ This application requires [FFmpeg](https://ffmpeg.org/) to be installed on your system for audio extraction from video files. Follow the installation instructions for your operating system on the FFmpeg website.
77
+
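A quick way to confirm that FFmpeg is reachable before running the app is to look it up on the `PATH`. The helper name `ffmpeg_available` is hypothetical; `shutil.which` is part of the Python standard library.

```python
import shutil


def ffmpeg_available(binary: str = "ffmpeg") -> bool:
    """Return True if the given executable can be found on the PATH."""
    return shutil.which(binary) is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("FFmpeg found on PATH.")
    else:
        print("FFmpeg not found; install it from https://ffmpeg.org/.")
```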
78
+ ## Troubleshooting
79
+
80
+ * **"Invalid URL"**: Ensure the URL meets the specified format requirements.
81
+ * **Audio/Video Processing Errors**: Usually caused by a missing or misconfigured FFmpeg installation.
82
+ * **Transcription Errors**: Audio may be unclear or contain little speech in the initial 15 seconds.
83
+ * **Non-English Language Detection**: Accent classification is skipped for non-English audio because the model is designed for English accents only.
84
+
85
+ ## Citation
86
+
87
+ If you use this application in your work, please consider citing the original models and the libraries used.
88
+
89
+ ```bibtex
90
+ @misc{huggingface_transformers,
91
+ author = {Hugging Face Team},
92
+ title = {Transformers: State-of-the-art Natural Language Processing},
93
+ year = {2019},
94
+ howpublished = {\url{https://github.com/huggingface/transformers}},
95
+ }