Update README.md
README.md
CHANGED
@@ -11,4 +11,118 @@ license: mit
short_description: Gradio Whisper Transcription App (MCP)
---
# 🎙️ Whisper Transcription Tool – MCP Server

Welcome to the **MCP Server Track** submission for the Hugging Face & OpenAI Agents Hackathon!

This tool provides speech-to-text transcription using OpenAI’s Whisper model, deployed via [Modal](https://modal.com/), and exposes the service through a Gradio interface that supports the **Model Context Protocol (MCP)**. Agents can access this tool via HTTP or streaming endpoints.

---

## 🧠 What It Does

- Accepts an audio URL (MP3, WAV, FLAC, etc.)
- Transcribes the speech to text using Whisper (`base` model)
- Returns the transcription as plain text
- Exposes an MCP-compliant API endpoint

---

## 🚀 How to Use

### 🔹 Web Interface (UI)
1. Enter an audio URL (e.g., a `.flac` or `.wav` file).
2. Click the **Submit** button.
3. View the transcribed text once processing finishes.

### 🔹 As an MCP Tool (Programmatic Access)
Agents (e.g., SmolAI, LangChain, or custom agent scripts) can invoke this app using the MCP specification.

- Endpoint: `/predict`
- Method: POST
- Input Schema: `{ "data": [ "AUDIO_URL_HERE" ] }`
- Output Schema: `{ "data": [ "TRANSCRIPTION_TEXT" ] }`

---

## 🛠️ Stack

| Layer          | Tech                                                 |
|----------------|------------------------------------------------------|
| Frontend       | [Gradio](https://gradio.app/)                        |
| Inference      | [OpenAI Whisper](https://github.com/openai/whisper)  |
| Hosting        | [Hugging Face Spaces](https://huggingface.co/spaces) |
| Remote Compute | [Modal](https://modal.com/)                          |
| Protocol       | Model Context Protocol (MCP)                         |

---

## 📦 Example Input

```
https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac
```

Expected output:

```
I have a dream that one day this nation will rise up and live out the true meaning of its creed.
```

---

## 📎 Notes

- Supports English only (for now).
- Long-form audio and larger models (e.g., `medium`, `large`) can be added later.
- You can extend it to support file uploads or streaming audio.

---

## 🤖 Hackathon Submission

- Track: `mcp-server-track`
- MCP-Enabled: ✅
- Repo/Space: [Dreamcatcher23/mcp-server-track](https://huggingface.co/spaces/Dreamcatcher23/mcp-server-track)

---

## 📚 References

- [Modal Docs](https://modal.com/docs)
- [Whisper GitHub](https://github.com/openai/whisper)
- [Gradio MCP Guide](https://huggingface.co/docs/hub/spaces-sse)
- [Agents Hackathon](https://huggingface.co/agents)

---

## 🧪 MCP Test Instructions

Use tools like [curl](https://curl.se/), Postman, or a Python client to test the API:

```bash
curl -X POST https://your-space-url/predict \
  -H "Content-Type: application/json" \
  -d '{"data":["https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac"]}'
````
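
The same request can be issued from Python. The following is a minimal sketch using only the standard library. It assumes the `/predict` path and the `{"data": [...]}` request and response shapes listed in this README, which have not been verified against the deployed Space (Gradio versions differ in where they expose their HTTP API). `SPACE_URL` is the same placeholder as in the curl example, and `build_request`, `parse_transcription`, and `transcribe` are illustrative names, not part of the app.

```python
import json
import urllib.request

# Assumption: placeholder base URL, copied from the curl example in this README.
SPACE_URL = "https://your-space-url"
SAMPLE_AUDIO = "https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac"


def build_request(audio_url: str, base_url: str = SPACE_URL) -> urllib.request.Request:
    """Build (but do not send) the POST request described in this README."""
    body = json.dumps({"data": [audio_url]}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/predict",  # assumed endpoint path
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_transcription(raw: bytes) -> str:
    """Extract the text from a response shaped like {"data": ["..."]}."""
    payload = json.loads(raw)
    data = payload.get("data") if isinstance(payload, dict) else None
    if not isinstance(data, list) or not data or not isinstance(data[0], str):
        raise ValueError(f"unexpected response shape: {payload!r}")
    return data[0]


def transcribe(audio_url: str, base_url: str = SPACE_URL, timeout: float = 120.0) -> str:
    """Send the request and return the transcription text."""
    request = build_request(audio_url, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_transcription(response.read())
```

Once `SPACE_URL` points at a running Space, `transcribe(SAMPLE_AUDIO)` should return the transcription string. Keeping request building and response parsing in separate functions makes both easy to check without a network connection.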

---

## ✨ Built with ❤️ by Dreamcatcher23