---
title: MCP Server Track
emoji: 🔥
colorFrom: blue
colorTo: yellow
sdk: gradio
sdk_version: 5.33.1
app_file: app.py
pinned: false
license: mit
short_description: Gradio Whisper Transcription App (MCP)
---
# 🎙️ Whisper Transcription Tool – MCP Server
Welcome to the **MCP Server Track** submission for the Hugging Face & OpenAI Agents Hackathon!
This tool provides speech-to-text transcription using OpenAI's Whisper model, deployed via [Modal](https://modal.com/), and exposes the service through a Gradio interface that supports the **Model Context Protocol (MCP)**. Agents can access this tool via HTTP or streaming endpoints.
---
## 🧠 What It Does
- Accepts an audio URL (MP3, WAV, FLAC, etc.)
- Transcribes the speech to text using Whisper (`base` model)
- Returns clean, readable output
- Exposes an MCP-compliant API endpoint
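The accepted-format check in the first bullet can be sketched as a small helper. This is illustrative only; `SUPPORTED_EXTS` and `is_supported` are assumed names, not the Space's actual code:

```python
# Hypothetical helper mirroring the supported-format list above.
SUPPORTED_EXTS = (".mp3", ".wav", ".flac")

def is_supported(audio_url: str) -> bool:
    """Return True if the URL's extension matches a supported audio format."""
    # Strip any query string before checking the extension.
    path = audio_url.split("?", 1)[0]
    return path.lower().endswith(SUPPORTED_EXTS)
```

For example, `is_supported("https://example.com/mlk.flac")` returns `True`, while a video URL would be rejected before any transcription work starts.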
---
## 🚀 How to Use
### 🔹 Web Interface (UI)
1. Enter an audio URL (e.g., a `.flac` or `.wav` file).
2. Click the **Submit** button.
3. View the transcribed text output instantly.
### 🔹 As an MCP Tool (Programmatic Access)
This app can be invoked by agents (e.g., SmolAI, LangChain, or custom agent scripts) using the MCP specification.
- Endpoint: `/predict`
- Method: POST
- Input Schema: `{ "data": [ "AUDIO_URL_HERE" ] }`
- Output Schema: `{ "data": [ "TRANSCRIPTION_TEXT" ] }`
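Given those schemas, the request and response bodies can be handled with two small helpers. This is a sketch; `build_payload` and `parse_response` are illustrative names, not part of the Space's code:

```python
import json

def build_payload(audio_url: str) -> str:
    """Wrap one audio URL in the {"data": [...]} input schema shown above."""
    return json.dumps({"data": [audio_url]})

def parse_response(body: str) -> str:
    """Extract the transcription text from the {"data": [...]} output schema."""
    return json.loads(body)["data"][0]
```

An agent framework would POST `build_payload(url)` to `/predict` and feed `parse_response(...)` back into its reasoning loop.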
---
## 🛠️ Stack
| Layer | Tech |
|--------------|-------------------------------------|
| Frontend | [Gradio](https://gradio.app/) |
| Inference | [OpenAI Whisper](https://github.com/openai/whisper) |
| Hosting | [Hugging Face Spaces](https://huggingface.co/spaces) |
| Remote Compute | [Modal](https://modal.com/) |
| Protocol | Model Context Protocol (MCP) |
---
## 📦 Example Input
```
https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac
```
Expected output:
```
I have a dream that one day this nation will rise up and live out the true meaning of its creed.
```
---
## 📝 Notes
- Supports English only (for now).
- Long-form audio and larger models (e.g., `medium`, `large`) can be added later.
- You can extend it to support file uploads or streaming audio.
---
## 🤗 Hackathon Submission
- Track: `mcp-server-track`
- MCP-Enabled: ✅
- Repo/Space: [Dreamcatcher23/mcp-server-track](https://huggingface.co/spaces/Dreamcatcher23/mcp-server-track)
---
## 📚 References
- [Modal Docs](https://modal.com/docs)
- [Whisper GitHub](https://github.com/openai/whisper)
- [Gradio MCP Guide](https://huggingface.co/docs/hub/spaces-sse)
- [Agents Hackathon](https://huggingface.co/agents)
---
## 🧪 MCP Test Instructions
Use tools like [curl](https://curl.se/), Postman, or a Python client to test the API:
```bash
curl -X POST https://your-space-url/predict \
-H "Content-Type: application/json" \
-d '{"data":["https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac"]}'
```
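The same call can be made from Python with only the standard library. This is a sketch; `build_request`, `transcribe_remote`, and the endpoint URL are placeholders, not a published client:

```python
import json
import urllib.request

def build_request(audio_url: str, endpoint: str) -> urllib.request.Request:
    """Build the POST request matching the curl command above."""
    payload = json.dumps({"data": [audio_url]}).encode("utf-8")
    return urllib.request.Request(
        endpoint, data=payload, headers={"Content-Type": "application/json"}
    )

def transcribe_remote(audio_url: str, endpoint: str) -> str:
    """Send the request and pull the transcription out of {"data": [...]}."""
    with urllib.request.urlopen(build_request(audio_url, endpoint)) as resp:
        return json.loads(resp.read())["data"][0]
```

For example: `transcribe_remote("https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac", "https://your-space-url/predict")`.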
---
## ✨ Built with ❤️ by Dreamcatcher23