---
title: Parakeet.js Demo
emoji: 🦜
colorFrom: indigo
colorTo: blue
sdk: static
pinned: false
app_build_command: npm run build
app_file: build/index.html
license: mit
short_description: NVIDIA Parakeet speech recognition for the browser
models:
- istupakov/parakeet-tdt-0.6b-v2-onnx
tags:
- parakeet-js
- parakeet
- onnx
- webgpu
- asr
- istupakov/parakeet-tdt-0.6b-v2-onnx
custom_headers:
  cross-origin-embedder-policy: require-corp
  cross-origin-opener-policy: same-origin
  cross-origin-resource-policy: cross-origin
---
# 🦜 Parakeet.js - HF Spaces Demo
> **NVIDIA Parakeet speech recognition for the browser using WebGPU/WASM**
This demo showcases the **[parakeet.js](https://www.npmjs.com/package/parakeet.js)** library, which brings NVIDIA's Parakeet speech recognition models to the browser using ONNX Runtime Web with WebGPU and WASM backends.
## 🚀 Features
- **🖥️ Browser-based**: Runs entirely in your browser - no server required
- **⚡ WebGPU acceleration**: Fast inference using WebGPU when available
- **🔧 WASM fallback**: CPU-based inference using WebAssembly
- **📱 Multiple formats**: Supports various audio formats (WAV, MP3, etc.)
- **🎯 Real-time performance**: Optimized for fast transcription
- **📊 Performance metrics**: Shows detailed timing information
- **🎛️ Configurable**: Adjustable quantization, preprocessing, and backend settings
## 🔧 How to Use
1. **Click "Load Model"** to download and initialize the speech recognition model
2. **Select your preferences**:
   - **Backend**: Choose WebGPU (faster) or WASM (more compatible)
   - **Quantization**: fp32 (higher quality) or int8 (faster)
   - **Preprocessor**: Different audio processing options
3. **Upload an audio file** using the file input
4. **View the transcription** in real time, along with performance metrics
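
The backend choice in step 2 can also be made programmatically. The sketch below is not part of parakeet.js: `pickBackend` is a hypothetical helper, and the `'webgpu'` / `'wasm'` strings are assumed labels that may differ from what the library expects. It only relies on the standard `navigator.gpu.requestAdapter()` call from the WebGPU API.

```javascript
// Hypothetical helper (not a parakeet.js API): returns 'webgpu' when a WebGPU
// adapter can be obtained, otherwise 'wasm'.
// `nav` defaults to the browser's navigator; passing it in keeps this testable.
async function pickBackend(nav = globalThis.navigator) {
  // No navigator or no WebGPU entry point: fall back to the CPU path.
  if (!nav || !nav.gpu) return 'wasm';
  try {
    // requestAdapter() resolves to null when no suitable GPU is available.
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? 'webgpu' : 'wasm';
  } catch {
    // Treat any failure during adapter discovery as "WebGPU unavailable".
    return 'wasm';
  }
}
```

Calling `await pickBackend()` in a browser without WebGPU support should yield `'wasm'`, which matches the "more compatible" option above.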
## 📦 Integration
You can use parakeet.js in your own projects:
```bash
npm install parakeet.js onnxruntime-web
```
```javascript
import { ParakeetModel, getParakeetModel } from 'parakeet.js';

// Load model from the Hugging Face Hub
const modelUrls = await getParakeetModel('istupakov/parakeet-tdt-0.6b-v2-onnx');
const model = await ParakeetModel.fromUrls(modelUrls);

// Transcribe audio
// audioData and sampleRate are supplied by your app: the decoded PCM samples
// of the clip and its sample rate in Hz.
const result = await model.transcribe(audioData, sampleRate);
console.log(result.utterance_text);
```
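
The snippet above leaves `audioData` and `sampleRate` to the caller. As a rough sketch of how they might be prepared, the helpers below downmix decoded channels to mono and resample them. Both function names are hypothetical, and the 16 kHz target in the usage note is an assumption about what the model expects, so check the library documentation before relying on it.

```javascript
// Hypothetical helper: average several channels (Float32Arrays of equal
// length) into a single mono channel.
function downmixToMono(channels) {
  const length = channels[0].length;
  const mono = new Float32Array(length);
  for (const channel of channels) {
    for (let i = 0; i < length; i++) mono[i] += channel[i] / channels.length;
  }
  return mono;
}

// Hypothetical helper: resample mono PCM with linear interpolation.
// Good enough for a sketch; a production pipeline would low-pass filter
// before downsampling to limit aliasing.
function resampleLinear(samples, fromRate, toRate) {
  if (fromRate === toRate) return samples;
  const outLength = Math.round(samples.length * toRate / fromRate);
  const out = new Float32Array(outLength);
  const step = fromRate / toRate; // input samples advanced per output sample
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const left = Math.floor(pos);
    const right = Math.min(left + 1, samples.length - 1);
    const frac = pos - left;
    out[i] = samples[left] * (1 - frac) + samples[right] * frac;
  }
  return out;
}
```

With a stereo clip decoded at 44.1 kHz, `resampleLinear(downmixToMono([left, right]), 44100, 16000)` would produce a mono buffer that could be passed as `audioData` with `sampleRate` set to `16000`.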
## 🔗 Links
- **📚 [GitHub Repository](https://github.com/ysdede/parakeet.js)** - Source code and documentation
- **📦 [npm Package](https://www.npmjs.com/package/parakeet.js)** - Install via npm
## 🧠 Model Information
This demo uses the **istupakov/parakeet-tdt-0.6b-v2-onnx** model, which is an ONNX-converted version of NVIDIA's Parakeet speech recognition model optimized for browser deployment.
## 💡 Technical Details
- **Model Format**: ONNX for cross-platform compatibility
- **Backends**: WebGPU (GPU acceleration) and WASM (CPU fallback)
- **Quantization**: Support for both fp32 and int8 precision
- **Audio Processing**: Built-in preprocessing for various audio formats
- **Performance**: Real-time factor (RTF) typically below 1.0, meaning a clip is transcribed in less time than it takes to play
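
To make the RTF figure concrete, here is a small sketch using the conventional definition (processing time divided by audio duration). Both helpers are hypothetical and not part of parakeet.js; the metrics shown in the demo may be computed differently.

```javascript
// RTF = processing time / audio duration (both in seconds).
// Values below 1.0 mean the clip was processed faster than real time.
function realTimeFactor(processingSeconds, audioSeconds) {
  if (!(audioSeconds > 0)) throw new RangeError('audioSeconds must be positive');
  return processingSeconds / audioSeconds;
}

// Hypothetical wrapper: time any async transcription call and report its RTF.
// `transcribe` is a zero-argument function supplied by the caller.
async function timeTranscription(transcribe, audioSeconds) {
  const start = performance.now();
  const result = await transcribe();
  const processingSeconds = (performance.now() - start) / 1000;
  return { result, rtf: realTimeFactor(processingSeconds, audioSeconds) };
}

// Example: 2 s of processing for a 10 s clip gives an RTF of 0.2.
console.log(realTimeFactor(2, 10));
```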
---
*Built with ❤️ using React and deployed on Hugging Face Spaces*