metadata
title: Parakeet.js Demo
emoji: 🦜
colorFrom: indigo
colorTo: blue
sdk: static
pinned: false
app_build_command: npm run build
app_file: build/index.html
license: mit
short_description: NVIDIA Parakeet speech recognition for the browser
models:
  - istupakov/parakeet-tdt-0.6b-v2-onnx
tags:
  - parakeet-js
  - parakeet
  - onnx
  - webgpu
  - asr
  - istupakov/parakeet-tdt-0.6b-v2-onnx
custom_headers:
  cross-origin-embedder-policy: require-corp
  cross-origin-opener-policy: same-origin
  cross-origin-resource-policy: cross-origin

🦜 Parakeet.js - HF Spaces Demo

NVIDIA Parakeet speech recognition for the browser using WebGPU/WASM

This demo showcases the parakeet.js library, which brings NVIDIA's Parakeet speech recognition models to the browser using ONNX Runtime Web with WebGPU and WASM backends.

🚀 Features

  • 🖥️ Browser-based: Runs entirely in your browser - no server required
  • ⚡ WebGPU acceleration: Fast inference using WebGPU when available
  • 🔧 WASM fallback: CPU-based inference using WebAssembly
  • 📱 Multiple formats: Supports various audio formats (WAV, MP3, etc.)
  • 🎯 Real-time performance: Optimized for fast transcription
  • 📊 Performance metrics: Shows detailed timing information
  • 🎛️ Configurable: Adjustable quantization, preprocessing, and backend settings

🔧 How to Use

  1. Click "Load Model" to download and initialize the speech recognition model
  2. Select your preferences:
    • Backend: Choose WebGPU (faster) or WASM (more compatible)
    • Quantization: fp32 (higher quality) or int8 (faster)
    • Preprocessor: Different audio processing options
  3. Upload an audio file using the file input
  4. View the transcription in real time, together with performance metrics

📦 Integration

You can use parakeet.js in your own projects:

npm install parakeet.js onnxruntime-web

import { ParakeetModel, getParakeetModel } from 'parakeet.js';

// Load model from HuggingFace Hub
const modelUrls = await getParakeetModel('istupakov/parakeet-tdt-0.6b-v2-onnx');
const model = await ParakeetModel.fromUrls(modelUrls);

// Transcribe audio
const result = await model.transcribe(audioData, sampleRate);
console.log(result.utterance_text);
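
The snippet above does not show where `audioData` and `sampleRate` come from. One possible way to obtain them from an uploaded file is sketched below using the standard Web Audio API. It assumes that `transcribe` accepts mono samples as a `Float32Array`, which this README does not state, and the helper names `toMono` and `decodeFile` are hypothetical. Whether the audio must also be resampled to a particular rate is not covered here.

```javascript
// Sketch: average several channels of PCM samples into one mono channel.
// `channels` is an array of equal-length Float32Array, one per channel.
function toMono(channels) {
  if (channels.length === 0) return new Float32Array(0);
  const length = channels[0].length;
  const mono = new Float32Array(length);
  for (const channel of channels) {
    for (let i = 0; i < length; i++) {
      mono[i] += channel[i] / channels.length;
    }
  }
  return mono;
}

// Browser-only helper: decode an uploaded File with the Web Audio API.
// `audioContext` is an AudioContext created by the caller.
async function decodeFile(file, audioContext) {
  const audioBuffer = await audioContext.decodeAudioData(await file.arrayBuffer());
  const channels = [];
  for (let c = 0; c < audioBuffer.numberOfChannels; c++) {
    channels.push(audioBuffer.getChannelData(c));
  }
  // Assumption: the model wants mono samples plus their sample rate.
  return { audioData: toMono(channels), sampleRate: audioBuffer.sampleRate };
}
```

With these helpers, `const { audioData, sampleRate } = await decodeFile(file, new AudioContext());` would produce the two arguments passed to `model.transcribe` above.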

🔗 Links

🧠 Model Information

This demo uses the istupakov/parakeet-tdt-0.6b-v2-onnx model, which is an ONNX-converted version of NVIDIA's Parakeet speech recognition model optimized for browser deployment.

💡 Technical Details

  • Model Format: ONNX for cross-platform compatibility
  • Backends: WebGPU (GPU acceleration) and WASM (CPU fallback)
  • Quantization: Support for both fp32 and int8 precision
  • Audio Processing: Built-in preprocessing for various audio formats
  • Performance: Real-time factor (RTF, processing time divided by audio duration) typically below 1.0, meaning transcription finishes faster than the audio would take to play
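
As a worked example of the RTF figure in the list above, the sketch below assumes the common definition of RTF as processing time divided by audio duration, so values below 1.0 mean faster than real time. `realTimeFactor` is a hypothetical helper, not part of parakeet.js, and the demo's own metrics may be computed differently.

```javascript
// Sketch: real-time factor = processing time / audio duration.
// processingMs: wall-clock time spent transcribing, in milliseconds.
// audioData:    the samples that were transcribed (one channel).
// sampleRate:   samples per second of audioData.
function realTimeFactor(processingMs, audioData, sampleRate) {
  const audioSeconds = audioData.length / sampleRate;
  return (processingMs / 1000) / audioSeconds;
}

// Example: 10 s of 16 kHz audio transcribed in 2.5 s.
const rtf = realTimeFactor(2500, new Float32Array(160000), 16000);
console.log(rtf); // 0.25
```

In practice `processingMs` could be measured by calling `performance.now()` immediately before and after the transcription call and taking the difference.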

Built with ❤️ using React and deployed on Hugging Face Spaces