qfuxa committed on
Commit
0912e2c
·
1 Parent(s): ef38b4d

Update README.md

Files changed (1)
  1. README.md +20 -9
README.md CHANGED
@@ -2,15 +2,26 @@
 
  This fork of [Whisper Streaming](https://github.com/ufal/whisper_streaming) adds a ready-to-use HTML interface, making it super easy to start transcribing audio directly from your browser. Just launch the local server, allow microphone access, and start streaming. Everything runs locally on your machine 🎙️✨
 
- ## What's New?
- ✅ **Built-in Web UI** – Just open your browser and start transcribing, no need to build a frontend.
- ✅ **FastAPI Server with WebSocket Endpoint** – Enables real-time STT in browsers with async FFmpeg processing.
- ✅ **Buffering Preview** – Displays unvalidated buffer content for better streaming feedback.
- ✅ **Multiple Users Support** – The backend handles multiple users simultaneously without conflicts.
- ✅ **HTML - JavaScript Client Implementation** – A plug-and-play MediaRecorder setup for seamless client integration
- ✅ **MLX Whisper Backend** – Optimized Apple Silicon support for faster local processing.
- ✅ **Enhanced sentence segmentation** – Improves buffer trimming and sentence boundaries in certain languages
- ✅ **Diarization (Beta)** – Real-time speaker labeling using [Diart](https://github.com/juanmc2005/diart).
+ ### What's New?
+
+ #### 🌐 **Web & API**
+ ✅ **Built-in Web UI** – No frontend setup needed, just open your browser and start transcribing.
+ ✅ **FastAPI WebSocket Server** – Real-time STT processing with async FFmpeg streaming.
+ ✅ **JavaScript Client** – A ready-to-use MediaRecorder implementation that can be copied on your client side.
+
+ #### ⚙️ **Core Improvements**
+ ✅ **Buffering Preview** – Displays unvalidated transcription segments for better feedback.
+ ✅ **Multi-User Support** – Handle multiple users simultaneously without conflicts.
+ ✅ **MLX Whisper Backend** – Optimized for Apple Silicon for faster local processing.
+ ✅ **Enhanced Sentence Segmentation** – Better buffer trimming for better accuracy across languages.
+ ✅ **Extended Logging** – More detailed logs to improve debugging and monitoring.
+
+ #### 🔥 **Advanced Features**
+ ✅ **Real-Time Diarization (Beta)** – Assigns speaker labels dynamically using [Diart](https://github.com/juanmc2005/diart).
+
+
+ ### Web UI
+
 
  <p align="center">
  <img src="src/web/demo.png" alt="Demo Screenshot" width="600">