qfuxa committed on
Commit 5355f8f · 1 Parent(s): 6933483

diart link added

Files changed (1)
  1. README.md +8 -0
README.md CHANGED
@@ -12,6 +12,8 @@ This project extends the [Whisper Streaming](https://github.com/ufal/whisper_str
 
 5. **MLX Whisper backend**: Integrates the alternative backend option MLX Whisper, optimized for efficient speech recognition on Apple silicon.
 
+6. **Diarization (beta)**: Adds real-time speaker labeling alongside transcription using the [Diart](https://github.com/juanmc2005/diart) library. Each transcription segment is tagged with a speaker. Currently under active development.
+
 ![Demo Screenshot](src/web/demo.png)
 
 ## Code Origins
@@ -64,6 +66,10 @@ This project reuses and extends code from the original Whisper Streaming reposit
 
 # If you want to run the server using uvicorn (recommended)
 uvicorn
+
+# If you want to use diarization
+diart
+
 ```
 
 
@@ -76,6 +82,8 @@ This project reuses and extends code from the original Whisper Streaming reposit
 - `--host` and `--port` let you specify the server’s IP/port.
 - `--min-chunk-size` sets the minimum chunk size for audio processing. Make sure this value matches the chunk size selected in the frontend. If the two differ, the system will still work but may process more audio than necessary.
 - For a full list of configurable options, run `python whisper_fastapi_online_server.py -h`
+- `--diarization` (default: `False`) lets you choose whether to run diarization in parallel with transcription.
+- For other parameters, see the [Whisper Streaming](https://github.com/ufal/whisper_streaming) README.
 
 4. **Open the Provided HTML**:
 
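As a rough illustration of how the flags named in the last hunk might be declared, here is a minimal `argparse` sketch. It is not taken from `whisper_fastapi_online_server.py`: the `build_parser` helper, the defaults for `--host`, `--port`, and `--min-chunk-size`, and the choice of `store_true` for `--diarization` are all assumptions made for the example. Only the flag names and the `False` default for diarization come from the README text.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags named in the README hunk."""
    parser = argparse.ArgumentParser(description="Sketch of the server CLI flags")
    # Defaults for host, port, and chunk size are illustrative assumptions.
    parser.add_argument("--host", type=str, default="localhost")
    parser.add_argument("--port", type=int, default=8000)
    parser.add_argument("--min-chunk-size", type=float, default=1.0)
    # Per the README, diarization stays off unless explicitly requested.
    # Using store_true here is an assumption about how the flag is wired.
    parser.add_argument("--diarization", action="store_true", default=False)
    return parser


# Simulate a command line instead of reading sys.argv.
args = build_parser().parse_args(["--port", "9000", "--diarization"])
print(args.host, args.port, args.min_chunk_size, args.diarization)
```

With a boolean switch like this, omitting `--diarization` leaves the server in transcription-only mode, which matches the documented default. The real script may parse the flag differently, so `-h` remains the authoritative reference.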