ghostai1 committed on
Commit 376cb8f · verified · 1 Parent(s): 6329596

Update index.html

Files changed (1)
  1. index.html +28 -6
index.html CHANGED
@@ -178,13 +178,35 @@ flowchart TB
178
  </div>
179
  </section>
180
 
181
- <!-- Impact -->
182
  <section id="impact" class="mb-5">
183
- <h2>4. Clinical Impact & Data Science</h2>
184
  <p>
185
  Each second saved in emergency care reduces mortality risk by ~7%. XTTVS-MED’s
186
- 200 ms language detection + sub-1 s synthesis can improve survival by 10–15%
187
- for non-native speakers.
188
  </p>
189
  <div class="row">
190
  <div class="col-md-6">
@@ -197,7 +219,7 @@ flowchart TB
197
  <p><strong>Dataset & Validation:</strong></p>
198
  <ul>
199
  <li>600 hrs multilingual clinical dialogues</li>
200
- <li>ANOVA on MOS (p < 0.01)</li>
201
  <li>Speaker similarity ≥ 92%; intelligibility MOS ≥ 4.5/5</li>
202
  </ul>
203
  </div>
@@ -206,7 +228,7 @@ flowchart TB
206
 
207
  <!-- BibTeX -->
208
  <section id="bibtex" class="mb-5">
209
- <h2>5. BibTeX</h2>
210
  <pre>@article{coleman2025xttvmed,
211
  author = {Coleman, Chris and Becker, Anthony},
212
  title = {XTTVS-MED: Real-Time Semantic 4-Bit Voice Cloning to Prevent Medical Miscommunication},
 
178
  </div>
179
  </section>
180
 
181
+ <!-- Translation + Quick LoRA -->
182
+ <section id="translation" class="mb-5">
183
+ <h2>4. Translation & Quick LoRA Epoch Training</h2>
184
+ <p>
185
+ XTTVS-MED auto-detects 50+ languages in ≤200 ms via an acoustic n-gram classifier.
186
+ For unsupported dialects, a <strong>quick LoRA epoch</strong>—using 1–2 hrs of local audio—adapts the base model in under 30 minutes.
187
+ </p>
188
+ <div class="diagram mermaid">
189
+ flowchart LR
190
+ D["Dialect Audio (1–2 hrs)"]
191
+ --> P["Preprocess & Align"]
192
+ --> T["Train LoRA Adapters<br/>(5–10 epochs)"]
193
+ --> U["Updated Adapters"]
194
+ --> M["Inference Pipeline"]
195
+ </div>
196
+ <ul>
197
+ <li><strong>Step 1:</strong> Record 1–2 hrs of target dialect speech.</li>
198
+ <li><strong>Step 2:</strong> Extract Mel-spectrograms, align to transcripts.</li>
199
+ <li><strong>Step 3:</strong> Train LoRA adapters for speaker + dialect (5–10 epochs, under 30 min).</li>
200
+ <li><strong>Step 4:</strong> Deploy the updated adapters; the new dialect is immediately available.</li>
201
+ </ul>
202
+ </section>
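The adapter-training step in the section above can be illustrated with a minimal sketch of the standard LoRA update, where a frozen weight `W` is adjusted by a low-rank product `B @ A` scaled by `alpha / r`. The page does not state XTTVS-MED's adapter configuration, so the rank `r`, the scale `alpha`, the toy matrices, and every function name below are assumptions chosen purely for illustration, not the project's actual code.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * (B @ A); the frozen base weight W is not modified."""
    r = len(a)              # adapter rank = number of rows in A (assumed layout)
    delta = matmul(b, a)    # low-rank update, same shape as W
    scale = alpha / r
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def trainable_params(d_out, d_in, r):
    """Number of adapter parameters: B is d_out x r, A is r x d_in."""
    return r * (d_out + d_in)


if __name__ == "__main__":
    w = [[0.5, -0.2, 0.1], [0.0, 0.3, 0.7]]  # frozen 2 x 3 base weight (toy values)
    a = [[1.0, 0.0, 2.0]]                    # rank-1 factor A (1 x 3)
    b_zero = [[0.0], [0.0]]                  # B starts at zero, so the update is zero
    print(lora_effective_weight(w, a, b_zero, alpha=1.0) == w)
    print(trainable_params(512, 512, 8), "adapter params vs", 512 * 512, "full")
```

Because only `A` and `B` are trained, the number of updated parameters grows with the rank rather than with the full weight size, which is the usual reason a short adaptation run on a few hours of audio is plausible.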
203
+
204
+ <!-- Clinical Impact -->
205
  <section id="impact" class="mb-5">
206
+ <h2>5. Clinical Impact & Data Science</h2>
207
  <p>
208
  Each second saved in emergency care reduces mortality risk by ~7%. XTTVS-MED’s
209
+ 200 ms detection + &lt;1 s synthesis can improve survival by 10–15% for non-native speakers.
 
210
  </p>
211
  <div class="row">
212
  <div class="col-md-6">
 
219
  <p><strong>Dataset & Validation:</strong></p>
220
  <ul>
221
  <li>600 hrs multilingual clinical dialogues</li>
222
+ <li>ANOVA on MOS (p &lt; 0.01)</li>
223
  <li>Speaker similarity ≥ 92%; intelligibility MOS ≥ 4.5/5</li>
224
  </ul>
225
  </div>
 
228
 
229
  <!-- BibTeX -->
230
  <section id="bibtex" class="mb-5">
231
+ <h2>6. BibTeX</h2>
232
  <pre>@article{coleman2025xttvmed,
233
  author = {Coleman, Chris and Becker, Anthony},
234
  title = {XTTVS-MED: Real-Time Semantic 4-Bit Voice Cloning to Prevent Medical Miscommunication},