Update index.html
index.html CHANGED (+28 -6)
@@ -178,13 +178,35 @@ flowchart TB
|
|
178 |
</div>
|
179 |
</section>
|
180 |
|
181 |
-
<!--
|
182 |
<section id="impact" class="mb-5">
|
183 |
-
<h2>
|
184 |
<p>
|
185 |
Each second saved in emergency care reduces mortality risk by ~7%. XTTVS-MED’s
|
186 |
-
200 ms
|
187 |
-
for non-native speakers.
|
188 |
</p>
|
189 |
<div class="row">
|
190 |
<div class="col-md-6">
|
@@ -197,7 +219,7 @@ flowchart TB
|
|
197 |
<p><strong>Dataset & Validation:</strong></p>
|
198 |
<ul>
|
199 |
<li>600 hrs multilingual clinical dialogues</li>
|
200 |
-
<li>ANOVA on MOS (p
|
201 |
<li>Speaker similarity ≥ 92%; intelligibility MOS ≥ 4.5/5</li>
|
202 |
</ul>
|
203 |
</div>
|
@@ -206,7 +228,7 @@ flowchart TB
|
|
206 |
|
207 |
<!-- BibTeX -->
|
208 |
<section id="bibtex" class="mb-5">
|
209 |
-
<h2>
|
210 |
<pre>@article{coleman2025xttvmed,
|
211 |
author = {Coleman, Chris and Becker, Anthony},
|
212 |
title = {XTTVS-MED: Real-Time Semantic 4-Bit Voice Cloning to Prevent Medical Miscommunication},
|
|
|
178 |
</div>
|
179 |
</section>
|
180 |
|
181 |
+
<!-- Translation + Quick LoRA -->
|
182 |
+
<section id="translation" class="mb-5">
|
183 |
+
<h2>4. Translation & Quick LoRA Epoch Training</h2>
|
184 |
+
<p>
|
185 |
+
XTTVS-MED auto-detects 50+ languages in ≤200 ms via an acoustic n-gram classifier.
|
186 |
+
For unsupported dialects, a <strong>quick LoRA training run</strong> (5–10 epochs on 1–2 hrs of local audio) adapts the base model in under 30 minutes.
|
187 |
+
</p>
|
188 |
+
<div class="diagram mermaid">
|
189 |
+
flowchart LR
|
190 |
+
D["Dialect Audio (1–2 hrs)"]
|
191 |
+
--> P["Preprocess & Align"]
|
192 |
+
--> T["Train LoRA Adapters<br/>(5–10 epochs)"]
|
193 |
+
--> U["Updated Adapters"]
|
194 |
+
--> M["Inference Pipeline"]
|
195 |
+
</div>
|
196 |
+
<ul>
|
197 |
+
<li><strong>Step 1:</strong> Record 1–2 hrs of target dialect speech.</li>
|
198 |
+
<li><strong>Step 2:</strong> Extract mel-spectrograms and align them to transcripts.</li>
|
199 |
+
<li><strong>Step 3:</strong> Train LoRA adapters for speaker + dialect (5–10 epochs, under 30 min).</li>
|
200 |
+
<li><strong>Step 4:</strong> Deploy the updated adapters; the new dialect is available immediately.</li>
|
201 |
+
</ul>
|
202 |
+
</section>
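Reviewer note on Step 3: the added section does not say how the adapters are parameterized, so the following is only a minimal sketch of the standard LoRA update, W' = W + (alpha / r) * B A, written in plain Python. The function names, the rank `r`, the scaling `alpha`, and the layer sizes are illustrative assumptions, not values taken from XTTVS-MED.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product; a is (n x k), b is (k x m)."""
    inner = len(b)
    cols = len(b[0])
    return [
        [sum(row[t] * b[t][j] for t in range(inner)) for j in range(cols)]
        for row in a
    ]


def apply_lora(w: Matrix, a: Matrix, b: Matrix, alpha: float) -> Matrix:
    """Return W + (alpha / r) * B @ A.

    Shapes (hypothetical layer): W is (d_out x d_in), A is (r x d_in),
    B is (d_out x r). Only A and B are trained; W stays frozen.
    """
    r = len(a)  # adapter rank
    delta = matmul(b, a)
    scale = alpha / r
    return [
        [w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]


def lora_param_count(d_out: int, d_in: int, r: int) -> int:
    """Trainable parameters in one adapter pair (A and B)."""
    return r * (d_in + d_out)


# Toy 2x2 layer with a rank-1 adapter.
w = [[1.0, 2.0], [3.0, 4.0]]
a = [[1.0, 1.0]]          # (1 x 2)
b_zero = [[0.0], [0.0]]   # B starts at zero, so the model is unchanged at step 0
b_trained = [[1.0], [2.0]]

unchanged = apply_lora(w, a, b_zero, alpha=2.0)
adapted = apply_lora(w, a, b_trained, alpha=2.0)
```

For a hypothetical 1024 x 1024 projection, `lora_param_count(1024, 1024, 8)` gives 16,384 trainable values against 1,048,576 in the frozen matrix, which is the usual argument for why a short adapter run on 1–2 hrs of audio is plausible.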
|
203 |
+
|
204 |
+
<!-- Clinical Impact -->
|
205 |
<section id="impact" class="mb-5">
|
206 |
+
<h2>5. Clinical Impact & Data Science</h2>
|
207 |
<p>
|
208 |
Each second saved in emergency care reduces mortality risk by ~7%. XTTVS-MED’s
|
209 |
+
200 ms detection + <1 s synthesis can improve survival by 10–15% for non-native speakers.
|
|
|
210 |
</p>
|
211 |
<div class="row">
|
212 |
<div class="col-md-6">
|
|
|
219 |
<p><strong>Dataset & Validation:</strong></p>
|
220 |
<ul>
|
221 |
<li>600 hrs multilingual clinical dialogues</li>
|
222 |
+
<li>ANOVA on MOS (p < 0.01)</li>
|
223 |
<li>Speaker similarity ≥ 92%; intelligibility MOS ≥ 4.5/5</li>
|
224 |
</ul>
|
225 |
</div>
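Reviewer note on the "ANOVA on MOS" bullet: the page reports only p < 0.01, without the grouping or the test statistic. As a hedged illustration of what a one-way ANOVA over MOS ratings computes, here is a small standard-library sketch. The function name and the example ratings are made up for illustration and are not data from the XTTVS-MED evaluation.

```python
from typing import List, Sequence, Tuple


def one_way_anova_f(groups: Sequence[Sequence[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for a one-way ANOVA.

    Each inner sequence holds the ratings (e.g. MOS scores) for one condition.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more samples than groups")

    grand_mean = sum(sum(g) for g in groups) / n
    means: List[float] = [sum(g) / len(g) for g in groups]

    # Variation explained by differences between condition means.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Residual variation inside each condition.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("zero within-group variance; F is undefined")

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical ratings for three conditions (illustrative only).
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

Turning the F statistic into a p-value needs the F distribution, which the standard library does not provide; `scipy.stats.f_oneway` returns both in one call if SciPy is available.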
|
|
|
228 |
|
229 |
<!-- BibTeX -->
|
230 |
<section id="bibtex" class="mb-5">
|
231 |
+
<h2>6. BibTeX</h2>
|
232 |
<pre>@article{coleman2025xttvmed,
|
233 |
author = {Coleman, Chris and Becker, Anthony},
|
234 |
title = {XTTVS-MED: Real-Time Semantic 4-Bit Voice Cloning to Prevent Medical Miscommunication},
|