Spaces: ghostai1/GHOSTVOICECBR


ghostai1 committed on May 24

Commit 376cb8f (verified) · 1 parent: 6329596

Update index.html

Files changed (1)

index.html +28 -6


@@ -178,13 +178,35 @@ flowchart TB
       </div>
     </section>
-    <!-- Impact -->
     <section id="impact" class="mb-5">
-      <h2>4. Clinical Impact & Data Science</h2>
       <p>
         Each second saved in emergency care reduces mortality risk by ~7%. XTTVS-MED’s
-        200 ms language detection + sub-1 s synthesis can improve survival by 10–15%
-        for non-native speakers.
       </p>
       <div class="row">
         <div class="col-md-6">
@@ -197,7 +219,7 @@ flowchart TB
           <p><strong>Dataset & Validation:</strong></p>
           <ul>
             <li>600 hrs multilingual clinical dialogues</li>
-            <li>ANOVA on MOS (p < 0.01)</li>
             <li>Speaker similarity ≥ 92%; intelligibility MOS ≥ 4.5/5</li>
           </ul>
         </div>
@@ -206,7 +228,7 @@ flowchart TB
     <!-- BibTeX -->
     <section id="bibtex" class="mb-5">
-      <h2>5. BibTeX</h2>
       <pre>@article{coleman2025xttvmed,
   author    = {Coleman, Chris and Becker, Anthony},
   title     = {XTTVS-MED: Real-Time Semantic 4-Bit Voice Cloning to Prevent Medical Miscommunication},

       </div>
     </section>
+    <!-- Translation + Quick LoRA -->
+    <section id="translation" class="mb-5">
+      <h2>4. Translation & Quick LoRA Epoch Training</h2>
+      <p>
+        XTTVS-MED auto-detects 50+ languages in ≤200 ms via an acoustic n-gram classifier.
+        For unsupported dialects, a <strong>quick LoRA epoch</strong>, using 1–2 hrs of local audio, adapts the base model in under 30 minutes.
+      </p>
+      <div class="diagram mermaid">
+flowchart LR
+  D["Dialect Audio (1–2 hrs)"]
+  --> P["Preprocess & Align"]
+  --> T["Train LoRA Adapters<br/>(5–10 epochs)"]
+  --> U["Updated Adapters"]
+  --> M["Inference Pipeline"]
+      </div>
+      <ul>
+        <li><strong>Step 1:</strong> Record 1–2 hrs of target dialect speech.</li>
+        <li><strong>Step 2:</strong> Extract mel-spectrograms and align them to transcripts.</li>
+        <li><strong>Step 3:</strong> Train LoRA adapters for speaker + dialect (5–10 epochs, under 30 min).</li>
+        <li><strong>Step 4:</strong> Deploy the updated adapters; the new dialect is available immediately.</li>
+      </ul>
+    </section>
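
Review note: the new section names LoRA adapters but does not show what is being trained. The sketch below illustrates the general LoRA idea only (a frozen weight matrix W plus a scaled low-rank update B·A); it is not taken from XTTVS-MED. The function names, the rank, the `alpha` scale, and the layer dimensions are all hypothetical choices made for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_init(d_out, d_in, rank, seed=0):
    """Create LoRA factors (hypothetical helper).

    A (rank x d_in) starts small and random; B (d_out x rank) starts at zero,
    so the adapted layer initially behaves exactly like the frozen base layer.
    """
    rng = random.Random(seed)
    a = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
    b = [[0.0] * rank for _ in range(d_out)]
    return a, b


def merged_weight(w, a, b, alpha):
    """Return W + (alpha / rank) * B @ A without modifying the frozen W."""
    scale = alpha / len(a)
    delta = matmul(b, a)
    return [
        [wv + scale * dv for wv, dv in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_params(d_out, d_in, rank):
    """Return (adapter parameter count, full weight parameter count)."""
    return rank * (d_in + d_out), d_out * d_in


# Assumed toy sizes: a 3x4 layer with rank-2 adapters.
w = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
a, b = lora_init(d_out=3, d_in=4, rank=2)
w_adapted = merged_weight(w, a, b, alpha=4.0)
```

Because only A and B are updated, a hypothetical 1024×1024 layer at rank 8 would train 16,384 adapter parameters instead of 1,048,576, which is the usual argument for why a short adaptation run on a small dialect corpus is plausible. Whether that fits the 30-minute budget claimed above depends on details the page does not give.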
+    <!-- Clinical Impact -->
     <section id="impact" class="mb-5">
+      <h2>5. Clinical Impact & Data Science</h2>
       <p>
         Each second saved in emergency care reduces mortality risk by ~7%. XTTVS-MED’s
+        200 ms detection + &lt;1 s synthesis can improve survival by 10–15% for non-native speakers.
       </p>
       <div class="row">
         <div class="col-md-6">
           <p><strong>Dataset & Validation:</strong></p>
           <ul>
             <li>600 hrs multilingual clinical dialogues</li>
+            <li>ANOVA on MOS (p &lt; 0.01)</li>
             <li>Speaker similarity ≥ 92%; intelligibility MOS ≥ 4.5/5</li>
           </ul>
         </div>
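
Review note: this commit escapes the `<` in `p < 0.01`, and other literal `<` characters in text content need the same treatment. A minimal sketch, assuming the page text is generated or checked from Python, uses the standard library's `html.escape`; the wrapper name `escape_text` is hypothetical.

```python
from html import escape


def escape_text(text):
    """Escape &, <, and > for HTML text content (quotes are left as-is)."""
    return escape(text, quote=False)


print(escape_text("ANOVA on MOS (p < 0.01)"))   # ANOVA on MOS (p &lt; 0.01)
print(escape_text("200 ms detection + <1 s synthesis"))
```

Passing `quote=False` keeps quotation marks untouched, which is fine for element text but not for attribute values.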
     <!-- BibTeX -->
     <section id="bibtex" class="mb-5">
+      <h2>6. BibTeX</h2>
       <pre>@article{coleman2025xttvmed,
   author    = {Coleman, Chris and Becker, Anthony},
   title     = {XTTVS-MED: Real-Time Semantic 4-Bit Voice Cloning to Prevent Medical Miscommunication},