Update README.md
README.md
CHANGED
|
@@ -6,22 +6,44 @@ colorTo: indigo
|
|
| 6 |
sdk: static
|
| 7 |
pinned: false
|
| 8 |
license: apache-2.0
|
| 9 |
-
short_description: My First
|
| 10 |
---
|
| 11 |
|
| 12 |
-
#
|
| 13 |
|
| 14 |
-
|
| 15 |
|
| 16 |
-
|
| 17 |
-
```
|
| 18 |
-
@article{park2021nerfies
|
| 19 |
-
author = {Park, Keunhong and Sinha, Utkarsh and Barron, Jonathan T. and Bouaziz, Sofien and Goldman, Dan B and Seitz, Steven M. and Martin-Brualla, Ricardo},
|
| 20 |
-
title = {Nerfies: Deformable Neural Radiance Fields},
|
| 21 |
-
journal = {ICCV},
|
| 22 |
-
year = {2021},
|
| 23 |
-
}
|
| 24 |
-
```
|
| 25 |
|
| 26 |
-
|
| 27 |
-
|
| 6 |
sdk: static
|
| 7 |
pinned: false
|
| 8 |
license: apache-2.0
|
| 9 |
+
short_description: My First Binary Tree into AI
|
| 10 |
---
|
| 11 |
|
| 12 |
+
# GHOSTVOICECBR
|
| 13 |
|
| 14 |
+
**GHOSTVOICECBR** is a real-time voice cloning framework built using a novel Case-Based Reasoning (CBR) binary tree and quadrant-based tone mapping system. It enables low-latency, agentic voice synthesis using emotional quads (pitch, timbre, speed, and mood) mapped onto cloned neural speech vectors.
|
| 15 |
|
| 16 |
+
This system is the voice synthesis engine behind the **GhostVoice AI project**, capable of generating cloned speech in real time, optimized for 8–12 GB VRAM and deployable via Hugging Face Spaces, Gradio, or Twitch integrations.
|
| 17 |
|
| 18 |
+
---
|
| 19 |
+
|
| 20 |
+
## ✨ Features
|
| 21 |
+
|
| 22 |
+
- ✅ **CBR Tree Traversal** for selecting voice tones over time
|
| 23 |
+
- ✅ **Quad Mapping Engine** for emotion-driven synthesis
|
| 24 |
+
- ✅ **Fast Inference** on local hardware (no cloud GPU required)
|
| 25 |
+
- ✅ **Speaker Embedding Support** for personalized cloning
|
| 26 |
+
- ✅ **Live Gradio UI** with console, waveform, and quad controls
|
| 27 |
+
- ✅ **Twitch VoiceBot Ready** (optional module)
|
| 28 |
+
|
| 29 |
+
---
|
| 30 |
+
|
| 31 |
+
## 🧠 Core Concepts
|
| 32 |
+
|
| 33 |
+
- **CBR Binary Tree**: Stores and retrieves historical tone vectors efficiently.
|
| 34 |
+
- **Quad Mapping**: Each speech sample is mapped using a 4D vector:
|
| 35 |
+
  - `pitch`
|
| 36 |
+
  - `speed`
|
| 37 |
+
  - `timbre`
|
| 38 |
+
  - `emotion`
|
| 39 |
+
- **Voice Matching**: Nearest neighbor match + synthetic generation
|
| 40 |
+
- **Open Format**: Easily extensible to other TTS or audio-generation models and APIs (Bark, MusicGen, etc.)
|
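The quad mapping and voice matching ideas above can be illustrated with a minimal sketch. It is not the project's actual implementation: the `Quad` class, the `nearest_case` helper, the case labels, and the 0 to 1 value ranges are all hypothetical, and a plain linear scan stands in for the CBR binary tree, whose structure the README does not describe.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Quad:
    """Hypothetical 4D tone vector; the 0..1 range is an assumption."""
    pitch: float
    speed: float
    timbre: float
    emotion: float

    def as_tuple(self) -> tuple:
        return (self.pitch, self.speed, self.timbre, self.emotion)


def distance(a: Quad, b: Quad) -> float:
    # Euclidean distance between two quads (math.dist needs Python 3.8+).
    return math.dist(a.as_tuple(), b.as_tuple())


def nearest_case(query: Quad, cases: dict) -> str:
    # Linear scan over stored cases; returns the label of the closest quad.
    return min(cases, key=lambda label: distance(query, cases[label]))


# Illustrative case base; the labels and values are made up.
cases = {
    "calm": Quad(0.3, 0.4, 0.5, 0.2),
    "excited": Quad(0.8, 0.9, 0.6, 0.9),
    "somber": Quad(0.2, 0.3, 0.4, 0.1),
}

query = Quad(0.75, 0.8, 0.6, 0.85)
best = nearest_case(query, cases)
print(best)
```

A linear scan is adequate for a handful of stored cases. A tree-based index would matter only once the case base grows large enough for lookup time to affect latency.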
| 41 |
+
|
| 42 |
+
---
|
| 43 |
+
|
| 44 |
+
## 🚀 How to Use
|
| 45 |
+
|
| 46 |
+
1. Clone or fork this repo
|
| 47 |
+
2. Install dependencies:
|
| 48 |
+
```bash
|
| 49 |
+
pip install -r requirements.txt
|