Commit 5ace8a9 by tommytracx (verified)
1 Parent(s): 588b9b6

Update README.md

README.md CHANGED (+56 -6)
@@ -1,9 +1,59 @@
  # AGI Telecom POC
 
- This is a full stack voice interface system powered by LLM, STT, TTS, and WebRTC-ready frontend.
 
- ## Quick Start
- ```bash
- pip install -r requirements.txt
- uvicorn app.main:app --reload
- ```
  # AGI Telecom POC
 
+ This Hugging Face Space demonstrates an AGI-powered telecom interface that enables voice and text interaction through telecommunication channels (WebRTC/SIP).
 
+ ## Overview
+
+ This proof-of-concept showcases how AI assistants can be delivered through telecom infrastructure with:
+
+ - Multimodal communication (voice + text)
+ - Agentic intelligence (reasoning, memory)
+ - Telecom-enabled delivery
+
+ ## Demo Usage
+
+ This Space provides two ways to interact with the system:
+
+ 1. **Gradio Interface**: A simplified interface that demonstrates core functionality
+    - Upload audio or use text input
+    - Get transcriptions, agent responses, and synthesized speech
+    - Manage conversation sessions
+
+ 2. **API Endpoints**: Direct API access for more advanced integration
+    - `/api/transcribe` - Convert audio to text
+    - `/api/query` - Process text with the agent
+    - `/api/speak` - Convert text to speech
+    - `/api/session` - Create new conversation sessions
+
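The endpoint list above could be exercised from a small client along these lines. This is only a sketch: the README names the paths but not the HTTP method, request bodies, or response format, so the `POST` method, the JSON payloads, and the field names `session_id` and `text` are assumptions, as is the base URL (taken from the local development instructions). The helper names `build_json_request` and `call` are hypothetical.

```python
import json
from urllib import request

# Assumption: the app is running locally on port 8000.
BASE_URL = "http://localhost:8000"


def build_json_request(path: str, payload: dict) -> request.Request:
    """Build (but do not send) a JSON POST request for one endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return request.Request(
        url=BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(req: request.Request) -> dict:
    """Send a prepared request and decode the response, assumed to be JSON."""
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Hypothetical payloads; the real field names may differ.
session_req = build_json_request("/api/session", {})
query_req = build_json_request("/api/query", {"session_id": "demo", "text": "Hello"})
speak_req = build_json_request("/api/speak", {"text": "Hello"})
```

With the server running, `call(query_req)` would send the request. Building and sending are kept separate so the payloads can be inspected before anything goes over the network.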
+ ## Architecture
+
+ The system follows this processing flow:
+
+ ```
+ [User Voice Input] → [Speech-to-Text] → [Agent Reasoning] → [Text-to-Speech Output] → [Telecom Network Delivery]
+ ```
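As a rough illustration of this flow, the stages can be chained as plain functions. Everything here is a hypothetical stand-in rather than the project's actual code: the function names, the `Session` class, and the `"webrtc"` channel label are assumptions, and each stage is a trivial mock in the spirit of the demo.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Hypothetical per-conversation memory: a list of user utterances."""
    history: list = field(default_factory=list)


def mock_stt(audio: bytes) -> str:
    # Mock speech-to-text: pretend the audio bytes are already UTF-8 text.
    return audio.decode("utf-8")


def mock_agent(text: str, session: Session) -> str:
    # Mock reasoning: remember the utterance and echo it with a turn count.
    session.history.append(text)
    return f"You said: {text} (turn {len(session.history)})"


def mock_tts(text: str) -> bytes:
    # Mock text-to-speech: encode the reply text as bytes.
    return text.encode("utf-8")


def deliver(audio: bytes) -> dict:
    # Mock telecom delivery: wrap the audio in an envelope naming a channel.
    return {"channel": "webrtc", "payload": audio}


def handle_turn(audio: bytes, session: Session) -> dict:
    """Run one turn through STT -> agent -> TTS -> delivery."""
    text = mock_stt(audio)
    reply = mock_agent(text, session)
    speech = mock_tts(reply)
    return deliver(speech)
```

Calling `handle_turn(b"hello", Session())` walks a single utterance through all four stages. Because the session is passed in explicitly, the agent stage is the only one that carries state between turns.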
+
+ ## Local Development
+
+ To run this project locally:
+
+ 1. Clone the repository
+ 2. Install dependencies: `pip install -r requirements.txt`
+ 3. Run the app: `python app.py`
+ 4. Open http://localhost:8000 in your browser
+
+ ## Notes
+
+ - This demo uses simplified mock implementations
+ - For production use, you would replace the mock functions with:
+   - Whisper for speech-to-text
+   - A proper LLM (such as Llama or Mistral) for reasoning
+   - A high-quality TTS engine
+   - A full WebRTC/SIP implementation
+
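One way to keep the mocks replaceable is to have the pipeline depend on a small interface instead of a concrete function. The sketch below uses `typing.Protocol` for the speech-to-text stage only. The names `SpeechToText`, `MockSpeechToText`, `FixedTranscriptSTT`, and `VoicePipeline` are hypothetical, and no real Whisper API is shown; a production backend would simply be another class with the same `transcribe` method.

```python
from typing import Protocol


class SpeechToText(Protocol):
    """Interface any speech-to-text backend is assumed to satisfy."""

    def transcribe(self, audio: bytes) -> str: ...


class MockSpeechToText:
    """Mock backend that ignores the audio and returns a canned string."""

    def transcribe(self, audio: bytes) -> str:
        return "mock transcript"


class FixedTranscriptSTT:
    """Placeholder for a real backend; returns whatever it was configured with."""

    def __init__(self, transcript: str) -> None:
        self.transcript = transcript

    def transcribe(self, audio: bytes) -> str:
        return self.transcript


class VoicePipeline:
    """Depends only on the SpeechToText interface, not on a specific backend."""

    def __init__(self, stt: SpeechToText) -> None:
        self.stt = stt

    def transcribe(self, audio: bytes) -> str:
        return self.stt.transcribe(audio)


demo = VoicePipeline(MockSpeechToText())
swapped = VoicePipeline(FixedTranscriptSTT("hello from another backend"))
```

Swapping backends then means changing only the object passed to `VoicePipeline`. The same pattern would apply to the reasoning and text-to-speech stages.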
+ ## Future Extensions
+
+ - Full SIP integration
+ - Mesh networking with fallback intelligence
+ - Enhanced multi-agent collaboration
+ - Advanced contextual reasoning