Commit a8b5cb5 · Parent 115b95d · Upd chat-history README
chat-history.md (+277 −158)
# 🔄 Enhanced Memory System: STM + LTM + Hybrid Context Retrieval

## Overview

The Medical Chatbot now implements an **advanced memory system** with **Short-Term Memory (STM)** and **Long-Term Memory (LTM)** that manages conversation context, semantic knowledge, and conversational continuity. This system goes beyond simple RAG to provide contextually aware responses that remember and build upon previous interactions.

## 🏗️ Architecture

### Memory Hierarchy
```
User Query → Enhanced Memory System → Intelligent Context Selection → LLM Response
                                   ↓
        ┌───────────────────┬───────────────────┬───────────────────┐
        │   STM (5 items)   │  LTM (60 items)   │    RAG Search     │
        │ (Recent Summaries)│ (Semantic Store)  │  (Knowledge Base) │
        └───────────────────┴───────────────────┴───────────────────┘
                                   ↓
                  Gemini Flash Lite Contextual Analysis
                                   ↓
                 Summarized Context + Semantic Knowledge
```

### Memory Types
A minimal data-structure sketch of these tiers follows the lists below.

#### 1. **Short-Term Memory (STM)**
- **Capacity:** 5 recent conversation summaries
- **Content:** Chunked and summarized LLM responses with enriched topics
- **Features:** Semantic deduplication, intelligent merging, topic enrichment
- **Purpose:** Maintain conversational continuity and immediate context

#### 2. **Long-Term Memory (LTM)**
- **Capacity:** 60 semantic chunks (~20 conversational rounds)
- **Content:** FAISS-indexed medical knowledge chunks
- **Features:** Semantic similarity search, usage tracking, smart eviction
- **Purpose:** Provide deep medical knowledge and historical context

#### 3. **RAG Knowledge Base**
- **Content:** External medical knowledge and guidelines
- **Features:** Real-time retrieval, semantic matching
- **Purpose:** Supplement with current medical information
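The README does not show the internal containers for these tiers, so here is a minimal sketch, assuming a per-user record with a bounded STM deque and a FAISS inner-product index for LTM (the names `UserMemory` and `ltm_texts`, and the 384-dim embedding size, are illustrative assumptions, not the repository's code):

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List

import faiss                     # FAISS powers LTM semantic search
import numpy as np


@dataclass
class UserMemory:
    """Hypothetical per-user record mirroring the STM/LTM split above."""
    # STM: at most 5 recent summaries; the deque drops the oldest automatically.
    stm: Deque[Dict] = field(default_factory=lambda: deque(maxlen=5))
    # LTM: up to 60 chunk texts, embedded into a cosine-style FAISS index.
    ltm_texts: List[str] = field(default_factory=list)
    ltm_index: faiss.IndexFlatIP = field(default_factory=lambda: faiss.IndexFlatIP(384))

    def remember_chunk(self, text: str, embedding: np.ndarray) -> None:
        """Store one chunk; a real implementation would evict by usage/recency
        and rebuild the index once the 60-chunk LTM cap is exceeded."""
        vec = embedding.reshape(1, -1).astype(np.float32)
        faiss.normalize_L2(vec)              # unit vectors make inner product == cosine
        self.ltm_texts.append(text)
        self.ltm_index.add(vec)
```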
## 🔧 Key Components

### 1. Enhanced Memory Manager (`memory.py`)

#### STM Management
```python
def get_recent_chat_history(self, user_id: str, num_turns: int = 5) -> List[Dict]:
    """
    Get the most recent STM summaries (not raw Q/A).
    Returns: [{"user": "", "bot": "Topic: ...\n<summary>", "timestamp": time}, ...]
    """
```

**STM Features:**
- **Capacity:** 5 recent conversation summaries
- **Content:** Chunked and summarized LLM responses with enriched topics
- **Deduplication:** Semantic similarity-based merging (≥0.92 identical, ≥0.75 merge; see the sketch after this list)
- **Topic Enrichment:** Uses user question context to generate detailed topics
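The merge logic itself is not reproduced in this README; below is a minimal sketch of the dedupe/merge decision under the two thresholds, assuming a sentence-transformers embedder (`all-MiniLM-L6-v2` is an assumption, and `upsert_summary` is a hypothetical name, not the repo's `_upsert_stm`):

```python
from sentence_transformers import SentenceTransformer, util

_model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedder

IDENTICAL_THRESHOLD = 0.92  # near-identical: replace the older summary
MERGE_THRESHOLD = 0.75      # similar: fuse into one summary

def upsert_summary(stm: list, new: dict) -> None:
    """Hypothetical STM upsert: dedupe/merge a new summary against existing ones."""
    new_emb = _model.encode(new["text"], convert_to_tensor=True)
    for i, old in enumerate(stm):
        old_emb = _model.encode(old["text"], convert_to_tensor=True)
        sim = util.cos_sim(new_emb, old_emb).item()
        if sim >= IDENTICAL_THRESHOLD:
            stm[i] = new                                      # keep the newer wording
            return
        if sim >= MERGE_THRESHOLD:
            stm[i]["text"] = f"{old['text']} {new['text']}"   # preserve unique details
            return
    stm.append(new)
    del stm[:-5]  # cap STM at the 5 most recent summaries
```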
#### LTM Management
```python
def get_relevant_chunks(self, user_id: str, query: str, top_k: int = 3, min_sim: float = 0.30) -> List[str]:
    """Return texts of chunks whose cosine similarity ≥ min_sim."""
```

**LTM Features:**
- **Capacity:** 60 semantic chunks (~20 conversational rounds)
- **Indexing:** FAISS-based semantic search (see the sketch after this list)
- **Smart Eviction:** Usage-based decay and recency scoring
- **Merging:** Intelligent deduplication and content fusion
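A sketch of the similarity search behind `get_relevant_chunks`, assuming a normalized inner-product FAISS index so inner product equals cosine similarity (the standalone function name and signature here are illustrative):

```python
import faiss
import numpy as np

def search_ltm(index: faiss.IndexFlatIP, texts: list, query_emb: np.ndarray,
               top_k: int = 3, min_sim: float = 0.30) -> list:
    """Return chunk texts whose cosine similarity to the query is ≥ min_sim."""
    q = query_emb.reshape(1, -1).astype(np.float32)
    faiss.normalize_L2(q)                        # cosine == inner product on unit vectors
    sims, ids = index.search(q, top_k)           # best top_k matches
    return [texts[i] for sim, i in zip(sims[0], ids[0])
            if i != -1 and sim >= min_sim]
```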
#### Enhanced Chunking
```python
def chunk_response(self, response: str, lang: str, question: str = "") -> List[Dict]:
    """
    Enhanced chunking with question context for richer topics.
    Returns: [{"tag": "detailed_topic", "text": "summary"}, ...]
    """
```

**Chunking Features:**
- **Question Context:** Incorporates the user's latest question for topic generation
- **Rich Topics:** Detailed topics (10-20 words) capturing context, condition, and action
- **Medical Focus:** Excludes disclaimers, includes exact medication names/doses
- **Semantic Grouping:** Groups by medical topic, symptom, assessment, plan, or instruction (an illustrative output follows below)
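For concreteness, a `chunk_response` return value in the shape described above might look like this (the medical content is invented purely for illustration):

```python
chunks = [
    {"tag": "Hypertension follow-up: blood pressure target and lisinopril 10 mg daily dosing plan",
     "text": "Target BP below 130/80; continue lisinopril 10 mg once daily and recheck in two weeks."},
    {"tag": "Lifestyle plan for stage 1 hypertension: sodium restriction and regular aerobic exercise",
     "text": "Limit sodium to under 2 g/day and aim for 150 minutes of moderate aerobic exercise weekly."},
]
```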
### 2. Intelligent Context Retrieval

#### Contextual Summarization
```python
def get_contextual_chunks(self, user_id: str, current_query: str, lang: str = "EN") -> str:
    """
    Creates a single, coherent summary from STM + LTM + RAG.
    Returns: A single summary string for the main LLM.
    """
```

**Features:**
- **Unified Summary:** Combines STM (5 turns) + LTM (semantic) + RAG (knowledge)
- **Gemini Analysis:** Uses Gemini Flash Lite for intelligent context selection
- **Conversational Flow:** Maintains continuity while providing medical relevance
- **Fallback Strategy:** Graceful degradation if analysis fails
## 🚀 How It Works

### Step 1: Enhanced Memory Processing
```python
# Process a new exchange: chunk the response, then upsert into STM and LTM
chunks = memory.chunk_response(response, lang, question=query)
for chunk in chunks:
    memory._upsert_stm(user_id, chunk, lang)  # STM with dedupe/merge
memory._upsert_ltm(user_id, chunks, lang)     # LTM with semantic storage
```
### Step 2: Context Retrieval
```python
# Get STM summaries (5 recent turns)
recent_history = memory.get_recent_chat_history(user_id, num_turns=5)

# Get LTM semantic chunks
rag_chunks = memory.get_relevant_chunks(user_id, current_query, top_k=3)

# Get external RAG knowledge
external_rag = retrieve_medical_info(current_query)
```
### Step 3: Intelligent Context Summarization
The system sends all context sources to Gemini Flash Lite for unified summarization:

```
You are a medical assistant creating a concise summary of conversation context for continuity.

Current user query: "{current_query}"

Available context information:
Recent conversation history:
{recent_history}

Semantically relevant historical medical information:
{rag_chunks}

Task: Create a brief, coherent summary that captures the key points from the conversation history and relevant medical information that are important for understanding the current query.

Guidelines:
1. Focus on medical symptoms, diagnoses, treatments, or recommendations mentioned
2. Include any patient concerns or questions that are still relevant
3. Highlight any follow-up needs or pending clarifications
4. Keep the summary concise but comprehensive enough for context
5. Maintain conversational flow and continuity

Output: Provide a single, well-structured summary paragraph that can be used as context for the main LLM to provide a coherent response.
```
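The call that sends this prompt is not reproduced in this README; below is a minimal sketch using the `google-generativeai` client (the model id and the abridged template string are assumptions):

```python
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["FlashAPI"])           # same key as the main LLM
model = genai.GenerativeModel("gemini-2.0-flash-lite")    # assumed Flash Lite model id

# Abridged version of the template above; the full text appears in the prompt block.
PROMPT_TEMPLATE = (
    "You are a medical assistant creating a concise summary of conversation context.\n"
    'Current user query: "{current_query}"\n'
    "Recent conversation history:\n{recent_history}\n"
    "Semantically relevant historical medical information:\n{rag_chunks}\n"
    "Output: Provide a single, well-structured summary paragraph."
)

def summarize_context(current_query: str, recent_history: str, rag_chunks: str) -> str:
    """Fill the template and return Gemini's unified context summary."""
    prompt = PROMPT_TEMPLATE.format(current_query=current_query,
                                    recent_history=recent_history,
                                    rag_chunks=rag_chunks)
    return model.generate_content(prompt).text
```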
### Step 4: Unified Context Integration
The single, coherent summary is integrated into the main LLM prompt (a sketch follows below), providing:
- **Conversational continuity** (from STM summaries)
- **Medical knowledge** (from LTM semantic chunks)
- **Current information** (from external RAG)
- **Unified narrative** (single summary instead of multiple chunks)
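One plausible way to splice the summary into the main prompt; the exact wiring is not shown in this README, so treat this as a sketch:

```python
def build_main_prompt(contextual_summary: str, current_query: str) -> str:
    """Hypothetical final prompt assembly for the main LLM."""
    return (
        "You are a careful medical assistant.\n\n"
        f"Conversation context (summarized):\n{contextual_summary}\n\n"
        f"Patient's current question: {current_query}\n\n"
        "Answer with the context above in mind."
    )
```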
## 📊 Benefits

### 1. **Advanced Memory Management**
- **STM:** Maintains 5 recent conversation summaries with intelligent deduplication
- **LTM:** Stores 60 semantic chunks (~20 rounds) with FAISS indexing
- **Smart Merging:** Combines similar content while preserving unique details
- **Topic Enrichment:** Detailed topics using user question context

### 2. **Intelligent Context Summarization**
- **Unified Summary:** Single coherent narrative instead of multiple chunks
- **Gemini Analysis:** AI-powered context selection and summarization
- **Medical Focus:** Prioritizes symptoms, diagnoses, treatments, and recommendations
- **Conversational Flow:** Maintains natural dialogue continuity

### 3. **Enhanced Chunking & Topics**
- **Question Context:** Incorporates the user's latest question for richer topics
- **Detailed Topics:** 10-20 word descriptions capturing context, condition, and action
- **Medical Precision:** Includes exact medication names, doses, and clinical instructions
- **Semantic Grouping:** Organizes by medical topic, symptom, assessment, plan, or instruction

### 4. **Robust Fallback Strategy**
- **Primary:** Gemini Flash Lite contextual summarization
- **Secondary:** LTM semantic search with usage-based scoring
- **Tertiary:** STM recent summaries
- **Final:** External RAG knowledge base

### 5. **Performance & Scalability**
- **Efficient Storage:** Semantic deduplication reduces memory footprint
- **Fast Retrieval:** FAISS indexing for sub-millisecond LTM search
- **Smart Eviction:** Usage-based decay and recency scoring
- **Minimal Latency:** Optimized for real-time medical consultations
## 🧪 Example Scenarios

### Scenario 1: STM Deduplication & Merging
```
User: "I have chest pain"
Bot: "This could be angina. Symptoms include pressure, tightness, and shortness of breath."

User: "What about chest pain with shortness of breath?"
Bot: "Chest pain with shortness of breath is concerning for angina or heart attack..."

User: "Tell me more about the symptoms"
Bot: "Angina symptoms include chest pressure, tightness, shortness of breath, and may radiate to arms..."
```
**Result:** STM merges similar responses, creating a comprehensive summary: "Patient has chest pain symptoms consistent with angina, including pressure, tightness, shortness of breath, and potential radiation to arms. This represents a concerning cardiac presentation requiring immediate evaluation."

### Scenario 2: LTM Semantic Retrieval
```
User: "What medications should I avoid with my condition?"
Bot: "Based on your previous discussion about hypertension and the medications mentioned..."
```
**Result:** LTM retrieves relevant medical information about hypertension medications and contraindications from previous conversations, even if it is not in recent STM.

### Scenario 3: Enhanced Topic Generation
```
User: "I'm having trouble sleeping"
Bot: "Topic: Sleep disturbance evaluation and management for adult patient with insomnia symptoms"
```
**Result:** The topic incorporates the user's question context to create a detailed, medical-specific description instead of just "Sleep problems."

### Scenario 4: Unified Context Summarization
```
User: "Can you repeat the treatment plan?"
Bot: "Based on our conversation about your hypertension and sleep issues, your treatment plan includes..."
```
**Result:** The system creates a unified summary combining STM (recent sleep discussion), LTM (hypertension history), and RAG (current treatment guidelines) into a single coherent narrative.
## ⚙️ Configuration

### Environment Variables
```bash
FlashAPI=your_gemini_api_key  # For both main LLM and contextual analysis
```
### Enhanced Memory Settings
```python
memory = MemoryManager(
    max_users=1000,       # Maximum users in memory
    history_per_user=5,   # STM capacity (5 recent summaries)
    max_chunks=60         # LTM capacity (~20 conversational rounds)
)
```

### Memory Parameters
```python
# STM retrieval (5 recent turns)
recent_history = memory.get_recent_chat_history(user_id, num_turns=5)

# LTM semantic search
rag_chunks = memory.get_relevant_chunks(user_id, query, top_k=3, min_sim=0.30)

# Unified context summarization
contextual_summary = memory.get_contextual_chunks(user_id, current_query, lang)
```

### Similarity Thresholds
```python
# STM deduplication thresholds
IDENTICAL_THRESHOLD = 0.92  # Replace older with newer
MERGE_THRESHOLD = 0.75      # Merge similar content

# LTM semantic search
MIN_SIMILARITY = 0.30  # Minimum similarity for retrieval
TOP_K = 3              # Number of chunks to retrieve
```
## 🔍 Monitoring & Debugging
|
270 |
|
271 |
+
### Enhanced Logging
|
272 |
+
The system provides comprehensive logging for all memory operations:
|
273 |
```python
|
274 |
+
# STM operations
|
275 |
+
logger.info(f"[Contextual] Retrieved {len(recent_history)} recent history items")
|
276 |
+
logger.info(f"[Contextual] Retrieved {len(rag_chunks)} RAG chunks")
|
277 |
+
|
278 |
+
# Chunking operations
|
279 |
+
logger.info(f"[Memory] 📦 Gemini summarized chunk output: {output}")
|
280 |
+
logger.warning(f"[Memory] ❌ Gemini chunking failed: {e}")
|
281 |
+
|
282 |
+
# Contextual summarization
|
283 |
+
logger.info(f"[Contextual] Gemini created summary: {summary[:100]}...")
|
284 |
+
logger.warning(f"[Contextual] Gemini summarization failed: {e}")
|
285 |
```
|
286 |
|
287 |
### Performance Metrics
|
288 |
+
- **STM Operations:** Deduplication rate, merge frequency, topic enrichment quality
|
289 |
+
- **LTM Operations:** FAISS search latency, semantic similarity scores, eviction patterns
|
290 |
+
- **Context Summarization:** Gemini response time, summary quality, fallback usage
|
291 |
+
- **Memory Usage:** Storage efficiency, retrieval hit rates, cache performance
|
292 |
|
293 |
## 🚨 Error Handling

### Enhanced Fallback Strategy
1. **Primary:** Gemini Flash Lite contextual summarization
2. **Secondary:** LTM semantic search with usage-based scoring
3. **Tertiary:** STM recent summaries
4. **Final:** External RAG knowledge base
5. **Emergency:** No context (minimal response)

A sketch of how this cascade could be wired appears below.
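The dispatch between these tiers is not shown in this README; one plausible wiring, reusing the functions introduced earlier (a sketch, not the repository's code):

```python
def get_context_with_fallback(user_id: str, query: str, lang: str = "EN") -> str:
    """Hypothetical fallback cascade mirroring the five tiers above."""
    try:
        return memory.get_contextual_chunks(user_id, query, lang)   # 1. Gemini summary
    except Exception:
        pass                                                        # fall through on API failure
    chunks = memory.get_relevant_chunks(user_id, query, top_k=3)    # 2. LTM semantic search
    if chunks:
        return "\n".join(chunks)
    recent = memory.get_recent_chat_history(user_id, num_turns=5)   # 3. STM summaries
    if recent:
        return "\n".join(turn["bot"] for turn in recent)
    return retrieve_medical_info(query) or ""                       # 4. external RAG, else 5. none
```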
### Error Scenarios & Recovery
- **Gemini API failure** → Fall back to LTM semantic search
- **LTM corruption** → Rebuild FAISS index from remaining chunks
- **STM corruption** → Reset to empty STM, continue with LTM
- **Memory corruption** → Reset user session, clear all memory
- **Chunking failure** → Store raw response as fallback chunk
## 🔮 Future Enhancements

### 1. **Persistent Memory Storage**
- **Database Integration:** Store LTM in PostgreSQL/SQLite with FAISS index persistence
- **Session Recovery:** Resume conversations after system restarts
- **Memory Export:** Allow users to export their conversation history
- **Cross-device Sync:** Synchronize memory across different devices

### 2. **Advanced Memory Features**
- **Fact Store:** Dedicated storage for critical medical facts (allergies, chronic conditions, medications)
- **Memory Compression:** Summarize older STM entries into LTM when STM overflows
- **Contextual Tags:** Add metadata tags (encounter type, modality, urgency) to bias retrieval
- **Memory Analytics:** Track memory usage patterns and optimize storage strategies

### 3. **Intelligent Memory Management**
- **Adaptive Thresholds:** Dynamically adjust similarity thresholds based on conversation context
- **Memory Prioritization:** Protect critical medical information from eviction
- **Usage-based Retention:** Keep frequently accessed information longer
- **Semantic Clustering:** Group related memories for better organization

### 4. **Enhanced Medical Context**
- **Clinical Decision Support:** Integrate with medical guidelines and protocols
- **Risk Assessment:** Track and alert on potential medical risks across conversations
- **Medication Reconciliation:** Maintain accurate medication lists across sessions
- **Follow-up Scheduling:** Track recommended follow-ups and reminders

### 5. **Multi-modal Memory**
- **Image Memory:** Store and retrieve medical images with descriptions
- **Voice Memory:** Convert voice interactions to text for memory storage
- **Document Memory:** Process and store medical documents and reports
- **Temporal Memory:** Track changes in symptoms and conditions over time
## 📝 Testing

### Memory System Testing
```bash
cd Medical-Chatbot
python test_memory_system.py
```

### Test Scenarios
1. **STM Deduplication Test:** Verify similar responses are merged correctly
2. **LTM Semantic Search Test:** Test FAISS retrieval with various queries
3. **Context Summarization Test:** Validate unified summary generation
4. **Topic Enrichment Test:** Check detailed topic generation with question context
5. **Memory Capacity Test:** Verify STM (5 items) and LTM (60 items) limits
6. **Fallback Strategy Test:** Test system behavior when Gemini API fails

Illustrative versions of scenarios 2 and 5 are sketched below.
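These sketches assume the `MemoryManager` API shown earlier and a reachable Gemini/embedding backend; they are illustrations, not the actual contents of `test_memory_system.py`:

```python
# Sketches of scenarios 2 and 5; names and assertions are illustrative.
from memory import MemoryManager

def test_stm_capacity():
    """Scenario 5: STM should never hold more than 5 summaries."""
    memory = MemoryManager(max_users=10, history_per_user=5, max_chunks=60)
    for i in range(8):  # more exchanges than the STM can hold
        for chunk in memory.chunk_response(f"Advice number {i}.", "EN", question=f"Question {i}?"):
            memory._upsert_stm("user-1", chunk, "EN")
    assert len(memory.get_recent_chat_history("user-1", num_turns=10)) <= 5

def test_ltm_semantic_search():
    """Scenario 2: semantically related chunks should be retrievable from LTM."""
    memory = MemoryManager(max_users=10, history_per_user=5, max_chunks=60)
    chunks = memory.chunk_response("Lisinopril is first-line for high blood pressure.",
                                   "EN", question="What treats hypertension?")
    memory._upsert_ltm("user-1", chunks, "EN")
    assert memory.get_relevant_chunks("user-1", "hypertension medication", top_k=3)
```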
### Expected Behaviors
- **STM:** Similar responses merge, unique details preserved
- **LTM:** Semantic search returns relevant chunks with usage tracking
- **Topics:** Detailed, medical-specific descriptions (10-20 words)
- **Summaries:** Coherent narratives combining STM + LTM + RAG
- **Performance:** Sub-second retrieval times for all operations
364 |
## 🎯 Summary
|
365 |
|
366 |
+
The enhanced memory system transforms the Medical Chatbot into a sophisticated, memory-aware medical assistant that:
|
367 |
+
|
368 |
+
✅ **Maintains Short-Term Memory (STM)** with 5 recent conversation summaries and intelligent deduplication
|
369 |
+
✅ **Provides Long-Term Memory (LTM)** with 60 semantic chunks and FAISS-based retrieval
|
370 |
+
✅ **Generates Enhanced Topics** using question context for detailed, medical-specific descriptions
|
371 |
+
✅ **Creates Unified Summaries** combining STM + LTM + RAG into coherent narratives
|
372 |
+
✅ **Implements Smart Merging** that preserves unique details while eliminating redundancy
|
373 |
+
✅ **Ensures Conversational Continuity** across extended medical consultations
|
374 |
+
✅ **Optimizes Performance** with sub-second retrieval and efficient memory management
|
375 |
|
376 |
+
This advanced memory system addresses the limitations of simple RAG systems by providing:
|
377 |
+
- **Intelligent context management** that remembers and builds upon previous interactions
|
378 |
+
- **Medical precision** with detailed topics and exact clinical information
|
379 |
+
- **Scalable architecture** that can handle extended conversations without performance degradation
|
380 |
+
- **Robust fallback strategies** ensuring system reliability in all scenarios
|
381 |
|
382 |
+
The result is a medical chatbot that truly understands conversation context, remembers patient history, and provides increasingly relevant and personalized medical guidance over time.
|