SaritMeshesha committed on
Commit 3e090a6 · verified · 1 Parent(s): 186205a

Upload 2 files

Files changed (2)
  1. README.md +309 -27
  2. app.py +2 -2
README.md CHANGED
@@ -1,46 +1,328 @@
1
  ---
2
- title: LangGraph Data Analyst Agent (Debug)
3
- emoji: 🔧
4
  colorFrom: blue
5
  colorTo: purple
6
  sdk: streamlit
7
  sdk_version: "1.28.0"
8
- app_file: debug_app.py
9
  pinned: false
10
  license: mit
11
  ---
12
 
13
- # 🔧 LangGraph Data Analyst Agent - Debug Mode
14
 
15
- **Temporary debug version to diagnose deployment issues**
16
 
17
- This debug tool will help identify what's causing the "thinking" hang in your deployment.
18
 
19
- ## 🚀 Quick Steps:
20
 
21
- 1. **Upload `debug_app.py` to your Space**
22
- 2. **Replace your README.md with this version**
23
- 3. **Wait for Space to restart**
24
- 4. **Run the debug tests**
25
- 5. **Check the results and error messages**
26
 
27
- ## 🔍 What This Debug Tool Checks:
28
 
29
- ✅ Python environment and packages
30
- ✅ API key configuration
31
- ✅ LangGraph agent import
32
- ✅ Dataset loading
33
- ✅ Simple agent test
34
- ✅ Error details and stack traces
35
 
36
- ## 📋 Expected Results:
37
 
38
- The debug tool will show you exactly where the problem is:
39
- - Import errors
40
- - API key issues
41
- - Network connectivity problems
42
- - LangGraph workflow errors
43
 
44
- ## 🔧 After Debugging:
45
 
46
- Once you identify the issue, switch back to the main app by updating README.md to use `app_file: app.py`

1
  ---
2
+ title: LangGraph Data Analyst Agent
3
+ emoji: 🤖
4
  colorFrom: blue
5
  colorTo: purple
6
  sdk: streamlit
7
  sdk_version: "1.28.0"
8
+ app_file: app.py
9
  pinned: false
10
  license: mit
11
  ---
12
 
13
+ # 🤖 LangGraph Data Analyst Agent
14
 
15
+ An intelligent data analyst agent built with LangGraph that analyzes customer support conversations with advanced memory, conversation persistence, and query recommendations.
16
 
17
+ ## 🌟 Features
18
 
19
+ ### Core Functionality
20
+ - **Multi-Agent Architecture**: Separate specialized agents for structured and unstructured queries
21
+ - **Query Classification**: Automatic routing to appropriate agent based on query type
22
+ - **Rich Tool Set**: Comprehensive tools for data analysis and insights
23
 
24
+ ### Advanced Memory & Persistence
25
+ - **Session Management**: Persistent conversations across page reloads and browser sessions
26
+ - **User Profile Tracking**: Agent learns and remembers user interests and preferences
27
+ - **Conversation History**: Full context retention using LangGraph checkpointers
28
+ - **Cross-Session Continuity**: Resume conversations using session IDs
29
 
30
+ ### Intelligent Recommendations
31
+ - **Query Suggestions**: AI-powered recommendations based on conversation history
32
+ - **Interactive Refinement**: Collaborative query building with the agent
33
+ - **Context-Aware**: Suggestions based on user profile and previous interactions
34
 
35
+ ## 🏗️ Architecture
36
 
37
+ The agent uses LangGraph's multi-agent architecture with the following components:
38
 
39
+ ```
40
+ User Query → Classifier → [Structured Agent | Unstructured Agent | Recommender] → Summarizer → Response
41
+ ↓
42
+ Tool Nodes (Dataset Analysis Tools)
43
+ ```
44
 
45
+ ### Agent Types
46
+ 1. **Structured Agent**: Handles quantitative queries (statistics, examples, distributions)
47
+ 2. **Unstructured Agent**: Handles qualitative queries (summaries, insights, patterns)
48
+ 3. **Query Recommender**: Suggests follow-up questions based on context
49
+ 4. **Summarizer**: Updates user profile and conversation memory
50
 
51
+ ## πŸš€ Setup Instructions
52
+
53
+ ### Prerequisites
54
+ - **Python Version**: 3.9 or higher
55
+ - **API Key**: OpenAI API key or Nebius API key
56
+ - **For Hugging Face Spaces**: Ensure your API key is set as a Space secret
57
+
58
+ ### Installation
59
+
60
+ 1. **Clone the repository**:
61
+ ```bash
62
+ git clone <repository-url>
63
+ cd Agents
64
+ ```
65
+
66
+ 2. **Install dependencies**:
67
+ ```bash
68
+ pip install -r requirements.txt
69
+ ```
70
+
71
+ 3. **Configure API Key**:
72
+
73
+ Create a `.env` file in the project root:
74
+ ```bash
75
+ # For OpenAI (recommended)
76
+ OPENAI_API_KEY=your_openai_api_key_here
77
+
78
+ # OR for Nebius
79
+ NEBIUS_API_KEY=your_nebius_api_key_here
80
+ ```
81
+
82
+ 4. **Run the application**:
83
+ ```bash
84
+ streamlit run app.py
85
+ ```
86
+
87
+ 5. **Access the app**:
88
+ Open your browser to `http://localhost:8501`
89
+
90
+ ### Alternative Deployment
91
+
92
+ #### For Hugging Face Spaces:
93
+ 1. **Fork or upload this repository to Hugging Face Spaces**
94
+ 2. **Set your API key as a Space secret:**
95
+ - Go to your Space settings
96
+ - Navigate to "Variables and secrets"
97
+ - Add a secret named `NEBIUS_API_KEY` or `OPENAI_API_KEY`
98
+ - Enter your API key as the value
99
+ 3. **The app will start automatically**
100
+
101
+ #### For other cloud deployment:
102
+ ```bash
103
+ export OPENAI_API_KEY=your_api_key_here
104
+ # OR
105
+ export NEBIUS_API_KEY=your_api_key_here
106
+ ```
107
+
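The setup steps above name two environment variables, `OPENAI_API_KEY` and `NEBIUS_API_KEY`. A minimal sketch of how an app might pick whichever one is set is shown here; the helper name `resolve_api_key` and the lookup order (OpenAI first) are assumptions for illustration and are not taken from `app.py`.

```python
import os
from typing import Mapping, Optional, Tuple

# Variable names come from the README; the lookup order is an assumption.
KEY_VARS = ("OPENAI_API_KEY", "NEBIUS_API_KEY")


def resolve_api_key(env: Optional[Mapping[str, str]] = None) -> Tuple[str, str]:
    """Return (variable_name, key) for the first non-empty key found."""
    env = os.environ if env is None else env
    for name in KEY_VARS:
        value = env.get(name, "").strip()
        if value:
            return name, value
    # Fail early with a message that names the variables to set.
    raise RuntimeError("No API key found; set one of: " + ", ".join(KEY_VARS))
```

Failing at startup with a message that names both variables tends to be easier to diagnose on a hosted Space than a provider error raised on the first query.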
108
+ ## 🎯 Usage Guide
109
+
110
+ ### Query Types
111
+
112
+ #### Structured Queries (Quantitative Analysis)
113
+ - "How many records are in each category?"
114
+ - "What are the most common customer issues?"
115
+ - "Show me 5 examples of billing problems"
116
+ - "Get distribution of intents"
117
+
118
+ #### Unstructured Queries (Qualitative Analysis)
119
+ - "Summarize the refund category"
120
+ - "What patterns do you see in payment issues?"
121
+ - "Analyze customer sentiment in billing conversations"
122
+ - "What insights can you provide about technical support?"
123
+
124
+ #### Memory & Recommendations
125
+ - "What do you remember about me?"
126
+ - "What should I query next?"
127
+ - "Advise me what to explore"
128
+ - "Recommend follow-up questions"
129
+
130
+ ### Session Management
131
+
132
+ #### Creating Sessions
133
+ - **New Session**: Click "🆕 New Session" to start fresh
134
+ - **Auto-Generated**: Each new browser session gets a unique ID
135
+
136
+ #### Resuming Sessions
137
+ 1. Copy your session ID from the sidebar (e.g., `a1b2c3d4...`)
138
+ 2. Enter the full session ID in "Join Existing Session"
139
+ 3. Click "🔗 Join Session" to resume
140
+
141
+ #### Cross-Tab Persistence
142
+ - Open multiple tabs with the same session ID
143
+ - Conversations sync across all tabs
144
+ - Memory and user profile persist
145
+
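The session workflow above relies on unique IDs that a user can copy and paste back in full. Since `app.py` imports `uuid`, the IDs are plausibly UUID strings, but that is an assumption; the function names below are hypothetical. The sketch shows one way to create an ID and to validate a pasted one before joining.

```python
import uuid


def new_session_id() -> str:
    """Generate a fresh, random session ID (assumed here to be a UUID4 string)."""
    return str(uuid.uuid4())


def normalize_session_id(raw: str) -> str:
    """Validate a pasted session ID and return its canonical form.

    Raises ValueError if the text is not a well-formed UUID, which also
    rejects truncated IDs such as a sidebar preview.
    """
    return str(uuid.UUID(raw.strip()))
```

Normalizing before lookup means stray whitespace or uppercase letters in a pasted ID still resolve to the same session.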
146
+ ## 🧠 Memory System
147
+
148
+ ### User Profile Tracking
149
+ The agent automatically tracks:
150
+ - **Interests**: Topics and categories you frequently ask about
151
+ - **Expertise Level**: Inferred from question complexity (beginner/intermediate/advanced)
152
+ - **Preferences**: Analysis style preferences (quantitative vs qualitative)
153
+ - **Query History**: Recent questions for context
154
+
155
+ ### Conversation Persistence
156
+ - **Thread-based**: Each session has a unique thread ID
157
+ - **Checkpoint System**: LangGraph automatically saves state after each interaction
158
+ - **Cross-Session**: Resume conversations days or weeks later
159
+
160
+ ### Memory Queries
161
+ Ask the agent what it remembers:
162
+ ```
163
+ "What do you remember about me?"
164
+ "What are my interests?"
165
+ "What have I asked about before?"
166
+ ```
167
+
168
+ ## 🔧 Testing the Agent
169
+
170
+ ### Basic Functionality Tests
171
+
172
+ 1. **Classification Test**:
173
+ ```
174
+ Query: "How many categories are there?"
175
+ Expected: Routes to Structured Agent → Uses get_dataset_stats tool
176
+ ```
177
+
178
+ 2. **Follow-up Memory Test**:
179
+ ```
180
+ Query 1: "Show me billing examples"
181
+ Query 2: "Show me more examples"
182
+ Expected: Agent remembers previous context about billing
183
+ ```
184
+
185
+ 3. **User Profile Test**:
186
+ ```
187
+ Query 1: "I'm interested in refund patterns"
188
+ Query 2: "What do you remember about me?"
189
+ Expected: Agent mentions interest in refunds
190
+ ```
191
+
192
+ 4. **Recommendation Test**:
193
+ ```
194
+ Query: "What should I query next?"
195
+ Expected: Personalized suggestions based on history
196
+ ```
197
+
198
+ ### Advanced Feature Tests
199
+
200
+ 1. **Session Persistence**:
201
+ - Ask a question, reload the page
202
+ - Verify conversation history remains
203
+ - Verify user profile persists
204
+
205
+ 2. **Cross-Session Memory**:
206
+ - Note your session ID
207
+ - Close browser completely
208
+ - Reopen and join the same session
209
+ - Verify full conversation and profile restoration
210
+
211
+ 3. **Interactive Recommendations**:
212
+ ```
213
+ User: "Advise me what to query next"
214
+ Agent: "Based on your interest in billing, you might want to analyze refund patterns."
215
+ User: "I'd rather see examples instead"
216
+ Agent: "Then I suggest showing 5 examples of refund requests."
217
+ User: "Please do so"
218
+ Expected: Agent executes the refined query
219
+ ```
220
+
221
+ ## 📁 File Structure
222
+
223
+ ```
224
+ Agents/
225
+ ├── README.md # This file
226
+ ├── requirements.txt # Python dependencies
227
+ ├── .env # API keys (create this)
228
+ ├── app.py # LangGraph Streamlit app
229
+ ├── langgraph_agent.py # LangGraph agent implementation
230
+ ├── agent-memory.ipynb # Memory example notebook
231
+ ├── test_agent.py # Test suite
232
+ └── DEPLOYMENT_GUIDE.md # Original deployment guide
233
+ ```
234
+
235
+ ## 🛠️ Technical Implementation
236
+
237
+ ### LangGraph Components
238
+
239
+ **State Management**:
240
+ ```python
241
+ class AgentState(TypedDict):
242
+     messages: List[Any]
243
+     query_type: Optional[str]
244
+     user_profile: Optional[Dict[str, Any]]
245
+     session_context: Optional[Dict[str, Any]]
246
+ ```
247
+
248
+ **Tool Categories**:
249
+ - **Structured Tools**: Statistics, distributions, examples, search
250
+ - **Unstructured Tools**: Summaries, insights, pattern analysis
251
+ - **Memory Tools**: Profile updates, preference tracking
252
+
253
+ **Graph Flow**:
254
+ 1. **Classifier**: Determines query type
255
+ 2. **Agent Selection**: Routes to appropriate specialist
256
+ 3. **Tool Execution**: Dynamic tool usage based on needs
257
+ 4. **Memory Update**: Profile and context updates
258
+ 5. **Response Generation**: Final answer with memory integration
259
+
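The first two steps of the graph flow above (classify, then route) can be sketched in plain Python. This is an illustration under stated assumptions, not the code in `langgraph_agent.py`: the keyword lists, the node names, and the functions `classify` and `route` are hypothetical, and a real classifier would more likely call an LLM than match keywords.

```python
from typing import Any, Dict, List, Optional, TypedDict


class AgentState(TypedDict):
    messages: List[Any]
    query_type: Optional[str]
    user_profile: Optional[Dict[str, Any]]
    session_context: Optional[Dict[str, Any]]


# Hypothetical keyword lists standing in for an LLM-based classifier.
RECOMMEND_HINTS = ("recommend", "advise", "what should i")
UNSTRUCTURED_HINTS = ("summarize", "pattern", "insight", "sentiment", "analyze")


def classify(state: AgentState) -> AgentState:
    """Step 1: label the latest message with a query type."""
    text = str(state["messages"][-1]).lower()
    if any(hint in text for hint in RECOMMEND_HINTS):
        query_type = "recommender"
    elif any(hint in text for hint in UNSTRUCTURED_HINTS):
        query_type = "unstructured"
    else:
        query_type = "structured"
    # Return a new state rather than mutating the input.
    return {**state, "query_type": query_type}


def route(state: AgentState) -> str:
    """Step 2: map the query type to the name of the next node."""
    return {
        "structured": "structured_agent",
        "unstructured": "unstructured_agent",
        "recommender": "recommender",
    }[state["query_type"]]
```

A function shaped like `route`, which takes the state and returns the next node's name, is the kind of callable a graph framework typically accepts for conditional routing.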
260
+ ### Memory Architecture
261
+
262
+ **Checkpointer**: LangGraph's `MemorySaver` for conversation persistence
263
+ **Thread Management**: Unique thread IDs for session isolation
264
+ **Profile Synthesis**: LLM-powered extraction of user characteristics
265
+ **Context Retention**: Full conversation history with temporal awareness
266
+
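The thread-based isolation described above can be illustrated with a toy store that keeps one list of state snapshots per thread ID. This is a simplified stand-in written for illustration, not LangGraph's `MemorySaver`; the class name `InMemoryCheckpoints` and its methods are hypothetical.

```python
import copy
from collections import defaultdict
from typing import Any, Dict, List, Optional


class InMemoryCheckpoints:
    """Toy stand-in for a checkpointer: one list of snapshots per thread."""

    def __init__(self) -> None:
        self._threads: Dict[str, List[Dict[str, Any]]] = defaultdict(list)

    def save(self, thread_id: str, state: Dict[str, Any]) -> None:
        # Deep-copy so later mutation of `state` cannot alter saved history.
        self._threads[thread_id].append(copy.deepcopy(state))

    def latest(self, thread_id: str) -> Optional[Dict[str, Any]]:
        # .get avoids creating an empty entry for unknown threads.
        snapshots = self._threads.get(thread_id)
        return copy.deepcopy(snapshots[-1]) if snapshots else None
```

Because a store like this lives in process memory, its contents are lost when the process restarts, which is consistent with the README listing a PostgreSQL checkpointer under Future Enhancements.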
267
+ ## 🔍 Troubleshooting
268
+
269
+ ### Common Issues
270
+
271
+ 1. **API Key Errors**:
272
+ - Verify `.env` file exists and has correct key
273
+ - Check environment variable is set in deployment
274
+ - Ensure API key has sufficient credits
275
+
276
+ 2. **Memory Not Persisting**:
277
+ - Verify session ID remains consistent
278
+ - Check browser localStorage not being cleared
279
+ - Ensure thread_id parameter is passed correctly
280
+
281
+ 3. **Dataset Loading Issues**:
282
+ - Check internet connection for Hugging Face datasets
283
+ - Verify datasets library is installed
284
+ - Try clearing Streamlit cache: `streamlit cache clear`
285
+
286
+ 4. **Tool Execution Errors**:
287
+ - Verify all dependencies in requirements.txt are installed
288
+ - Check dataset is properly loaded
289
+ - Review error messages in Streamlit interface
290
+
291
+ ### Debug Mode
292
+
293
+ Enable debug logging by setting:
294
+ ```python
295
+ import logging
296
+ logging.basicConfig(level=logging.DEBUG)
297
+ ```
298
+
299
+ ## 🎓 Learning Objectives
300
+
301
+ This implementation demonstrates:
302
+
303
+ 1. **LangGraph Multi-Agent Systems**: Specialized agents for different query types
304
+ 2. **Memory & Persistence**: Conversation continuity across sessions
305
+ 3. **Tool Integration**: Dynamic tool selection and execution
306
+ 4. **State Management**: Complex state updates and routing
307
+ 5. **User Experience**: Session management and interactive features
308
+
309
+ ## 🚀 Future Enhancements
310
+
311
+ Potential improvements:
312
+ - **Database Persistence**: Replace MemorySaver with PostgreSQL checkpointer
313
+ - **Advanced Analytics**: More sophisticated data analysis tools
314
+ - **Export Features**: PDF/CSV report generation
315
+ - **User Authentication**: Multi-user support with profiles
316
+ - **Real-time Collaboration**: Shared sessions between users
317
+
318
+ ## 📄 License
319
+
320
+ This project is for educational purposes as part of a data science curriculum.
321
+
322
+ ## 🤝 Contributing
323
+
324
+ This is an assignment project. For questions or issues, please contact the course instructors.
325
+
326
+ ---
327
+
328
+ **Built with**: LangGraph, Streamlit, OpenAI/Nebius, Hugging Face Datasets
app.py CHANGED
@@ -2,14 +2,14 @@ import json
2
  import os
3
  import uuid
4
  from datetime import datetime
5
- from typing import Dict, List, Optional
6
 
7
  import pandas as pd
8
  import streamlit as st
9
  from datasets import load_dataset
10
  from dotenv import load_dotenv
11
 
12
- from langgraph_agent import DataAnalystAgent, DatasetManager
13
 
14
  # Load environment variables
15
  load_dotenv()
 
2
  import os
3
  import uuid
4
  from datetime import datetime
5
+ from typing import Dict
6
 
7
  import pandas as pd
8
  import streamlit as st
9
  from datasets import load_dataset
10
  from dotenv import load_dotenv
11
 
12
+ from langgraph_agent import DataAnalystAgent
13
 
14
  # Load environment variables
15
  load_dotenv()