SaritMeshesha committed on
Commit 186205a
verified · 1 Parent(s): 6bcc4cb

Upload 2 files

Files changed (2)
  1. README.md +27 -309
  2. debug_app.py +187 -0
README.md CHANGED
@@ -1,328 +1,46 @@
 ---
- title: LangGraph Data Analyst Agent
- emoji: 🤖
 colorFrom: blue
 colorTo: purple
 sdk: streamlit
 sdk_version: "1.28.0"
- app_file: app.py
 pinned: false
 license: mit
 ---
-
- # 🤖 LangGraph Data Analyst Agent
-
- An intelligent data analyst agent built with LangGraph that analyzes customer support conversations with advanced memory, conversation persistence, and query recommendations.
-
- ## 🌟 Features
-
- ### Core Functionality
- - **Multi-Agent Architecture**: Separate specialized agents for structured and unstructured queries
- - **Query Classification**: Automatic routing to the appropriate agent based on query type
- - **Rich Tool Set**: Comprehensive tools for data analysis and insights
-
- ### Advanced Memory & Persistence
- - **Session Management**: Persistent conversations across page reloads and browser sessions
- - **User Profile Tracking**: The agent learns and remembers user interests and preferences
- - **Conversation History**: Full context retention using LangGraph checkpointers
- - **Cross-Session Continuity**: Resume conversations using session IDs
-
- ### Intelligent Recommendations
- - **Query Suggestions**: AI-powered recommendations based on conversation history
- - **Interactive Refinement**: Collaborative query building with the agent
- - **Context-Aware**: Suggestions based on the user profile and previous interactions
- ## 🏗️ Architecture
-
- The agent uses LangGraph's multi-agent architecture with the following components:
-
- ```
- User Query → Classifier → [Structured Agent | Unstructured Agent | Recommender] → Summarizer → Response
-                                ↓
-                 Tool Nodes (Dataset Analysis Tools)
- ```
-
- ### Agent Types
- 1. **Structured Agent**: Handles quantitative queries (statistics, examples, distributions)
- 2. **Unstructured Agent**: Handles qualitative queries (summaries, insights, patterns)
- 3. **Query Recommender**: Suggests follow-up questions based on context
- 4. **Summarizer**: Updates the user profile and conversation memory
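The routing described above can be illustrated with a small, self-contained sketch. The keyword lists and function names below are illustrative stand-ins, not the repository's actual implementation (which classifies with an LLM inside a LangGraph graph):

```python
from typing import Callable, Dict

# Hypothetical stand-ins for the specialist agents; the real project
# wires these up as LangGraph nodes with tool access.
def structured_agent(query: str) -> str:
    return f"[structured] {query}"

def unstructured_agent(query: str) -> str:
    return f"[unstructured] {query}"

def recommender(query: str) -> str:
    return f"[recommender] {query}"

# Toy heuristics standing in for the LLM classifier.
STRUCTURED_HINTS = ("how many", "count", "examples", "distribution")
RECOMMEND_HINTS = ("recommend", "what should i", "advise")

def classify(query: str) -> str:
    q = query.lower()
    if any(h in q for h in RECOMMEND_HINTS):
        return "recommender"
    if any(h in q for h in STRUCTURED_HINTS):
        return "structured"
    return "unstructured"

ROUTES: Dict[str, Callable[[str], str]] = {
    "structured": structured_agent,
    "unstructured": unstructured_agent,
    "recommender": recommender,
}

def route(query: str) -> str:
    # Classifier picks the specialist; the specialist produces the answer.
    return ROUTES[classify(query)](query)
```

The same shape (classify, then dispatch via a conditional edge) is what the graph diagram above expresses.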
- ## 🚀 Setup Instructions
-
- ### Prerequisites
- - **Python Version**: 3.9 or higher
- - **API Key**: OpenAI API key or Nebius API key
- - **For Hugging Face Spaces**: Ensure your API key is set as a Space secret
-
- ### Installation
-
- 1. **Clone the repository**:
-    ```bash
-    git clone <repository-url>
-    cd Agents
-    ```
-
- 2. **Install dependencies**:
-    ```bash
-    pip install -r requirements.txt
-    ```
-
- 3. **Configure the API key**:
-
-    Create a `.env` file in the project root:
-    ```bash
-    # For OpenAI (recommended)
-    OPENAI_API_KEY=your_openai_api_key_here
-
-    # OR for Nebius
-    NEBIUS_API_KEY=your_nebius_api_key_here
-    ```
-
- 4. **Run the application**:
-    ```bash
-    streamlit run app.py
-    ```
-
- 5. **Access the app**:
-    Open your browser to `http://localhost:8501`
-
- ### Alternative Deployment
-
- #### For Hugging Face Spaces:
- 1. **Fork or upload this repository to Hugging Face Spaces**
- 2. **Set your API key as a Space secret:**
-    - Go to your Space settings
-    - Navigate to "Variables and secrets"
-    - Add a secret named `NEBIUS_API_KEY` or `OPENAI_API_KEY`
-    - Enter your API key as the value
- 3. **The app will start automatically**
-
- #### For other cloud deployments:
- ```bash
- export OPENAI_API_KEY=your_api_key_here
- # OR
- export NEBIUS_API_KEY=your_api_key_here
- ```
-
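Whichever way the key is provided, the app only needs it in the process environment. The lookup can be sketched as follows; `resolve_api_key` is a hypothetical helper for illustration, not a function from the repository:

```python
import os

def resolve_api_key() -> str:
    """Prefer OpenAI, then fall back to Nebius, mirroring the setup steps above."""
    key = os.environ.get("OPENAI_API_KEY") or os.environ.get("NEBIUS_API_KEY")
    if not key:
        # Fail loudly at startup rather than hanging later on an API call.
        raise RuntimeError("No API key found: set OPENAI_API_KEY or NEBIUS_API_KEY")
    return key
```

Both a local `.env` file (via `python-dotenv`) and a Space secret end up as environment variables, so the same lookup covers every deployment path.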
- ## 🎯 Usage Guide
-
- ### Query Types
-
- #### Structured Queries (Quantitative Analysis)
- - "How many records are in each category?"
- - "What are the most common customer issues?"
- - "Show me 5 examples of billing problems"
- - "Get the distribution of intents"
-
- #### Unstructured Queries (Qualitative Analysis)
- - "Summarize the refund category"
- - "What patterns do you see in payment issues?"
- - "Analyze customer sentiment in billing conversations"
- - "What insights can you provide about technical support?"
-
- #### Memory & Recommendations
- - "What do you remember about me?"
- - "What should I query next?"
- - "Advise me what to explore"
- - "Recommend follow-up questions"
-
- ### Session Management
-
- #### Creating Sessions
- - **New Session**: Click "🆕 New Session" to start fresh
- - **Auto-Generated**: Each new browser session gets a unique ID
-
- #### Resuming Sessions
- 1. Copy your session ID from the sidebar (e.g., `a1b2c3d4...`)
- 2. Enter the full session ID in "Join Existing Session"
- 3. Click "🔗 Join Session" to resume
-
- #### Cross-Tab Persistence
- - Open multiple tabs with the same session ID
- - Conversations sync across all tabs
- - Memory and the user profile persist
-
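From the user's point of view, session handling behaves like the following sketch. The `store` dict stands in for LangGraph's checkpointer, and the function names are illustrative, not the app's actual API:

```python
import uuid

def new_session_id() -> str:
    """Generate the kind of opaque hex ID shown in the sidebar (e.g. 'a1b2c3d4...')."""
    return uuid.uuid4().hex

def join_session(store: dict, session_id: str) -> dict:
    """Return the state for session_id, creating a fresh one if it is unknown.

    Two tabs passing the same session_id get the same state object, which is
    why conversations appear to sync across tabs.
    """
    return store.setdefault(session_id, {"messages": [], "profile": {}})
```

Joining an existing session is therefore just a lookup by ID; nothing about the browser itself is stored server-side.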
- ## 🧠 Memory System
-
- ### User Profile Tracking
- The agent automatically tracks:
- - **Interests**: Topics and categories you frequently ask about
- - **Expertise Level**: Inferred from question complexity (beginner/intermediate/advanced)
- - **Preferences**: Analysis style preferences (quantitative vs. qualitative)
- - **Query History**: Recent questions for context
-
- ### Conversation Persistence
- - **Thread-based**: Each session has a unique thread ID
- - **Checkpoint System**: LangGraph automatically saves state after each interaction
- - **Cross-Session**: Resume conversations days or weeks later
-
- ### Memory Queries
- Ask the agent what it remembers:
- ```
- "What do you remember about me?"
- "What are my interests?"
- "What have I asked about before?"
- ```
-
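A toy version of the interest tracking might look like the sketch below. The real agent synthesizes the profile with an LLM; the keyword tally here is only an illustration of what "tracking interests" means mechanically:

```python
def update_profile(profile: dict, query: str) -> dict:
    """Record a query and tally coarse interests by keyword (illustrative only)."""
    topics = ("billing", "refund", "payment", "delivery")  # assumed example topics
    q = query.lower()
    profile.setdefault("history", []).append(query)
    interests = profile.setdefault("interests", {})
    for topic in topics:
        if topic in q:
            interests[topic] = interests.get(topic, 0) + 1
    return profile
```

Answering "What do you remember about me?" then amounts to rendering this accumulated profile back into prose.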
- ## 🔧 Testing the Agent
-
- ### Basic Functionality Tests
-
- 1. **Classification Test**:
-    ```
-    Query: "How many categories are there?"
-    Expected: Routes to Structured Agent → Uses get_dataset_stats tool
-    ```
-
- 2. **Follow-up Memory Test**:
-    ```
-    Query 1: "Show me billing examples"
-    Query 2: "Show me more examples"
-    Expected: Agent remembers the previous context about billing
-    ```
-
- 3. **User Profile Test**:
-    ```
-    Query 1: "I'm interested in refund patterns"
-    Query 2: "What do you remember about me?"
-    Expected: Agent mentions the interest in refunds
-    ```
-
- 4. **Recommendation Test**:
-    ```
-    Query: "What should I query next?"
-    Expected: Personalized suggestions based on history
-    ```
-
- ### Advanced Feature Tests
-
- 1. **Session Persistence**:
-    - Ask a question, then reload the page
-    - Verify the conversation history remains
-    - Verify the user profile persists
-
- 2. **Cross-Session Memory**:
-    - Note your session ID
-    - Close the browser completely
-    - Reopen and join the same session
-    - Verify full conversation and profile restoration
-
- 3. **Interactive Recommendations**:
-    ```
-    User: "Advise me what to query next"
-    Agent: "Based on your interest in billing, you might want to analyze refund patterns."
-    User: "I'd rather see examples instead"
-    Agent: "Then I suggest showing 5 examples of refund requests."
-    User: "Please do so"
-    Expected: Agent executes the refined query
-    ```
-
- ## 📁 File Structure
-
- ```
- Agents/
- ├── README.md              # This file
- ├── requirements.txt       # Python dependencies
- ├── .env                   # API keys (create this)
- ├── app.py                 # LangGraph Streamlit app
- ├── langgraph_agent.py     # LangGraph agent implementation
- ├── agent-memory.ipynb     # Memory example notebook
- ├── test_agent.py          # Test suite
- └── DEPLOYMENT_GUIDE.md    # Original deployment guide
- ```
-
- ## 🛠️ Technical Implementation
-
- ### LangGraph Components
-
- **State Management**:
- ```python
- class AgentState(TypedDict):
-     messages: List[Any]
-     query_type: Optional[str]
-     user_profile: Optional[Dict[str, Any]]
-     session_context: Optional[Dict[str, Any]]
- ```
-
- **Tool Categories**:
- - **Structured Tools**: Statistics, distributions, examples, search
- - **Unstructured Tools**: Summaries, insights, pattern analysis
- - **Memory Tools**: Profile updates, preference tracking
-
- **Graph Flow**:
- 1. **Classifier**: Determines the query type
- 2. **Agent Selection**: Routes to the appropriate specialist
- 3. **Tool Execution**: Dynamic tool usage based on needs
- 4. **Memory Update**: Profile and context updates
- 5. **Response Generation**: Final answer with memory integration
-
- ### Memory Architecture
-
- - **Checkpointer**: LangGraph's `MemorySaver` for conversation persistence
- - **Thread Management**: Unique thread IDs for session isolation
- - **Profile Synthesis**: LLM-powered extraction of user characteristics
- - **Context Retention**: Full conversation history with temporal awareness
-
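Thread-isolated checkpointing can be pictured with a toy stand-in. This is not LangGraph's `MemorySaver`, just a sketch of the idea it implements: one checkpoint list per thread ID, with the latest checkpoint restored on resume:

```python
from copy import deepcopy

class ToyCheckpointer:
    """Toy stand-in for a checkpointer: one checkpoint list per thread ID."""

    def __init__(self) -> None:
        self._threads: dict = {}

    def save(self, thread_id: str, state: dict) -> None:
        # Deep-copy so later mutations of the live state don't rewrite history.
        self._threads.setdefault(thread_id, []).append(deepcopy(state))

    def latest(self, thread_id: str):
        checkpoints = self._threads.get(thread_id)
        return checkpoints[-1] if checkpoints else None
```

Because each thread keeps its own list, two sessions with different thread IDs can never see each other's conversation state.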
- ## 🔍 Troubleshooting
-
- ### Common Issues
-
- 1. **API Key Errors**:
-    - Verify the `.env` file exists and contains the correct key
-    - Check that the environment variable is set in your deployment
-    - Ensure the API key has sufficient credits
-
- 2. **Memory Not Persisting**:
-    - Verify the session ID remains consistent
-    - Check that browser localStorage is not being cleared
-    - Ensure the `thread_id` parameter is passed correctly
-
- 3. **Dataset Loading Issues**:
-    - Check the internet connection for Hugging Face datasets
-    - Verify the `datasets` library is installed
-    - Try clearing the Streamlit cache: `streamlit cache clear`
-
- 4. **Tool Execution Errors**:
-    - Verify all dependencies in `requirements.txt` are installed
-    - Check the dataset is properly loaded
-    - Review error messages in the Streamlit interface
-
- ### Debug Mode
-
- Enable debug logging by setting:
- ```python
- import logging
- logging.basicConfig(level=logging.DEBUG)
- ```
-
- ## 🎓 Learning Objectives
-
- This implementation demonstrates:
-
- 1. **LangGraph Multi-Agent Systems**: Specialized agents for different query types
- 2. **Memory & Persistence**: Conversation continuity across sessions
- 3. **Tool Integration**: Dynamic tool selection and execution
- 4. **State Management**: Complex state updates and routing
- 5. **User Experience**: Session management and interactive features
-
- ## 🚀 Future Enhancements
-
- Potential improvements:
- - **Database Persistence**: Replace MemorySaver with a PostgreSQL checkpointer
- - **Advanced Analytics**: More sophisticated data analysis tools
- - **Export Features**: PDF/CSV report generation
- - **User Authentication**: Multi-user support with profiles
- - **Real-time Collaboration**: Shared sessions between users
-
- ## 📄 License
-
- This project is for educational purposes as part of a data science curriculum.
-
- ## 🤝 Contributing
-
- This is an assignment project. For questions or issues, please contact the course instructors.
-
- ---
-
- **Built with**: LangGraph, Streamlit, OpenAI/Nebius, Hugging Face Datasets
 
 ---
+ title: LangGraph Data Analyst Agent (Debug)
+ emoji: 🔧
 colorFrom: blue
 colorTo: purple
 sdk: streamlit
 sdk_version: "1.28.0"
+ app_file: debug_app.py
 pinned: false
 license: mit
 ---
+
+ # 🔧 LangGraph Data Analyst Agent - Debug Mode
+
+ **Temporary debug version to diagnose deployment issues**
+
+ This debug tool will help identify what is causing the "thinking" hang in your deployment.
+
+ ## 🚀 Quick Steps:
+
+ 1. **Upload `debug_app.py` to your Space**
+ 2. **Replace your README.md with this version**
+ 3. **Wait for the Space to restart**
+ 4. **Run the debug tests**
+ 5. **Check the results and error messages**
+
+ ## 🔍 What This Debug Tool Checks:
+
+ - ✅ Python environment and packages
+ - ✅ API key configuration
+ - ✅ LangGraph agent import
+ - ✅ Dataset loading
+ - ✅ Simple agent test
+ - ✅ Error details and stack traces
+
+ ## 📋 Expected Results:
+
+ The debug tool will show you exactly where the problem is:
+ - Import errors
+ - API key issues
+ - Network connectivity problems
+ - LangGraph workflow errors
+
+ ## 🔧 After Debugging:
+
+ Once you identify the issue, switch back to the main app by updating README.md to use `app_file: app.py`.
debug_app.py ADDED
@@ -0,0 +1,187 @@
+ import json
+ import os
+ import traceback
+ import uuid
+ from datetime import datetime
+ from typing import Dict
+
+ import pandas as pd
+ import streamlit as st
+ from datasets import load_dataset
+ from dotenv import load_dotenv
+
+ # Only import if the module is available
+ try:
+     from langgraph_agent import DataAnalystAgent
+
+     AGENT_AVAILABLE = True
+ except ImportError as e:
+     AGENT_AVAILABLE = False
+     IMPORT_ERROR = str(e)
+
+ # Load environment variables
+ load_dotenv()
+
+ # Set up page config
+ st.set_page_config(
+     page_title="🤖 LangGraph Data Analyst Agent (Debug)",
+     layout="wide",
+     page_icon="🤖",
+     initial_sidebar_state="expanded",
+ )
+
+
+ def check_environment():
+     """Check the deployment environment and dependencies."""
+     st.markdown("## 🔍 Environment Debug Info")
+
+     # Check Python version
+     import sys
+
+     st.write(f"**Python Version:** {sys.version}")
+
+     # Check if running on Hugging Face
+     is_hf_space = os.environ.get("SPACE_ID") is not None
+     st.write(f"**Running on Hugging Face Spaces:** {is_hf_space}")
+     if is_hf_space:
+         st.write(f"**Space ID:** {os.environ.get('SPACE_ID', 'Unknown')}")
+
+     # Check API key availability
+     nebius_key = os.environ.get("NEBIUS_API_KEY")
+     openai_key = os.environ.get("OPENAI_API_KEY")
+     st.write(f"**Nebius API Key Available:** {'Yes' if nebius_key else 'No'}")
+     st.write(f"**OpenAI API Key Available:** {'Yes' if openai_key else 'No'}")
+
+     if nebius_key:
+         st.write(f"**Nebius Key Length:** {len(nebius_key)} characters")
+     if openai_key:
+         st.write(f"**OpenAI Key Length:** {len(openai_key)} characters")
+
+     # Check agent import
+     st.write(
+         f"**LangGraph Agent Import:** {'✅ Success' if AGENT_AVAILABLE else '❌ Failed'}"
+     )
+     if not AGENT_AVAILABLE:
+         st.error(f"Import Error: {IMPORT_ERROR}")
+
+     # Check required packages
+     required_packages = [
+         "langchain",
+         "langchain_core",
+         "langchain_openai",
+         "langgraph",
+         "datasets",
+         "pandas",
+     ]
+
+     st.markdown("### 📦 Package Availability")
+     for package in required_packages:
+         try:
+             __import__(package)
+             st.write(f"✅ {package}")
+         except ImportError as e:
+             st.write(f"❌ {package} - {str(e)}")
+
+
+ def test_simple_agent():
+     """Test basic agent functionality."""
+     if not AGENT_AVAILABLE:
+         st.error("Cannot test agent - import failed")
+         return
+
+     st.markdown("## 🧪 Agent Test")
+
+     # Get API key
+     api_key = os.environ.get("NEBIUS_API_KEY") or os.environ.get("OPENAI_API_KEY")
+     if not api_key:
+         st.error("No API key found!")
+         return
+
+     st.write("**API Key:** ✅ Available")
+
+     # Test agent creation
+     try:
+         st.write("**Creating Agent...**")
+         agent = DataAnalystAgent(api_key=api_key)
+         st.write("✅ Agent created successfully")
+
+         # Test simple query
+         if st.button("🧪 Test Simple Query"):
+             with st.spinner("Testing agent with simple query..."):
+                 try:
+                     result = agent.invoke("Hello, are you working?", "debug_test")
+                     st.success("✅ Agent responded successfully!")
+
+                     st.markdown("**Response Messages:**")
+                     for i, msg in enumerate(result.get("messages", [])):
+                         st.write(
+                             f"{i+1}. {type(msg).__name__}: {getattr(msg, 'content', 'No content')[:100]}..."
+                         )
+
+                 except Exception as e:
+                     st.error(f"❌ Agent test failed: {str(e)}")
+                     st.code(traceback.format_exc())
+
+     except Exception as e:
+         st.error(f"❌ Agent creation failed: {str(e)}")
+         st.code(traceback.format_exc())
+
+
+ def test_dataset_loading():
+     """Test dataset loading."""
+     st.markdown("## 📊 Dataset Test")
+
+     try:
+         with st.spinner("Loading dataset..."):
+             dataset = load_dataset(
+                 "bitext/Bitext-customer-support-llm-chatbot-training-dataset"
+             )
+             df = pd.DataFrame(dataset["train"])
+             st.success(f"✅ Dataset loaded: {len(df):,} records")
+             st.dataframe(df.head(3))
+     except Exception as e:
+         st.error(f"❌ Dataset loading failed: {str(e)}")
+         st.code(traceback.format_exc())
+
+
+ def main():
+     st.title("🔧 LangGraph Agent Debug Tool")
+     st.markdown("This tool helps diagnose issues with the LangGraph agent deployment.")
+
+     # Environment check
+     check_environment()
+
+     st.markdown("---")
+
+     # Dataset test
+     test_dataset_loading()
+
+     st.markdown("---")
+
+     # Agent test
+     test_simple_agent()
+
+     st.markdown("---")
+
+     st.markdown("## 💡 Common Solutions")
+     st.markdown(
+         """
+     **If agent creation fails:**
+     - Check the API key is correctly set as a Space secret
+     - Verify all dependencies are in requirements.txt
+     - Check for import errors above
+
+     **If the agent hangs on 'thinking':**
+     - The API key might be invalid or expired
+     - Network connectivity issues to the API endpoint
+     - Unhandled exceptions in the LangGraph workflow
+
+     **If dataset loading fails:**
+     - Network connectivity issues
+     - Hugging Face datasets library not properly installed
+     """
+     )
+
+
+ if __name__ == "__main__":
+     main()