File size: 9,954 Bytes
6bcc4cb 3e090a6 6bcc4cb 3e090a6 6bcc4cb 3e090a6 b2706cf 3e090a6 b2706cf 3e090a6 b2706cf 3e090a6 b2706cf 3e090a6 b2706cf 3e090a6 b2706cf 3e090a6 b2706cf 3e090a6 b2706cf 3e090a6 b2706cf 3e090a6 b2706cf 3e090a6 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 |
---
title: LangGraph Data Analyst Agent
emoji: π€
colorFrom: blue
colorTo: purple
sdk: streamlit
sdk_version: "1.28.0"
app_file: app.py
pinned: false
license: mit
---
# π€ LangGraph Data Analyst Agent
An intelligent data analyst agent built with LangGraph that analyzes customer support conversations with advanced memory, conversation persistence, and query recommendations.
## π Features
### Core Functionality
- **Multi-Agent Architecture**: Separate specialized agents for structured and unstructured queries
- **Query Classification**: Automatic routing to appropriate agent based on query type
- **Rich Tool Set**: Comprehensive tools for data analysis and insights
### Advanced Memory & Persistence
- **Session Management**: Persistent conversations across page reloads and browser sessions
- **User Profile Tracking**: Agent learns and remembers user interests and preferences
- **Conversation History**: Full context retention using LangGraph checkpointers
- **Cross-Session Continuity**: Resume conversations using session IDs
### Intelligent Recommendations
- **Query Suggestions**: AI-powered recommendations based on conversation history
- **Interactive Refinement**: Collaborative query building with the agent
- **Context-Aware**: Suggestions based on user profile and previous interactions
## ποΈ Architecture
The agent uses LangGraph's multi-agent architecture with the following components:
```
User Query β Classifier β [Structured Agent | Unstructured Agent | Recommender] β Summarizer β Response
β
Tool Nodes (Dataset Analysis Tools)
```
### Agent Types
1. **Structured Agent**: Handles quantitative queries (statistics, examples, distributions)
2. **Unstructured Agent**: Handles qualitative queries (summaries, insights, patterns)
3. **Query Recommender**: Suggests follow-up questions based on context
4. **Summarizer**: Updates user profile and conversation memory
## π Setup Instructions
### Prerequisites
- **Python Version**: 3.9 or higher
- **API Key**: OpenAI API key or Nebius API key
- **For Hugging Face Spaces**: Ensure your API key is set as a Space secret
### Installation
1. **Clone the repository**:
```bash
git clone <repository-url>
cd Agents
```
2. **Install dependencies**:
```bash
pip install -r requirements.txt
```
3. **Configure API Key**:
Create a `.env` file in the project root:
```bash
# For OpenAI (recommended)
OPENAI_API_KEY=your_openai_api_key_here
# OR for Nebius
NEBIUS_API_KEY=your_nebius_api_key_here
```
4. **Run the application**:
```bash
streamlit run app.py
```
5. **Access the app**:
Open your browser to `http://localhost:8501`
### Alternative Deployment
#### For Hugging Face Spaces:
1. **Fork or upload this repository to Hugging Face Spaces**
2. **Set your API key as a Space secret:**
- Go to your Space settings
- Navigate to "Variables and secrets"
- Add a secret named `NEBIUS_API_KEY` or `OPENAI_API_KEY`
- Enter your API key as the value
3. **The app will start automatically**
#### For other cloud deployment:
```bash
export OPENAI_API_KEY=your_api_key_here
# OR
export NEBIUS_API_KEY=your_api_key_here
```
## π― Usage Guide
### Query Types
#### Structured Queries (Quantitative Analysis)
- "How many records are in each category?"
- "What are the most common customer issues?"
- "Show me 5 examples of billing problems"
- "Get distribution of intents"
#### Unstructured Queries (Qualitative Analysis)
- "Summarize the refund category"
- "What patterns do you see in payment issues?"
- "Analyze customer sentiment in billing conversations"
- "What insights can you provide about technical support?"
#### Memory & Recommendations
- "What do you remember about me?"
- "What should I query next?"
- "Advise me what to explore"
- "Recommend follow-up questions"
### Session Management
#### Creating Sessions
- **New Session**: Click "π New Session" to start fresh
- **Auto-Generated**: Each new browser session gets a unique ID
#### Resuming Sessions
1. Copy your session ID from the sidebar (e.g., `a1b2c3d4...`)
2. Enter the full session ID in "Join Existing Session"
3. Click "π Join Session" to resume
#### Cross-Tab Persistence
- Open multiple tabs with the same session ID
- Conversations sync across all tabs
- Memory and user profile persist
## π§ Memory System
### User Profile Tracking
The agent automatically tracks:
- **Interests**: Topics and categories you frequently ask about
- **Expertise Level**: Inferred from question complexity (beginner/intermediate/advanced)
- **Preferences**: Analysis style preferences (quantitative vs qualitative)
- **Query History**: Recent questions for context
### Conversation Persistence
- **Thread-based**: Each session has a unique thread ID
- **Checkpoint System**: LangGraph automatically saves state after each interaction
- **Cross-Session**: Resume conversations days or weeks later
### Memory Queries
Ask the agent what it remembers:
```
"What do you remember about me?"
"What are my interests?"
"What have I asked about before?"
```
## π§ Testing the Agent
### Basic Functionality Tests
1. **Classification Test**:
```
Query: "How many categories are there?"
Expected: Routes to Structured Agent β Uses get_dataset_stats tool
```
2. **Follow-up Memory Test**:
```
Query 1: "Show me billing examples"
Query 2: "Show me more examples"
Expected: Agent remembers previous context about billing
```
3. **User Profile Test**:
```
Query 1: "I'm interested in refund patterns"
Query 2: "What do you remember about me?"
Expected: Agent mentions interest in refunds
```
4. **Recommendation Test**:
```
Query: "What should I query next?"
Expected: Personalized suggestions based on history
```
### Advanced Feature Tests
1. **Session Persistence**:
- Ask a question, reload the page
- Verify conversation history remains
- Verify user profile persists
2. **Cross-Session Memory**:
- Note your session ID
- Close browser completely
- Reopen and join the same session
- Verify full conversation and profile restoration
3. **Interactive Recommendations**:
```
User: "Advise me what to query next"
Agent: "Based on your interest in billing, you might want to analyze refund patterns."
User: "I'd rather see examples instead"
Agent: "Then I suggest showing 5 examples of refund requests."
User: "Please do so"
Expected: Agent executes the refined query
```
## π File Structure
```
Agents/
βββ README.md # This file
βββ requirements.txt # Python dependencies
βββ .env # API keys (create this)
βββ app.py # LangGraph Streamlit app
βββ langgraph_agent.py # LangGraph agent implementation
βββ agent-memory.ipynb # Memory example notebook
βββ test_agent.py # Test suite
βββ DEPLOYMENT_GUIDE.md # Original deployment guide
```
## π οΈ Technical Implementation
### LangGraph Components
**State Management**:
```python
class AgentState(TypedDict):
messages: List[Any]
query_type: Optional[str]
user_profile: Optional[Dict[str, Any]]
session_context: Optional[Dict[str, Any]]
```
**Tool Categories**:
- **Structured Tools**: Statistics, distributions, examples, search
- **Unstructured Tools**: Summaries, insights, pattern analysis
- **Memory Tools**: Profile updates, preference tracking
**Graph Flow**:
1. **Classifier**: Determines query type
2. **Agent Selection**: Routes to appropriate specialist
3. **Tool Execution**: Dynamic tool usage based on needs
4. **Memory Update**: Profile and context updates
5. **Response Generation**: Final answer with memory integration
### Memory Architecture
**Checkpointer**: LangGraph's `MemorySaver` for conversation persistence
**Thread Management**: Unique thread IDs for session isolation
**Profile Synthesis**: LLM-powered extraction of user characteristics
**Context Retention**: Full conversation history with temporal awareness
## π Troubleshooting
### Common Issues
1. **API Key Errors**:
- Verify `.env` file exists and has correct key
- Check environment variable is set in deployment
- Ensure API key has sufficient credits
2. **Memory Not Persisting**:
- Verify session ID remains consistent
- Check browser localStorage not being cleared
- Ensure thread_id parameter is passed correctly
3. **Dataset Loading Issues**:
- Check internet connection for Hugging Face datasets
- Verify datasets library is installed
- Try clearing Streamlit cache: `streamlit cache clear`
4. **Tool Execution Errors**:
- Verify all dependencies in requirements.txt are installed
- Check dataset is properly loaded
- Review error messages in Streamlit interface
### Debug Mode
Enable debug logging by setting:
```python
import logging
logging.basicConfig(level=logging.DEBUG)
```
## π Learning Objectives
This implementation demonstrates:
1. **LangGraph Multi-Agent Systems**: Specialized agents for different query types
2. **Memory & Persistence**: Conversation continuity across sessions
3. **Tool Integration**: Dynamic tool selection and execution
4. **State Management**: Complex state updates and routing
5. **User Experience**: Session management and interactive features
## π Future Enhancements
Potential improvements:
- **Database Persistence**: Replace MemorySaver with PostgreSQL checkpointer
- **Advanced Analytics**: More sophisticated data analysis tools
- **Export Features**: PDF/CSV report generation
- **User Authentication**: Multi-user support with profiles
- **Real-time Collaboration**: Shared sessions between users
## π License
This project is for educational purposes as part of a data science curriculum.
## π€ Contributing
This is an assignment project. For questions or issues, please contact the course instructors.
---
**Built with**: LangGraph, Streamlit, OpenAI/Nebius, Hugging Face Datasets |