Spaces: SaritMeshesha/langraph-llm-data-analyst-agent

SaritMeshesha committed on Jul 12

Commit 3e090a6 (verified) · 1 Parent(s): 186205a

Upload 2 files

Files changed (2)

README.md +309 -27
app.py +2 -2

README.md CHANGED

@@ -1,46 +1,328 @@
 ---
-title: LangGraph Data Analyst Agent (Debug)
-emoji: 🔧
 colorFrom: blue
 colorTo: purple
 sdk: streamlit
 sdk_version: "1.28.0"
-app_file: debug_app.py
 pinned: false
 license: mit
 ---
-# 🔧 LangGraph Data Analyst Agent - Debug Mode
-**Temporary debug version to diagnose deployment issues**
-This debug tool will help identify what's causing the "thinking" hang in your deployment.
-## 🚀 Quick Steps:
-1. **Upload `debug_app.py` to your Space**
-2. **Replace your README.md with this version**
-3. **Wait for Space to restart**
-4. **Run the debug tests**
-5. **Check the results and error messages**
-## 🔍 What This Debug Tool Checks:
-- ✅ Python environment and packages
-- ✅ API key configuration
-- ✅ LangGraph agent import
-- ✅ Dataset loading
-- ✅ Simple agent test
-- ✅ Error details and stack traces
-## 📋 Expected Results:
-The debug tool will show you exactly where the problem is:
-- Import errors
-- API key issues
-- Network connectivity problems
-- LangGraph workflow errors
-## 🔧 After Debugging:
-Once you identify the issue, switch back to the main app by updating README.md to use `app_file: app.py`

 ---
+title: LangGraph Data Analyst Agent
+emoji: 🤖
 colorFrom: blue
 colorTo: purple
 sdk: streamlit
 sdk_version: "1.28.0"
+app_file: app.py
 pinned: false
 license: mit
 ---
+# 🤖 LangGraph Data Analyst Agent
+A data analyst agent built with LangGraph that analyzes customer support conversations and adds conversation memory, session persistence, and query recommendations.
+## 🌟 Features
+### Core Functionality
+- **Multi-Agent Architecture**: Separate specialized agents for structured and unstructured queries
+- **Query Classification**: Automatic routing to appropriate agent based on query type
+- **Rich Tool Set**: Comprehensive tools for data analysis and insights
+### Advanced Memory & Persistence
+- **Session Management**: Persistent conversations across page reloads and browser sessions
+- **User Profile Tracking**: Agent learns and remembers user interests and preferences
+- **Conversation History**: Full context retention using LangGraph checkpointers
+- **Cross-Session Continuity**: Resume conversations using session IDs
+### Intelligent Recommendations
+- **Query Suggestions**: AI-powered recommendations based on conversation history
+- **Interactive Refinement**: Collaborative query building with the agent
+- **Context-Aware**: Suggestions based on user profile and previous interactions
+## 🏗️ Architecture
+The agent uses LangGraph's multi-agent architecture with the following components:
+```
+User Query → Classifier → [Structured Agent | Unstructured Agent | Recommender] → Summarizer → Response
+                ↓
+            Tool Nodes (Dataset Analysis Tools)
+```
+### Agent Types
+1. **Structured Agent**: Handles quantitative queries (statistics, examples, distributions)
+2. **Unstructured Agent**: Handles qualitative queries (summaries, insights, patterns)
+3. **Query Recommender**: Suggests follow-up questions based on context
+4. **Summarizer**: Updates user profile and conversation memory
+## 🚀 Setup Instructions
+### Prerequisites
+- **Python Version**: 3.9 or higher
+- **API Key**: OpenAI API key or Nebius API key
+- **For Hugging Face Spaces**: Ensure your API key is set as a Space secret
+### Installation
+1. **Clone the repository**:
+```bash
+git clone <repository-url>
+cd Agents
+```
+2. **Install dependencies**:
+```bash
+pip install -r requirements.txt
+```
+3. **Configure API Key**:
+Create a `.env` file in the project root:
+```bash
+# For OpenAI (recommended)
+OPENAI_API_KEY=your_openai_api_key_here
+# OR for Nebius
+NEBIUS_API_KEY=your_nebius_api_key_here
+```
+4. **Run the application**:
+```bash
+streamlit run app.py
+```
+5. **Access the app**:
+Open your browser to `http://localhost:8501`
+### Alternative Deployment
+#### For Hugging Face Spaces:
+1. **Fork or upload this repository to Hugging Face Spaces**
+2. **Set your API key as a Space secret:**
+   - Go to your Space settings
+   - Navigate to "Variables and secrets"
+   - Add a secret named `NEBIUS_API_KEY` or `OPENAI_API_KEY`
+   - Enter your API key as the value
+3. **The app will start automatically**
+#### For other cloud deployment:
+```bash
+export OPENAI_API_KEY=your_api_key_here
+# OR
+export NEBIUS_API_KEY=your_api_key_here
+```
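A minimal sketch of how an app might pick up either key at startup. The variable names come from the setup steps above; the precedence (OpenAI first when both are set) is an assumption, and `resolve_api_key` is a hypothetical helper, not a function taken from `app.py`.

```python
import os
from typing import Mapping, Optional, Tuple

# Variable names come from the README; the precedence order is an assumption.
KEY_VARS = ("OPENAI_API_KEY", "NEBIUS_API_KEY")


def resolve_api_key(env: Optional[Mapping[str, str]] = None) -> Tuple[str, str]:
    """Return (variable_name, key) for the first non-empty key found."""
    env = os.environ if env is None else env
    for name in KEY_VARS:
        value = env.get(name, "").strip()
        if value:
            return name, value
    # Fail early with a clear message instead of hanging on the first LLM call.
    raise RuntimeError("Set one of: " + ", ".join(KEY_VARS))
```

Calling `resolve_api_key()` after `load_dotenv()` would cover both the `.env` file and exported variables, since both end up in `os.environ`.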
+## 🎯 Usage Guide
+### Query Types
+#### Structured Queries (Quantitative Analysis)
+- "How many records are in each category?"
+- "What are the most common customer issues?"
+- "Show me 5 examples of billing problems"
+- "Get distribution of intents"
+#### Unstructured Queries (Qualitative Analysis)
+- "Summarize the refund category"
+- "What patterns do you see in payment issues?"
+- "Analyze customer sentiment in billing conversations"
+- "What insights can you provide about technical support?"
+#### Memory & Recommendations
+- "What do you remember about me?"
+- "What should I query next?"
+- "Advise me what to explore"
+- "Recommend follow-up questions"
+### Session Management
+#### Creating Sessions
+- **New Session**: Click "🆕 New Session" to start fresh
+- **Auto-Generated**: Each new browser session gets a unique ID
+#### Resuming Sessions
+1. Copy your session ID from the sidebar (e.g., `a1b2c3d4...`)
+2. Enter the full session ID in "Join Existing Session"
+3. Click "🔗 Join Session" to resume
+#### Cross-Tab Persistence
+- Open multiple tabs with the same session ID
+- Conversations sync across all tabs
+- Memory and user profile persist
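The session workflow above can be sketched as follows, assuming session IDs are UUID4 strings (`app.py` imports `uuid`, but the exact ID format is not shown in this diff). Both helper names are hypothetical.

```python
import uuid


def new_session_id() -> str:
    """Generate a fresh session ID (assumed to be a UUID4 string)."""
    return str(uuid.uuid4())


def normalize_session_id(raw: str) -> str:
    """Validate a pasted session ID and return its canonical form.

    Raises ValueError if the text is not a well-formed UUID, which is
    what happens when only a truncated ID is pasted.
    """
    return str(uuid.UUID(raw.strip()))
```

Validating the pasted value before joining is one way to enforce the "enter the full session ID" step: a shortened ID such as the sidebar preview fails with `ValueError` rather than silently starting an empty thread.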
+## 🧠 Memory System
+### User Profile Tracking
+The agent automatically tracks:
+- **Interests**: Topics and categories you frequently ask about
+- **Expertise Level**: Inferred from question complexity (beginner/intermediate/advanced)
+- **Preferences**: Analysis style preferences (quantitative vs qualitative)
+- **Query History**: Recent questions for context
+### Conversation Persistence
+- **Thread-based**: Each session has a unique thread ID
+- **Checkpoint System**: LangGraph automatically saves state after each interaction
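+- **Cross-Session**: Resume a conversation later by rejoining its session ID; with the in-memory `MemorySaver` checkpointer, saved state lasts only while the app process keeps running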
+### Memory Queries
+Ask the agent what it remembers:
+```
+"What do you remember about me?"
+"What are my interests?"
+"What have I asked about before?"
+```
+## 🔧 Testing the Agent
+### Basic Functionality Tests
+1. **Classification Test**:
+```
+Query: "How many categories are there?"
+Expected: Routes to Structured Agent → Uses get_dataset_stats tool
+```
+2. **Follow-up Memory Test**:
+```
+Query 1: "Show me billing examples"
+Query 2: "Show me more examples"
+Expected: Agent remembers previous context about billing
+```
+3. **User Profile Test**:
+```
+Query 1: "I'm interested in refund patterns"
+Query 2: "What do you remember about me?"
+Expected: Agent mentions interest in refunds
+```
+4. **Recommendation Test**:
+```
+Query: "What should I query next?"
+Expected: Personalized suggestions based on history
+```
+### Advanced Feature Tests
+1. **Session Persistence**:
+   - Ask a question, reload the page
+   - Verify conversation history remains
+   - Verify user profile persists
+2. **Cross-Session Memory**:
+   - Note your session ID
+   - Close browser completely
+   - Reopen and join the same session
+   - Verify full conversation and profile restoration
+3. **Interactive Recommendations**:
+```
+User: "Advise me what to query next"
+Agent: "Based on your interest in billing, you might want to analyze refund patterns."
+User: "I'd rather see examples instead"
+Agent: "Then I suggest showing 5 examples of refund requests."
+User: "Please do so"
+Expected: Agent executes the refined query
+```
+## 📁 File Structure
+```
+Agents/
+├── README.md                 # This file
+├── requirements.txt          # Python dependencies
+├── .env                     # API keys (create this)
+├── app.py                   # LangGraph Streamlit app
+├── langgraph_agent.py       # LangGraph agent implementation
+├── agent-memory.ipynb       # Memory example notebook
+├── test_agent.py            # Test suite
+└── DEPLOYMENT_GUIDE.md      # Original deployment guide
+```
+## 🛠️ Technical Implementation
+### LangGraph Components
+**State Management**:
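+```python
+from typing import Any, Dict, List, Optional, TypedDict
+class AgentState(TypedDict):
+    messages: List[Any]                        # conversation messages so far
+    query_type: Optional[str]                  # set by the classifier
+    user_profile: Optional[Dict[str, Any]]     # interests, expertise, preferences
+    session_context: Optional[Dict[str, Any]]  # per-session metadata
+```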
+**Tool Categories**:
+- **Structured Tools**: Statistics, distributions, examples, search
+- **Unstructured Tools**: Summaries, insights, pattern analysis
+- **Memory Tools**: Profile updates, preference tracking
+**Graph Flow**:
+1. **Classifier**: Determines query type
+2. **Agent Selection**: Routes to appropriate specialist
+3. **Tool Execution**: Dynamic tool usage based on needs
+4. **Memory Update**: Profile and context updates
+5. **Response Generation**: Final answer with memory integration
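The first two steps of the graph flow (classify, then route) can be sketched in plain Python. This is not the LangGraph implementation from `langgraph_agent.py`: the keyword rules are an assumed stand-in for the LLM-based classifier, and `classify` and `route` are hypothetical names.

```python
from typing import Callable, Dict

# Keyword rules stand in for the LLM classifier (assumption for illustration).
RECOMMEND_HINTS = ("what should i", "recommend", "advise")
UNSTRUCTURED_HINTS = ("summarize", "pattern", "insight", "sentiment", "analyze")


def classify(query: str) -> str:
    """Return 'recommender', 'unstructured', or 'structured' for a query."""
    q = query.lower()
    if any(hint in q for hint in RECOMMEND_HINTS):
        return "recommender"
    if any(hint in q for hint in UNSTRUCTURED_HINTS):
        return "unstructured"
    # Counts, examples, and distributions fall through to the structured agent.
    return "structured"


def route(query: str, agents: Dict[str, Callable[[str], str]]) -> str:
    """Dispatch the query to the agent chosen by the classifier."""
    return agents[classify(query)](query)
```

In a LangGraph graph the same decision would typically live in a conditional edge after the classifier node; the sketch only shows the routing logic itself.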
+### Memory Architecture
+- **Checkpointer**: LangGraph's `MemorySaver` for conversation persistence (in-memory, so saved state is lost when the app process restarts)
+- **Thread Management**: Unique thread IDs for session isolation
+- **Profile Synthesis**: LLM-powered extraction of user characteristics
+- **Context Retention**: Full conversation history with temporal awareness
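Thread-based isolation can be illustrated without LangGraph. The class below is a plain-Python stand-in for a checkpointer, not the `MemorySaver` API; `InMemoryCheckpointStore` and its methods are hypothetical names chosen for this sketch.

```python
from collections import defaultdict
from typing import Dict, List


class InMemoryCheckpointStore:
    """Keeps one message history per thread ID, in process memory only."""

    def __init__(self) -> None:
        self._threads: Dict[str, List[str]] = defaultdict(list)

    def append(self, thread_id: str, message: str) -> None:
        """Save a message under the given thread."""
        self._threads[thread_id].append(message)

    def history(self, thread_id: str) -> List[str]:
        """Return a copy of the thread's history (empty if unknown)."""
        return list(self._threads.get(thread_id, []))
```

Two sessions that use different thread IDs never see each other's messages, and a session that rejoins with the same ID gets its history back. Like any in-memory store, everything is gone once the process exits, which is why a database-backed checkpointer is the usual next step.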
+## 🔍 Troubleshooting
+### Common Issues
+1. **API Key Errors**:
+   - Verify `.env` file exists and has correct key
+   - Check environment variable is set in deployment
+   - Ensure API key has sufficient credits
+2. **Memory Not Persisting**:
+   - Verify that the session ID stays the same across reloads
+   - Check that browser localStorage is not being cleared
+   - Ensure the `thread_id` parameter is passed correctly
+3. **Dataset Loading Issues**:
+   - Check internet connection for Hugging Face datasets
+   - Verify datasets library is installed
+   - Try clearing Streamlit cache: `streamlit cache clear`
+4. **Tool Execution Errors**:
+   - Verify all dependencies in requirements.txt are installed
+   - Check dataset is properly loaded
+   - Review error messages in Streamlit interface
+### Debug Mode
+Enable debug logging by setting:
+```python
+import logging
+logging.basicConfig(level=logging.DEBUG)
+```
+## 🎓 Learning Objectives
+This implementation demonstrates:
+1. **LangGraph Multi-Agent Systems**: Specialized agents for different query types
+2. **Memory & Persistence**: Conversation continuity across sessions
+3. **Tool Integration**: Dynamic tool selection and execution
+4. **State Management**: Complex state updates and routing
+5. **User Experience**: Session management and interactive features
+## 🚀 Future Enhancements
+Potential improvements:
+- **Database Persistence**: Replace MemorySaver with PostgreSQL checkpointer
+- **Advanced Analytics**: More sophisticated data analysis tools
+- **Export Features**: PDF/CSV report generation
+- **User Authentication**: Multi-user support with profiles
+- **Real-time Collaboration**: Shared sessions between users
+## 📄 License
+This project is for educational purposes as part of a data science curriculum.
+## 🤝 Contributing
+This is an assignment project. For questions or issues, please contact the course instructors.
+---
+**Built with**: LangGraph, Streamlit, OpenAI/Nebius, Hugging Face Datasets

app.py CHANGED

@@ -2,14 +2,14 @@ import json
 import os
 import uuid
 from datetime import datetime
-from typing import Dict, List, Optional
 import pandas as pd
 import streamlit as st
 from datasets import load_dataset
 from dotenv import load_dotenv
-from langgraph_agent import DataAnalystAgent, DatasetManager
 # Load environment variables
 load_dotenv()

 import os
 import uuid
 from datetime import datetime
+from typing import Dict
 import pandas as pd
 import streamlit as st
 from datasets import load_dataset
 from dotenv import load_dotenv
+from langgraph_agent import DataAnalystAgent
 # Load environment variables
 load_dotenv()