---
title: LangGraph Data Analyst Agent
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: streamlit
sdk_version: 1.28.0
app_file: app.py
pinned: false
license: mit
---
# 🤖 LangGraph Data Analyst Agent
An intelligent data analyst agent built with LangGraph that analyzes customer support conversations with advanced memory, conversation persistence, and query recommendations.
## Features

### Core Functionality
- Multi-Agent Architecture: Separate specialized agents for structured and unstructured queries
- Query Classification: Automatic routing to appropriate agent based on query type
- Rich Tool Set: Comprehensive tools for data analysis and insights
### Advanced Memory & Persistence
- Session Management: Persistent conversations across page reloads and browser sessions
- User Profile Tracking: Agent learns and remembers user interests and preferences
- Conversation History: Full context retention using LangGraph checkpointers
- Cross-Session Continuity: Resume conversations using session IDs
### Intelligent Recommendations
- Query Suggestions: AI-powered recommendations based on conversation history
- Interactive Refinement: Collaborative query building with the agent
- Context-Aware: Suggestions based on user profile and previous interactions
## Architecture
The agent uses LangGraph's multi-agent architecture with the following components:
```
User Query → Classifier → [Structured Agent | Unstructured Agent | Recommender] → Summarizer → Response
                                              ↓
                               Tool Nodes (Dataset Analysis Tools)
```
### Agent Types
- Structured Agent: Handles quantitative queries (statistics, examples, distributions)
- Unstructured Agent: Handles qualitative queries (summaries, insights, patterns)
- Query Recommender: Suggests follow-up questions based on context
- Summarizer: Updates user profile and conversation memory
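As a minimal sketch of the routing step only (node labels mirror the diagram above and are illustrative; the real logic lives in `langgraph_agent.py`), the classifier's decision could be mapped to the next node like this, assuming the classifier stores its label in `state["query_type"]`:

```python
from typing import Any, Dict

def route_after_classifier(state: Dict[str, Any]) -> str:
    """Map the classifier's label to the next graph node (hypothetical node names)."""
    return {
        "structured": "structured_agent",
        "unstructured": "unstructured_agent",
        "recommendation": "recommender",
    }.get(state.get("query_type", ""), "structured_agent")
```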
## Setup Instructions

### Prerequisites
- Python Version: 3.9 or higher
- API Key: OpenAI API key or Nebius API key
- For Hugging Face Spaces: Ensure your API key is set as a Space secret
### Installation
1. Clone the repository:

   ```bash
   git clone <repository-url>
   cd Agents
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Configure your API key by creating a `.env` file in the project root (a sketch of how the app might load it follows these steps):

   ```bash
   # For OpenAI (recommended)
   OPENAI_API_KEY=your_openai_api_key_here

   # OR for Nebius
   NEBIUS_API_KEY=your_nebius_api_key_here
   ```

4. Run the application:

   ```bash
   streamlit run app.py
   ```

5. Access the app by opening your browser to `http://localhost:8501`.
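The app needs to find one of these keys at startup. As a hedged sketch only (assuming `python-dotenv` is available and that OpenAI is preferred over Nebius, which is an illustrative choice rather than the exact behavior of `app.py`), the lookup could look like:

```python
import os
from dotenv import load_dotenv

load_dotenv()  # pull variables from .env into the process environment

# Prefer OpenAI, fall back to Nebius (assumed order, for illustration only)
api_key = os.getenv("OPENAI_API_KEY") or os.getenv("NEBIUS_API_KEY")
if not api_key:
    raise RuntimeError("Set OPENAI_API_KEY or NEBIUS_API_KEY before starting the app.")
```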
### Alternative Deployment

For Hugging Face Spaces:

1. Fork or upload this repository to Hugging Face Spaces
2. Set your API key as a Space secret:
   - Go to your Space settings
   - Navigate to "Variables and secrets"
   - Add a secret named `NEBIUS_API_KEY` or `OPENAI_API_KEY`
   - Enter your API key as the value
3. The app will start automatically
For other cloud deployments, set the key as an environment variable:

```bash
export OPENAI_API_KEY=your_api_key_here
# OR
export NEBIUS_API_KEY=your_api_key_here
```
## Usage Guide

### Query Types

#### Structured Queries (Quantitative Analysis)
- "How many records are in each category?"
- "What are the most common customer issues?"
- "Show me 5 examples of billing problems"
- "Get distribution of intents"
#### Unstructured Queries (Qualitative Analysis)
- "Summarize the refund category"
- "What patterns do you see in payment issues?"
- "Analyze customer sentiment in billing conversations"
- "What insights can you provide about technical support?"
#### Memory & Recommendations
- "What do you remember about me?"
- "What should I query next?"
- "Advise me what to explore"
- "Recommend follow-up questions"
### Session Management

#### Creating Sessions

- New Session: Click "New Session" to start fresh
- Auto-Generated: Each new browser session gets a unique ID
#### Resuming Sessions

1. Copy your session ID from the sidebar (e.g., `a1b2c3d4...`)
2. Enter the full session ID in "Join Existing Session"
3. Click "Join Session" to resume
#### Cross-Tab Persistence
- Open multiple tabs with the same session ID
- Conversations sync across all tabs
- Memory and user profile persist
## Memory System

### User Profile Tracking
The agent automatically tracks:
- Interests: Topics and categories you frequently ask about
- Expertise Level: Inferred from question complexity (beginner/intermediate/advanced)
- Preferences: Analysis style preferences (quantitative vs qualitative)
- Query History: Recent questions for context
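The actual schema is defined in `langgraph_agent.py`; purely as an illustration, a tracked profile might look something like this (field names are hypothetical):

```python
# Hypothetical shape of a tracked user profile (illustrative field names)
user_profile = {
    "interests": ["billing", "refunds"],           # topics asked about frequently
    "expertise_level": "intermediate",             # beginner / intermediate / advanced
    "preferences": {"analysis_style": "quantitative"},
    "recent_queries": ["Show me billing examples"],
}
```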
### Conversation Persistence
- Thread-based: Each session has a unique thread ID
- Checkpoint System: LangGraph automatically saves state after each interaction
- Cross-Session: Resume conversations days or weeks later
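In LangGraph terms, resuming a session means invoking the compiled graph with the same `thread_id`. A minimal sketch, assuming a compiled graph named `agent_app` built with a checkpointer (see the Technical Implementation section below); the placeholder session ID is illustrative:

```python
from langchain_core.messages import HumanMessage

session_id = "demo-session"  # placeholder; in the app this comes from the sidebar
config = {"configurable": {"thread_id": session_id}}

# Because the graph was compiled with a checkpointer, prior messages and the
# user profile stored under this thread are loaded before the new turn runs.
result = agent_app.invoke(
    {"messages": [HumanMessage(content="What do you remember about me?")]},
    config=config,
)
```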
### Memory Queries
Ask the agent what it remembers:
"What do you remember about me?"
"What are my interests?"
"What have I asked about before?"
## Testing the Agent

### Basic Functionality Tests
1. Classification Test:
   - Query: "How many categories are there?"
   - Expected: Routes to Structured Agent → uses the `get_dataset_stats` tool

2. Follow-up Memory Test:
   - Query 1: "Show me billing examples"
   - Query 2: "Show me more examples"
   - Expected: Agent remembers previous context about billing

3. User Profile Test:
   - Query 1: "I'm interested in refund patterns"
   - Query 2: "What do you remember about me?"
   - Expected: Agent mentions interest in refunds

4. Recommendation Test:
   - Query: "What should I query next?"
   - Expected: Personalized suggestions based on history
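A hedged sketch of what the classification test could look like in code (the real assertions live in `test_agent.py`; the `build_graph` factory and the `query_type` key below are assumptions for illustration):

```python
# Illustrative pytest-style check; function and key names are assumptions.
from langchain_core.messages import HumanMessage

def test_classifier_routes_structured_query():
    graph = build_graph()  # hypothetical factory exposed by langgraph_agent.py
    config = {"configurable": {"thread_id": "test-thread"}}
    state = graph.invoke(
        {"messages": [HumanMessage(content="How many categories are there?")]},
        config=config,
    )
    assert state["query_type"] == "structured"
```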
### Advanced Feature Tests
Session Persistence:
- Ask a question, reload the page
- Verify conversation history remains
- Verify user profile persists
Cross-Session Memory:
- Note your session ID
- Close browser completely
- Reopen and join the same session
- Verify full conversation and profile restoration
Interactive Recommendations:

```
User:  "Advise me what to query next"
Agent: "Based on your interest in billing, you might want to analyze refund patterns."
User:  "I'd rather see examples instead"
Agent: "Then I suggest showing 5 examples of refund requests."
User:  "Please do so"
```

Expected: The agent executes the refined query.
## File Structure
```
Agents/
├── README.md             # This file
├── requirements.txt      # Python dependencies
├── .env                  # API keys (create this)
├── app.py                # LangGraph Streamlit app
├── langgraph_agent.py    # LangGraph agent implementation
├── agent-memory.ipynb    # Memory example notebook
├── test_agent.py         # Test suite
└── DEPLOYMENT_GUIDE.md   # Original deployment guide
```
## Technical Implementation

### LangGraph Components

State Management:
```python
from typing import Any, Dict, List, Optional, TypedDict

class AgentState(TypedDict):
    messages: List[Any]
    query_type: Optional[str]
    user_profile: Optional[Dict[str, Any]]
    session_context: Optional[Dict[str, Any]]
```
Tool Categories:
- Structured Tools: Statistics, distributions, examples, search
- Unstructured Tools: Summaries, insights, pattern analysis
- Memory Tools: Profile updates, preference tracking
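For illustration, a structured tool such as `get_dataset_stats` (referenced in the Classification Test above) could be declared with LangChain's `@tool` decorator. The dataset name and column below are placeholders, not necessarily what the app loads:

```python
from collections import Counter

from datasets import load_dataset
from langchain_core.tools import tool

@tool
def get_dataset_stats() -> dict:
    """Return record counts per category in the customer support dataset."""
    # Dataset name and column are placeholders; the app's loader may differ.
    ds = load_dataset(
        "bitext/Bitext-customer-support-llm-chatbot-training-dataset",
        split="train",
    )
    return dict(Counter(ds["category"]))
```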
Graph Flow:

1. Classifier: Determines query type
2. Agent Selection: Routes to the appropriate specialist
3. Tool Execution: Dynamic tool usage based on needs
4. Memory Update: Profile and context updates
5. Response Generation: Final answer with memory integration
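Putting the pieces together, the graph wiring could look roughly like this (node function bodies are omitted and the names are illustrative; see `langgraph_agent.py` for the actual construction). It reuses the `route_after_classifier` sketch from the Architecture section and the `AgentState` above:

```python
from langgraph.graph import StateGraph, END
from langgraph.checkpoint.memory import MemorySaver

builder = StateGraph(AgentState)
builder.add_node("classifier", classifier)              # sets state["query_type"]
builder.add_node("structured_agent", structured_agent)
builder.add_node("unstructured_agent", unstructured_agent)
builder.add_node("recommender", recommender)
builder.add_node("summarizer", summarizer)              # updates the user profile

builder.set_entry_point("classifier")
builder.add_conditional_edges(
    "classifier",
    route_after_classifier,
    {
        "structured_agent": "structured_agent",
        "unstructured_agent": "unstructured_agent",
        "recommender": "recommender",
    },
)
builder.add_edge("structured_agent", "summarizer")
builder.add_edge("unstructured_agent", "summarizer")
builder.add_edge("recommender", "summarizer")
builder.add_edge("summarizer", END)

agent_app = builder.compile(checkpointer=MemorySaver())
```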
### Memory Architecture

- Checkpointer: LangGraph's `MemorySaver` for conversation persistence
- Thread Management: Unique thread IDs for session isolation
- Profile Synthesis: LLM-powered extraction of user characteristics
- Context Retention: Full conversation history with temporal awareness
## Troubleshooting

### Common Issues

API Key Errors:
- Verify the `.env` file exists and contains the correct key
- Check that the environment variable is set in your deployment
- Ensure the API key has sufficient credits
Memory Not Persisting:
- Verify the session ID remains consistent
- Check that browser localStorage is not being cleared
- Ensure the `thread_id` parameter is passed correctly
Dataset Loading Issues:
- Check your internet connection (the dataset is fetched from Hugging Face)
- Verify the `datasets` library is installed
- Try clearing the Streamlit cache: `streamlit cache clear`
Tool Execution Errors:
- Verify all dependencies in requirements.txt are installed
- Check dataset is properly loaded
- Review error messages in Streamlit interface
### Debug Mode

Enable debug logging by setting:

```python
import logging
logging.basicConfig(level=logging.DEBUG)
```
## Learning Objectives
This implementation demonstrates:
- LangGraph Multi-Agent Systems: Specialized agents for different query types
- Memory & Persistence: Conversation continuity across sessions
- Tool Integration: Dynamic tool selection and execution
- State Management: Complex state updates and routing
- User Experience: Session management and interactive features
## Future Enhancements
Potential improvements:
- Database Persistence: Replace MemorySaver with PostgreSQL checkpointer
- Advanced Analytics: More sophisticated data analysis tools
- Export Features: PDF/CSV report generation
- User Authentication: Multi-user support with profiles
- Real-time Collaboration: Shared sessions between users
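For the database persistence idea, LangGraph provides a Postgres checkpointer in the `langgraph-checkpoint-postgres` package. A rough sketch of the swap (connection string and surrounding setup are assumptions; `builder` refers to the graph builder shown above):

```python
from langgraph.checkpoint.postgres import PostgresSaver

DB_URI = "postgresql://user:password@localhost:5432/agent_memory"  # placeholder

with PostgresSaver.from_conn_string(DB_URI) as checkpointer:
    checkpointer.setup()  # create the checkpoint tables on first run
    agent_app = builder.compile(checkpointer=checkpointer)
```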
## License
This project is for educational purposes as part of a data science curriculum.
## Contributing
This is an assignment project. For questions or issues, please contact the course instructors.
Built with: LangGraph, Streamlit, OpenAI/Nebius, Hugging Face Datasets