SaritMeshesha committed on
Commit 186205a
verified · 1 Parent(s): 6bcc4cb

Upload 2 files

Files changed (2)
  1. README.md +27 -309
  2. debug_app.py +187 -0
README.md CHANGED
@@ -1,328 +1,46 @@
 ---
- title: LangGraph Data Analyst Agent
- emoji: 🤖
 colorFrom: blue
 colorTo: purple
 sdk: streamlit
 sdk_version: "1.28.0"
- app_file: app.py
 pinned: false
 license: mit
 ---
-
- # 🤖 LangGraph Data Analyst Agent
-
- An intelligent data analyst agent built with LangGraph that analyzes customer support conversations with advanced memory, conversation persistence, and query recommendations.
-
- ## 🌟 Features
-
- ### Core Functionality
- - **Multi-Agent Architecture**: Separate specialized agents for structured and unstructured queries
- - **Query Classification**: Automatic routing to the appropriate agent based on query type
- - **Rich Tool Set**: Comprehensive tools for data analysis and insights
-
- ### Advanced Memory & Persistence
- - **Session Management**: Persistent conversations across page reloads and browser sessions
- - **User Profile Tracking**: The agent learns and remembers user interests and preferences
- - **Conversation History**: Full context retention using LangGraph checkpointers
- - **Cross-Session Continuity**: Resume conversations using session IDs
-
- ### Intelligent Recommendations
- - **Query Suggestions**: AI-powered recommendations based on conversation history
- - **Interactive Refinement**: Collaborative query building with the agent
- - **Context-Aware**: Suggestions based on the user profile and previous interactions
- ## 🏗️ Architecture
-
- The agent uses LangGraph's multi-agent architecture with the following components:
-
- ```
- User Query → Classifier → [Structured Agent | Unstructured Agent | Recommender] → Summarizer → Response
-                                ↓
-                 Tool Nodes (Dataset Analysis Tools)
- ```
-
- ### Agent Types
- 1. **Structured Agent**: Handles quantitative queries (statistics, examples, distributions)
- 2. **Unstructured Agent**: Handles qualitative queries (summaries, insights, patterns)
- 3. **Query Recommender**: Suggests follow-up questions based on context
- 4. **Summarizer**: Updates the user profile and conversation memory
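The routing described above can be illustrated with a small, self-contained sketch. The keyword lists and function names below are illustrative stand-ins, not the repository's actual implementation (which classifies with an LLM inside a LangGraph graph):

```python
from typing import Callable, Dict

# Hypothetical stand-ins for the specialist agents; the real project
# wires these up as LangGraph nodes with tool access.
def structured_agent(query: str) -> str:
    return f"[structured] {query}"

def unstructured_agent(query: str) -> str:
    return f"[unstructured] {query}"

def recommender(query: str) -> str:
    return f"[recommender] {query}"

# Toy heuristics standing in for the LLM classifier.
STRUCTURED_HINTS = ("how many", "count", "examples", "distribution")
RECOMMEND_HINTS = ("recommend", "what should i", "advise")

def classify(query: str) -> str:
    q = query.lower()
    if any(h in q for h in RECOMMEND_HINTS):
        return "recommender"
    if any(h in q for h in STRUCTURED_HINTS):
        return "structured"
    return "unstructured"

ROUTES: Dict[str, Callable[[str], str]] = {
    "structured": structured_agent,
    "unstructured": unstructured_agent,
    "recommender": recommender,
}

def route(query: str) -> str:
    # Classifier picks the specialist; the specialist produces the answer.
    return ROUTES[classify(query)](query)
```

The same shape (classify, then dispatch via a conditional edge) is what the graph diagram above expresses.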
- ## 🚀 Setup Instructions
-
- ### Prerequisites
- - **Python Version**: 3.9 or higher
- - **API Key**: OpenAI API key or Nebius API key
- - **For Hugging Face Spaces**: Ensure your API key is set as a Space secret
-
- ### Installation
-
- 1. **Clone the repository**:
-    ```bash
-    git clone <repository-url>
-    cd Agents
-    ```
-
- 2. **Install dependencies**:
-    ```bash
-    pip install -r requirements.txt
-    ```
-
- 3. **Configure the API key**:
-
-    Create a `.env` file in the project root:
-    ```bash
-    # For OpenAI (recommended)
-    OPENAI_API_KEY=your_openai_api_key_here
-
-    # OR for Nebius
-    NEBIUS_API_KEY=your_nebius_api_key_here
-    ```
-
- 4. **Run the application**:
-    ```bash
-    streamlit run app.py
-    ```
-
- 5. **Access the app**:
-    Open your browser to `http://localhost:8501`
-
- ### Alternative Deployment
-
- #### For Hugging Face Spaces:
- 1. **Fork or upload this repository to Hugging Face Spaces**
- 2. **Set your API key as a Space secret:**
-    - Go to your Space settings
-    - Navigate to "Variables and secrets"
-    - Add a secret named `NEBIUS_API_KEY` or `OPENAI_API_KEY`
-    - Enter your API key as the value
- 3. **The app will start automatically**
-
- #### For other cloud deployments:
- ```bash
- export OPENAI_API_KEY=your_api_key_here
- # OR
- export NEBIUS_API_KEY=your_api_key_here
- ```
-
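Whichever way the key is provided, the app only needs it in the process environment. The lookup can be sketched as follows; `resolve_api_key` is a hypothetical helper for illustration, not a function from the repository:

```python
import os

def resolve_api_key() -> str:
    """Prefer OpenAI, then fall back to Nebius, mirroring the setup steps above."""
    key = os.environ.get("OPENAI_API_KEY") or os.environ.get("NEBIUS_API_KEY")
    if not key:
        # Fail loudly at startup rather than hanging later on an API call.
        raise RuntimeError("No API key found: set OPENAI_API_KEY or NEBIUS_API_KEY")
    return key
```

Both a local `.env` file (via `python-dotenv`) and a Space secret end up as environment variables, so the same lookup covers every deployment path.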
- ## 🎯 Usage Guide
-
- ### Query Types
-
- #### Structured Queries (Quantitative Analysis)
- - "How many records are in each category?"
- - "What are the most common customer issues?"
- - "Show me 5 examples of billing problems"
- - "Get the distribution of intents"
-
- #### Unstructured Queries (Qualitative Analysis)
- - "Summarize the refund category"
- - "What patterns do you see in payment issues?"
- - "Analyze customer sentiment in billing conversations"
- - "What insights can you provide about technical support?"
-
- #### Memory & Recommendations
- - "What do you remember about me?"
- - "What should I query next?"
- - "Advise me what to explore"
- - "Recommend follow-up questions"
-
- ### Session Management
-
- #### Creating Sessions
- - **New Session**: Click "🆕 New Session" to start fresh
- - **Auto-Generated**: Each new browser session gets a unique ID
-
- #### Resuming Sessions
- 1. Copy your session ID from the sidebar (e.g., `a1b2c3d4...`)
- 2. Enter the full session ID in "Join Existing Session"
- 3. Click "🔗 Join Session" to resume
-
- #### Cross-Tab Persistence
- - Open multiple tabs with the same session ID
- - Conversations sync across all tabs
- - Memory and the user profile persist
-
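From the user's point of view, session handling behaves like the following sketch. The `store` dict stands in for LangGraph's checkpointer, and the function names are illustrative, not the app's actual API:

```python
import uuid

def new_session_id() -> str:
    """Generate the kind of opaque hex ID shown in the sidebar (e.g. 'a1b2c3d4...')."""
    return uuid.uuid4().hex

def join_session(store: dict, session_id: str) -> dict:
    """Return the state for session_id, creating a fresh one if it is unknown.

    Two tabs passing the same session_id get the same state object, which is
    why conversations appear to sync across tabs.
    """
    return store.setdefault(session_id, {"messages": [], "profile": {}})
```

Joining an existing session is therefore just a lookup by ID; nothing about the browser itself is stored server-side.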
- ## 🧠 Memory System
-
- ### User Profile Tracking
- The agent automatically tracks:
- - **Interests**: Topics and categories you frequently ask about
- - **Expertise Level**: Inferred from question complexity (beginner/intermediate/advanced)
- - **Preferences**: Analysis style preferences (quantitative vs. qualitative)
- - **Query History**: Recent questions for context
-
- ### Conversation Persistence
- - **Thread-based**: Each session has a unique thread ID
- - **Checkpoint System**: LangGraph automatically saves state after each interaction
- - **Cross-Session**: Resume conversations days or weeks later
-
- ### Memory Queries
- Ask the agent what it remembers:
- ```
- "What do you remember about me?"
- "What are my interests?"
- "What have I asked about before?"
- ```
-
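A toy version of the interest tracking might look like the sketch below. The real agent synthesizes the profile with an LLM; the keyword tally here is only an illustration of what "tracking interests" means mechanically:

```python
def update_profile(profile: dict, query: str) -> dict:
    """Record a query and tally coarse interests by keyword (illustrative only)."""
    topics = ("billing", "refund", "payment", "delivery")  # assumed example topics
    q = query.lower()
    profile.setdefault("history", []).append(query)
    interests = profile.setdefault("interests", {})
    for topic in topics:
        if topic in q:
            interests[topic] = interests.get(topic, 0) + 1
    return profile
```

Answering "What do you remember about me?" then amounts to rendering this accumulated profile back into prose.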
- ## 🔧 Testing the Agent
-
- ### Basic Functionality Tests
-
- 1. **Classification Test**:
-    ```
-    Query: "How many categories are there?"
-    Expected: Routes to Structured Agent → Uses get_dataset_stats tool
-    ```
-
- 2. **Follow-up Memory Test**:
-    ```
-    Query 1: "Show me billing examples"
-    Query 2: "Show me more examples"
-    Expected: Agent remembers the previous context about billing
-    ```
-
- 3. **User Profile Test**:
-    ```
-    Query 1: "I'm interested in refund patterns"
-    Query 2: "What do you remember about me?"
-    Expected: Agent mentions the interest in refunds
-    ```
-
- 4. **Recommendation Test**:
-    ```
-    Query: "What should I query next?"
-    Expected: Personalized suggestions based on history
-    ```
-
- ### Advanced Feature Tests
-
- 1. **Session Persistence**:
-    - Ask a question, then reload the page
-    - Verify the conversation history remains
-    - Verify the user profile persists
-
- 2. **Cross-Session Memory**:
-    - Note your session ID
-    - Close the browser completely
-    - Reopen and join the same session
-    - Verify full conversation and profile restoration
-
- 3. **Interactive Recommendations**:
-    ```
-    User: "Advise me what to query next"
-    Agent: "Based on your interest in billing, you might want to analyze refund patterns."
-    User: "I'd rather see examples instead"
-    Agent: "Then I suggest showing 5 examples of refund requests."
-    User: "Please do so"
-    Expected: Agent executes the refined query
-    ```
-
- ## 📁 File Structure
-
- ```
- Agents/
- ├── README.md              # This file
- ├── requirements.txt       # Python dependencies
- ├── .env                   # API keys (create this)
- ├── app.py                 # LangGraph Streamlit app
- ├── langgraph_agent.py     # LangGraph agent implementation
- ├── agent-memory.ipynb     # Memory example notebook
- ├── test_agent.py          # Test suite
- └── DEPLOYMENT_GUIDE.md    # Original deployment guide
- ```
-
- ## 🛠️ Technical Implementation
-
- ### LangGraph Components
-
- **State Management**:
- ```python
- class AgentState(TypedDict):
-     messages: List[Any]
-     query_type: Optional[str]
-     user_profile: Optional[Dict[str, Any]]
-     session_context: Optional[Dict[str, Any]]
- ```
-
- **Tool Categories**:
- - **Structured Tools**: Statistics, distributions, examples, search
- - **Unstructured Tools**: Summaries, insights, pattern analysis
- - **Memory Tools**: Profile updates, preference tracking
-
- **Graph Flow**:
- 1. **Classifier**: Determines the query type
- 2. **Agent Selection**: Routes to the appropriate specialist
- 3. **Tool Execution**: Dynamic tool usage based on needs
- 4. **Memory Update**: Profile and context updates
- 5. **Response Generation**: Final answer with memory integration
-
- ### Memory Architecture
-
- - **Checkpointer**: LangGraph's `MemorySaver` for conversation persistence
- - **Thread Management**: Unique thread IDs for session isolation
- - **Profile Synthesis**: LLM-powered extraction of user characteristics
- - **Context Retention**: Full conversation history with temporal awareness
-
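Thread-isolated checkpointing can be pictured with a toy stand-in. This is not LangGraph's `MemorySaver`, just a sketch of the idea it implements: one checkpoint list per thread ID, with the latest checkpoint restored on resume:

```python
from copy import deepcopy

class ToyCheckpointer:
    """Toy stand-in for a checkpointer: one checkpoint list per thread ID."""

    def __init__(self) -> None:
        self._threads: dict = {}

    def save(self, thread_id: str, state: dict) -> None:
        # Deep-copy so later mutations of the live state don't rewrite history.
        self._threads.setdefault(thread_id, []).append(deepcopy(state))

    def latest(self, thread_id: str):
        checkpoints = self._threads.get(thread_id)
        return checkpoints[-1] if checkpoints else None
```

Because each thread keeps its own list, two sessions with different thread IDs can never see each other's conversation state.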
- ## 🔍 Troubleshooting
-
- ### Common Issues
-
- 1. **API Key Errors**:
-    - Verify the `.env` file exists and contains the correct key
-    - Check that the environment variable is set in your deployment
-    - Ensure the API key has sufficient credits
-
- 2. **Memory Not Persisting**:
-    - Verify the session ID remains consistent
-    - Check that browser localStorage is not being cleared
-    - Ensure the `thread_id` parameter is passed correctly
-
- 3. **Dataset Loading Issues**:
-    - Check the internet connection for Hugging Face datasets
-    - Verify the `datasets` library is installed
-    - Try clearing the Streamlit cache: `streamlit cache clear`
-
- 4. **Tool Execution Errors**:
-    - Verify all dependencies in `requirements.txt` are installed
-    - Check the dataset is properly loaded
-    - Review error messages in the Streamlit interface
-
- ### Debug Mode
-
- Enable debug logging by setting:
- ```python
- import logging
- logging.basicConfig(level=logging.DEBUG)
- ```
-
- ## 🎓 Learning Objectives
-
- This implementation demonstrates:
-
- 1. **LangGraph Multi-Agent Systems**: Specialized agents for different query types
- 2. **Memory & Persistence**: Conversation continuity across sessions
- 3. **Tool Integration**: Dynamic tool selection and execution
- 4. **State Management**: Complex state updates and routing
- 5. **User Experience**: Session management and interactive features
-
- ## 🚀 Future Enhancements
-
- Potential improvements:
- - **Database Persistence**: Replace MemorySaver with a PostgreSQL checkpointer
- - **Advanced Analytics**: More sophisticated data analysis tools
- - **Export Features**: PDF/CSV report generation
- - **User Authentication**: Multi-user support with profiles
- - **Real-time Collaboration**: Shared sessions between users
-
- ## 📄 License
-
- This project is for educational purposes as part of a data science curriculum.
-
- ## 🤝 Contributing
-
- This is an assignment project. For questions or issues, please contact the course instructors.
-
- ---
-
- **Built with**: LangGraph, Streamlit, OpenAI/Nebius, Hugging Face Datasets
 
 ---
+ title: LangGraph Data Analyst Agent (Debug)
+ emoji: 🔧
 colorFrom: blue
 colorTo: purple
 sdk: streamlit
 sdk_version: "1.28.0"
+ app_file: debug_app.py
 pinned: false
 license: mit
 ---
+
+ # 🔧 LangGraph Data Analyst Agent - Debug Mode
+
+ **Temporary debug version to diagnose deployment issues**
+
+ This debug tool will help identify what is causing the "thinking" hang in your deployment.
+
+ ## 🚀 Quick Steps:
+
+ 1. **Upload `debug_app.py` to your Space**
+ 2. **Replace your README.md with this version**
+ 3. **Wait for the Space to restart**
+ 4. **Run the debug tests**
+ 5. **Check the results and error messages**
+
+ ## 🔍 What This Debug Tool Checks:
+
+ - ✅ Python environment and packages
+ - ✅ API key configuration
+ - ✅ LangGraph agent import
+ - ✅ Dataset loading
+ - ✅ Simple agent test
+ - ✅ Error details and stack traces
+
+ ## 📋 Expected Results:
+
+ The debug tool will show you exactly where the problem is:
+ - Import errors
+ - API key issues
+ - Network connectivity problems
+ - LangGraph workflow errors
+
+ ## 🔧 After Debugging:
+
+ Once you identify the issue, switch back to the main app by updating README.md to use `app_file: app.py`.
debug_app.py ADDED
@@ -0,0 +1,187 @@
+ import json
+ import os
+ import traceback
+ import uuid
+ from datetime import datetime
+ from typing import Dict
+
+ import pandas as pd
+ import streamlit as st
+ from datasets import load_dataset
+ from dotenv import load_dotenv
+
+ # Only import if the module is available
+ try:
+     from langgraph_agent import DataAnalystAgent
+
+     AGENT_AVAILABLE = True
+ except ImportError as e:
+     AGENT_AVAILABLE = False
+     IMPORT_ERROR = str(e)
+
+ # Load environment variables
+ load_dotenv()
+
+ # Set up page config
+ st.set_page_config(
+     page_title="🤖 LangGraph Data Analyst Agent (Debug)",
+     layout="wide",
+     page_icon="🤖",
+     initial_sidebar_state="expanded",
+ )
+
+
+ def check_environment():
+     """Check the deployment environment and dependencies."""
+     st.markdown("## 🔍 Environment Debug Info")
+
+     # Check Python version
+     import sys
+
+     st.write(f"**Python Version:** {sys.version}")
+
+     # Check if running on Hugging Face
+     is_hf_space = os.environ.get("SPACE_ID") is not None
+     st.write(f"**Running on Hugging Face Spaces:** {is_hf_space}")
+     if is_hf_space:
+         st.write(f"**Space ID:** {os.environ.get('SPACE_ID', 'Unknown')}")
+
+     # Check API key availability
+     nebius_key = os.environ.get("NEBIUS_API_KEY")
+     openai_key = os.environ.get("OPENAI_API_KEY")
+     st.write(f"**Nebius API Key Available:** {'Yes' if nebius_key else 'No'}")
+     st.write(f"**OpenAI API Key Available:** {'Yes' if openai_key else 'No'}")
+
+     if nebius_key:
+         st.write(f"**Nebius Key Length:** {len(nebius_key)} characters")
+     if openai_key:
+         st.write(f"**OpenAI Key Length:** {len(openai_key)} characters")
+
+     # Check agent import
+     st.write(
+         f"**LangGraph Agent Import:** {'✅ Success' if AGENT_AVAILABLE else '❌ Failed'}"
+     )
+     if not AGENT_AVAILABLE:
+         st.error(f"Import Error: {IMPORT_ERROR}")
+
+     # Check required packages
+     required_packages = [
+         "langchain",
+         "langchain_core",
+         "langchain_openai",
+         "langgraph",
+         "datasets",
+         "pandas",
+     ]
+
+     st.markdown("### 📦 Package Availability")
+     for package in required_packages:
+         try:
+             __import__(package)
+             st.write(f"✅ {package}")
+         except ImportError as e:
+             st.write(f"❌ {package} - {str(e)}")
+
+
+ def test_simple_agent():
+     """Test basic agent functionality."""
+     if not AGENT_AVAILABLE:
+         st.error("Cannot test agent - import failed")
+         return
+
+     st.markdown("## 🧪 Agent Test")
+
+     # Get API key
+     api_key = os.environ.get("NEBIUS_API_KEY") or os.environ.get("OPENAI_API_KEY")
+     if not api_key:
+         st.error("No API key found!")
+         return
+
+     st.write("**API Key:** ✅ Available")
+
+     # Test agent creation
+     try:
+         st.write("**Creating Agent...**")
+         agent = DataAnalystAgent(api_key=api_key)
+         st.write("✅ Agent created successfully")
+
+         # Test simple query
+         if st.button("🧪 Test Simple Query"):
+             with st.spinner("Testing agent with simple query..."):
+                 try:
+                     result = agent.invoke("Hello, are you working?", "debug_test")
+                     st.success("✅ Agent responded successfully!")
+
+                     st.markdown("**Response Messages:**")
+                     for i, msg in enumerate(result.get("messages", [])):
+                         st.write(
+                             f"{i+1}. {type(msg).__name__}: {getattr(msg, 'content', 'No content')[:100]}..."
+                         )
+
+                 except Exception as e:
+                     st.error(f"❌ Agent test failed: {str(e)}")
+                     st.code(traceback.format_exc())
+
+     except Exception as e:
+         st.error(f"❌ Agent creation failed: {str(e)}")
+         st.code(traceback.format_exc())
+
+
+ def test_dataset_loading():
+     """Test dataset loading."""
+     st.markdown("## 📊 Dataset Test")
+
+     try:
+         with st.spinner("Loading dataset..."):
+             dataset = load_dataset(
+                 "bitext/Bitext-customer-support-llm-chatbot-training-dataset"
+             )
+             df = pd.DataFrame(dataset["train"])
+             st.success(f"✅ Dataset loaded: {len(df):,} records")
+             st.dataframe(df.head(3))
+     except Exception as e:
+         st.error(f"❌ Dataset loading failed: {str(e)}")
+         st.code(traceback.format_exc())
+
+
+ def main():
+     st.title("🔧 LangGraph Agent Debug Tool")
+     st.markdown("This tool helps diagnose issues with the LangGraph agent deployment.")
+
+     # Environment check
+     check_environment()
+
+     st.markdown("---")
+
+     # Dataset test
+     test_dataset_loading()
+
+     st.markdown("---")
+
+     # Agent test
+     test_simple_agent()
+
+     st.markdown("---")
+
+     st.markdown("## 💡 Common Solutions")
+     st.markdown(
+         """
+     **If agent creation fails:**
+     - Check the API key is correctly set as a Space secret
+     - Verify all dependencies are in requirements.txt
+     - Check for import errors above
+
+     **If the agent hangs on 'thinking':**
+     - The API key might be invalid or expired
+     - Network connectivity issues to the API endpoint
+     - Unhandled exceptions in the LangGraph workflow
+
+     **If dataset loading fails:**
+     - Network connectivity issues
+     - Hugging Face datasets library not properly installed
+     """
+     )
+
+
+ if __name__ == "__main__":
+     main()