cosmoruler committed
Commit db6dcad · 1 Parent(s): eb85a42

stuck already
ENHANCEMENT_GUIDE.md CHANGED
@@ -19,11 +19,16 @@ python upload.py
19
  # Choose option 1 when prompted
20
  ```
21
 
22
- #### Option 2: Enhanced Interactive Mode
23
 
24
  ```bash
25
  python upload.py
26
  # Choose option 2 when prompted
27
  ```
28
 
29
  #### Option 3: Demo Script
@@ -32,6 +37,36 @@ python upload.py
32
  python demo_enhanced.py
33
  ```
34
 
35
  ### Setting Up AI Features:
36
 
37
  #### For OpenAI (Recommended):
@@ -58,24 +93,55 @@ python demo_enhanced.py
58
 
59
  ### Example AI Queries:
60
 
61
- Once configured, you can ask:
62
 
63
- - "What are the main trends in this data?"
64
- - "Find any outliers or anomalies"
65
  - "Suggest data quality improvements"
66
- - "Perform correlation analysis"
67
- - "Identify seasonal patterns"
68
  - "Recommend preprocessing steps"
69
 
70
  ### Features Available Without AI:
71
 
72
  Even without AI configuration, you get:
73
 
74
  - βœ… Data loading and exploration (original functionality)
75
- - βœ… Statistical summaries
76
  - βœ… Data visualization (histograms, correlation heatmaps)
77
- - βœ… Data quality analysis
78
- - βœ… Missing value analysis
79
 
80
  ### Files Structure:
81
 
@@ -87,9 +153,27 @@ Even without AI configuration, you get:
87
 
88
  ### Quick Start:
89
 
90
- 1. **Test the script**: `python upload.py`
91
- 2. **Try enhanced mode**: Choose option 2
92
- 3. **Configure AI**: Edit `setup_agent()` method
93
- 4. **Ask AI questions**: Use menu option 4
94
 
95
  πŸš€ **Your original functionality is preserved - nothing is broken!**
 
19
  # Choose option 1 when prompted
20
  ```
21
 
22
+ #### Option 2: Enhanced Interactive Mode ⚠️ **IMPORTANT WORKFLOW**
23
 
24
  ```bash
25
  python upload.py
26
  # Choose option 2 when prompted
27
+ # THEN FOLLOW THIS EXACT SEQUENCE:
28
+ # 1. Choose option 1 (Load and explore data) ← MUST DO THIS FIRST!
29
+ # 2. Wait for data to load completely
30
+ # 3. Choose option 4 (AI-powered analysis)
31
+ # 4. Type your question (e.g., "identify seasonal patterns")
32
  ```
33
 
34
  #### Option 3: Demo Script
 
37
  python demo_enhanced.py
38
  ```
39
 
40
+ ### 🚨 TROUBLESHOOTING: "AI Analysis Goes Back to Main Menu"
41
+
42
+ **Problem**: When you type "identify seasonal patterns", it returns to the main menu instead of processing.
43
+
44
+ **Root Cause**: Data not loaded first, or AI agent not properly configured.
45
+
46
+ **Solution Steps**:
47
+
48
+ 1. **Always Load Data First**:
49
+
50
+ ```
51
+ python upload.py
52
+ β†’ Choose 2 (Enhanced mode)
53
+ β†’ Choose 1 (Load data) ← CRITICAL STEP!
54
+ β†’ Wait for "DATA LOADED SUCCESSFULLY" message
55
+ β†’ Choose 4 (AI analysis)
56
+ β†’ Type your question
57
+ ```
58
+
59
+ 2. **Check AI Agent Status**:
60
+
61
+ - Look for "βœ… SmoLagent configured successfully" message
62
+ - If you see "❌ AI features not available", configure a model first (see the sketch after these steps)
63
+
64
+ 3. **Alternative if AI Fails**:
65
+ ```bash
66
+ python fixed_upload.py # Has better error handling
67
+ python quick_ai_demo.py # Works without heavy downloads
68
+ ```
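
Step 2 above depends on a model actually being configured. Below is a minimal scripted version of the same flow, assuming `upload.py` is importable from the working directory; the key value is a placeholder, and the `OPENAI_API_KEY` check is the one `setup_agent()` performs (Ollama is tried first if it is running):

```python
import os

# Placeholder key; setup_agent() only falls back to OpenAI if os.getenv('OPENAI_API_KEY') is set.
os.environ.setdefault("OPENAI_API_KEY", "sk-your-key-here")

from upload import EnhancedDataExplorer

explorer = EnhancedDataExplorer()
explorer.load_data()        # equivalent of menu option 1 - must happen first
explorer.setup_agent()      # tries Ollama, then OpenAI, then a Transformers model
explorer.ai_analysis("identify seasonal patterns")   # equivalent of menu option 4
```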
69
+
70
  ### Setting Up AI Features:
71
 
72
  #### For OpenAI (Recommended):
 
93
 
94
  ### Example AI Queries:
95
 
96
+ **For OutSystems Log Analysis** (once data is loaded and AI configured):
97
 
98
+ - "What are the main error patterns in this OutSystems data?"
99
+ - "Find modules with the highest error rates"
100
+ - "Analyze error trends over time"
101
+ - "Identify peak error periods"
102
  - "Suggest data quality improvements"
103
+ - "Find correlations between modules and error types"
104
+ - "Detect unusual activity patterns"
105
  - "Recommend preprocessing steps"
106
 
107
+ **Important**: Make sure to:
108
+
109
+ 1. βœ… Load data first (option 1)
110
+ 2. βœ… See "DATA LOADED SUCCESSFULLY" message
111
+ 3. βœ… See "SmoLagent configured" message
112
+ 4. βœ… Then use AI analysis (option 4)
113
+
114
  ### Features Available Without AI:
115
 
116
  Even without AI configuration, you get:
117
 
118
  - βœ… Data loading and exploration (original functionality)
119
+ - βœ… Statistical summaries and data overview
120
  - βœ… Data visualization (histograms, correlation heatmaps)
121
+ - βœ… Data quality analysis and missing value detection
122
+ - βœ… Interactive menu system for data exploration
123
+
124
+ ### Common Issues & Solutions:
125
+
126
+ #### 1. **"❌ No data loaded. Run load_data() first."**
127
+
128
+ **Fix**: Always choose option 1 (Load data) before option 4 (AI analysis)
129
+
130
+ #### 2. **"❌ AI features not available. Please configure a model first."**
131
+
132
+ **Fix**: Set up AI model using one of the methods below, or use `fixed_upload.py`
133
+
134
+ #### 3. **AI query returns to main menu**
135
+
136
+ **Fix**: Ensure data is loaded AND AI agent is configured successfully
137
+
138
+ #### 4. **Import errors (smolagents, duckduckgo-search)**
139
+
140
+ **Fix**: `pip install 'smolagents[transformers]' duckduckgo-search>=3.8.0`
141
+
142
+ #### 5. **Model download too slow**
143
+
144
+ **Fix**: Use `python quick_ai_demo.py` for lighter analysis
145
 
146
  ### Files Structure:
147
 
 
153
 
154
  ### Quick Start:
155
 
156
+ **CORRECT WORKFLOW** (to avoid menu issues):
157
+
158
+ 1. **Run the script**: `python upload.py`
159
+ 2. **Choose enhanced mode**: Select option 2
160
+ 3. **Load data FIRST**: Select option 1 and wait for completion
161
+ 4. **Verify setup**: Look for "βœ… SmoLagent configured" message
162
+ 5. **Use AI analysis**: Select option 4 and ask your question
163
+
164
+ **Quick Test Commands**:
165
+
166
+ ```bash
167
+ python test_smolagent.py # Test if SmoLagent is working
168
+ python fixed_upload.py # Alternative with better error handling
169
+ python quick_ai_demo.py # Quick demo without heavy downloads
170
+ ```
171
 
172
  πŸš€ **Your original functionality is preserved - nothing is broken!**
173
+
174
+ ### Performance Notes:
175
+
176
+ - **Data Loading**: ~2-5 seconds for 5000 rows
177
+ - **AI Setup**: ~10-30 seconds first time (model download)
178
+ - **AI Analysis**: ~5-15 seconds per query
179
+ - **File Size**: Works well with CSV files up to 100MB (a chunked-loading sketch for larger files follows)
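
For exports larger than that, a minimal sketch of chunked loading with pandas (the `chunksize` argument is standard pandas; the `LogLevel` column name follows the sample OutSystems logs, and the path is a placeholder):

```python
import pandas as pd

csv_file_path = "outsystems_sample_logs_6months.csv"  # placeholder path

# Stream the file in 100k-row chunks and aggregate log-level counts
# without holding the whole CSV in memory.
level_counts = None
for chunk in pd.read_csv(csv_file_path, chunksize=100_000):
    counts = chunk["LogLevel"].value_counts()
    level_counts = counts if level_counts is None else level_counts.add(counts, fill_value=0)

print(level_counts.sort_values(ascending=False))
```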
__pycache__/upload.cpython-313.pyc CHANGED
Binary files a/__pycache__/upload.cpython-313.pyc and b/__pycache__/upload.cpython-313.pyc differ
 
fast_explorer.py ADDED
@@ -0,0 +1,187 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Fast Enhanced Data Explorer (skips slow AI setup)
4
+ """
5
+ import pandas as pd
6
+ import os
7
+ import numpy as np
8
+ import matplotlib.pyplot as plt
9
+ import seaborn as sns
10
+ import warnings
11
+ warnings.filterwarnings('ignore')
12
+
13
+ # CSV file path
14
+ csv_file_path = "C:/Users/Cosmo/Desktop/NTU Peak Singtel/outsystems_sample_logs_6months.csv"
15
+
16
+ class FastDataExplorer:
17
+ """Fast data explorer with optional AI capabilities"""
18
+
19
+ def __init__(self, csv_path=csv_file_path):
20
+ self.csv_path = csv_path
21
+ self.df = None
22
+ self.agent = None
23
+ print("πŸš€ Fast Data Explorer initialized!")
24
+ print("πŸ’‘ AI features can be setup later via option 9")
25
+
26
+ def setup_ai_on_demand(self):
27
+ """Setup AI only when requested"""
28
+ if self.agent is not None:
29
+ print("βœ… AI already configured!")
30
+ return True
31
+
32
+ print("πŸ€– Setting up AI on demand...")
33
+ try:
34
+ # Try Ollama first
35
+ import ollama
36
+ models = ollama.list()
37
+ if models and 'models' in models and len(models['models']) > 0:
38
+ print("βœ… Ollama detected - configuring...")
39
+
40
+ # Simple Ollama wrapper
41
+ class SimpleOllama:
42
+ def run(self, prompt):
43
+ try:
44
+ response = ollama.generate(model='llama2', prompt=prompt)
45
+ return response['response']
46
+ except Exception as e:
47
+ return f"Error: {e}"
48
+
49
+ self.agent = SimpleOllama()
50
+ print("βœ… AI configured with Ollama!")
51
+ return True
52
+
53
+ except Exception as e:
54
+ print(f"⚠️ AI setup failed: {e}")
55
+
56
+ # Fallback: No AI
57
+ print("❌ AI not available - using manual analysis only")
58
+ return False
59
+
60
+ def load_data(self):
61
+ """Load the CSV data"""
62
+ print(f"\nπŸ“ Loading data from: {self.csv_path}")
63
+
64
+ try:
65
+ if not os.path.exists(self.csv_path):
66
+ print(f"❌ Error: File not found at {self.csv_path}")
67
+ return None
68
+
69
+ self.df = pd.read_csv(self.csv_path)
70
+
71
+ print("=== DATA LOADED SUCCESSFULLY ===")
72
+ print(f"πŸ“ File: {os.path.basename(self.csv_path)}")
73
+ print(f"πŸ“Š Dataset shape: {self.df.shape}")
74
+ print(f"πŸ“‹ Columns: {list(self.df.columns)}")
75
+ print("\n=== FIRST 5 ROWS ===")
76
+ print(self.df.head())
77
+
78
+ return self.df
79
+
80
+ except Exception as e:
81
+ print(f"Error loading data: {str(e)}")
82
+ return None
83
+
84
+ def quick_analysis(self):
85
+ """Quick manual analysis"""
86
+ if self.df is None:
87
+ print("❌ No data loaded.")
88
+ return
89
+
90
+ print("\n=== QUICK ANALYSIS ===")
91
+ print(f"πŸ“Š Shape: {self.df.shape}")
92
+ print(f"πŸ“‹ Columns: {list(self.df.columns)}")
93
+
94
+ # Log level analysis
95
+ if 'LogLevel' in self.df.columns:
96
+ print("\nπŸ“ˆ Log Level Distribution:")
97
+ print(self.df['LogLevel'].value_counts())
98
+
99
+ # Error analysis
100
+ if 'ErrorId' in self.df.columns:
101
+ error_count = self.df['ErrorId'].notna().sum()
102
+ print(f"\n🚨 Errors found: {error_count} out of {len(self.df)} records")
103
+
104
+ # Time analysis
105
+ if 'Timestamp' in self.df.columns:
106
+ print(f"\nπŸ“… Time range: {self.df['Timestamp'].min()} to {self.df['Timestamp'].max()}")
107
+
108
+ def ai_analysis(self, query):
109
+ """AI analysis with on-demand setup"""
110
+ if self.df is None:
111
+ print("❌ No data loaded.")
112
+ return
113
+
114
+ if self.agent is None:
115
+ print("πŸ€– Setting up AI...")
116
+ if not self.setup_ai_on_demand():
117
+ return
118
+
119
+ print(f"\nπŸ” Analyzing: {query}")
120
+
121
+ # Prepare simple data summary
122
+ data_summary = f"""
123
+ Data Analysis Request:
124
+ Dataset has {self.df.shape[0]} rows and {self.df.shape[1]} columns.
125
+ Columns: {list(self.df.columns)}
126
+
127
+ Sample data:
128
+ {self.df.head(2).to_string()}
129
+
130
+ Question: {query}
131
+
132
+ Please provide insights about this OutSystems log data.
133
+ """
134
+
135
+ try:
136
+ response = self.agent.run(data_summary)
137
+ print("\n" + "="*50)
138
+ print("πŸ€– AI ANALYSIS RESULT")
139
+ print("="*50)
140
+ print(response)
141
+ print("="*50)
142
+ except Exception as e:
143
+ print(f"❌ AI analysis failed: {e}")
144
+
145
+ def interactive_menu(self):
146
+ """Interactive menu"""
147
+ while True:
148
+ print("\n" + "="*40)
149
+ print("πŸš€ FAST DATA EXPLORER")
150
+ print("="*40)
151
+ print("1. Load data")
152
+ print("2. Quick analysis")
153
+ print("3. Show data summary")
154
+ print("4. AI analysis (auto-setup)")
155
+ print("5. Setup AI manually")
156
+ print("6. Exit")
157
+ print("="*40)
158
+
159
+ choice = input("Choice (1-6): ").strip()
160
+
161
+ if choice == '1':
162
+ self.load_data()
163
+ elif choice == '2':
164
+ self.quick_analysis()
165
+ elif choice == '3':
166
+ if self.df is not None:
167
+ print(f"\nπŸ“Š Summary: {self.df.shape[0]} rows, {self.df.shape[1]} columns")
168
+ print(f"πŸ“‹ Columns: {list(self.df.columns)}")
169
+ else:
170
+ print("❌ No data loaded.")
171
+ elif choice == '4':
172
+ query = input("πŸ’¬ Your question: ").strip()
173
+ if query:
174
+ self.ai_analysis(query)
175
+ else:
176
+ print("❌ No question entered.")
177
+ elif choice == '5':
178
+ self.setup_ai_on_demand()
179
+ elif choice == '6':
180
+ print("πŸ‘‹ Goodbye!")
181
+ break
182
+ else:
183
+ print("❌ Invalid choice.")
184
+
185
+ if __name__ == "__main__":
186
+ explorer = FastDataExplorer()
187
+ explorer.interactive_menu()
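
A minimal non-interactive sketch of driving the class above from another script (assumes `fast_explorer.py` is on the import path; the CSV path and the question are placeholders, and `ai_analysis()` triggers `setup_ai_on_demand()` if needed):

```python
from fast_explorer import FastDataExplorer

explorer = FastDataExplorer("outsystems_sample_logs_6months.csv")  # placeholder path
if explorer.load_data() is not None:
    explorer.quick_analysis()
    explorer.ai_analysis("Which modules log the most errors?")     # placeholder question
```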
quick_test.py ADDED
@@ -0,0 +1,25 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Quick test of Ollama with a simple prompt
4
+ """
5
+ import ollama
6
+
7
+ def quick_test():
8
+ print("πŸ” Quick Ollama test...")
9
+
10
+ try:
11
+ # Very simple test
12
+ response = ollama.generate(
13
+ model='llama2',
14
+ prompt='Say "Hello" in one word only.'
15
+ )
16
+
17
+ print(f"βœ… Response: {response['response']}")
18
+ return True
19
+
20
+ except Exception as e:
21
+ print(f"❌ Failed: {e}")
22
+ return False
23
+
24
+ if __name__ == "__main__":
25
+ quick_test()
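
This test assumes a local `llama2` model has already been pulled. A minimal pre-flight check, using the same dict-style access to `ollama.list()` that the scripts in this commit use (if nothing is found, `ollama pull llama2` fetches the model):

```python
import ollama

# List locally available models and look for a llama2 variant
# (dict-style access, matching the other scripts in this commit).
models = ollama.list()
names = [m.get("name") or m.get("model") or "" for m in models.get("models", [])]

if any(name.startswith("llama2") for name in names):
    print("llama2 is available - quick_test.py should work")
else:
    print("No llama2 model found - run: ollama pull llama2")
```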
run_enhanced.py ADDED
@@ -0,0 +1,16 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Run Enhanced Data Explorer in interactive mode
4
+ """
5
+ from upload import EnhancedDataExplorer
6
+
7
+ if __name__ == "__main__":
8
+ print("πŸš€ Starting Enhanced Data Explorer with Ollama AI...")
9
+ explorer = EnhancedDataExplorer()
10
+
11
+ # Load data first
12
+ print("\nπŸ“ Loading data automatically...")
13
+ explorer.load_data()
14
+
15
+ # Start interactive menu
16
+ explorer.interactive_menu()
start_enhanced.py ADDED
@@ -0,0 +1,23 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Direct launcher for Enhanced Data Explorer
4
+ """
5
+ from upload import EnhancedDataExplorer
6
+
7
+ print("πŸš€ Starting Enhanced Data Explorer with AI...")
8
+ print("πŸ”„ Initializing...")
9
+
10
+ try:
11
+ explorer = EnhancedDataExplorer()
12
+
13
+ print("\nπŸ“‹ System Status:")
14
+ explorer.check_status()
15
+
16
+ print("\n🎯 Starting interactive menu...")
17
+ explorer.interactive_menu()
18
+
19
+ except KeyboardInterrupt:
20
+ print("\nπŸ‘‹ Goodbye!")
21
+ except Exception as e:
22
+ print(f"\n❌ Error: {e}")
23
+ print("πŸ’‘ Try running: python upload.py")
test_enhanced.py ADDED
@@ -0,0 +1,44 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Test the enhanced upload.py with Ollama integration
4
+ """
5
+ from upload import EnhancedDataExplorer
6
+
7
+ def test_enhanced_explorer():
8
+ print("πŸš€ Testing Enhanced Data Explorer with Ollama...")
9
+
10
+ try:
11
+ # Initialize the explorer
12
+ explorer = EnhancedDataExplorer()
13
+
14
+ # Check status
15
+ print("\nπŸ“‹ Checking system status:")
16
+ explorer.check_status()
17
+
18
+ # Load data
19
+ print("\nπŸ“ Loading data:")
20
+ data = explorer.load_data()
21
+
22
+ if data is not None:
23
+ print(f"βœ… Data loaded successfully: {data.shape}")
24
+
25
+ # Test AI analysis if agent is available
26
+ if explorer.agent is not None:
27
+ print("\nπŸ€– Testing AI analysis:")
28
+ response = explorer.ai_analysis("What are the main log levels in this data?")
29
+ if response:
30
+ print("βœ… AI analysis completed successfully!")
31
+ else:
32
+ print("⚠️ AI analysis returned no response")
33
+ else:
34
+ print("❌ No AI agent configured")
35
+ else:
36
+ print("❌ Failed to load data")
37
+
38
+ except Exception as e:
39
+ print(f"❌ Test failed: {e}")
40
+ import traceback
41
+ traceback.print_exc()
42
+
43
+ if __name__ == "__main__":
44
+ test_enhanced_explorer()
test_ollama.py ADDED
@@ -0,0 +1,37 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Simple test to verify Ollama integration
4
+ """
5
+ import ollama
6
+
7
+ def test_ollama():
8
+ print("πŸ” Testing Ollama integration...")
9
+
10
+ try:
11
+ # Test connection
12
+ models = ollama.list()
13
+ print(f"βœ… Ollama is accessible! Found {len(models['models'])} models:")
14
+ for model in models['models']:
15
+ model_name = model.get('name', model.get('model', 'Unknown'))
16
+ print(f" πŸ“¦ {model_name}")
17
+
18
+ # Test generation
19
+ print("\nπŸ€– Testing AI generation...")
20
+ response = ollama.generate(
21
+ model='llama2',
22
+ prompt='Hello! Can you analyze data? Please respond briefly.'
23
+ )
24
+
25
+ print("βœ… AI Response:")
26
+ print("-" * 40)
27
+ print(response['response'])
28
+ print("-" * 40)
29
+
30
+ return True
31
+
32
+ except Exception as e:
33
+ print(f"❌ Ollama test failed: {e}")
34
+ return False
35
+
36
+ if __name__ == "__main__":
37
+ test_ollama()
test_upload_fixes.py ADDED
@@ -0,0 +1,54 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Test script to verify upload.py fixes
4
+ """
5
+
6
+ def test_upload_fixes():
7
+ """Test that the upload.py fixes work correctly"""
8
+ print("πŸ§ͺ Testing upload.py fixes...")
9
+ print("="*50)
10
+
11
+ try:
12
+ # Test imports
13
+ import sys
14
+ import os
15
+ sys.path.append(os.path.dirname(os.path.abspath(__file__)))
16
+
17
+ from upload import EnhancedDataExplorer
18
+ print("βœ… Import successful")
19
+
20
+ # Test class initialization
21
+ explorer = EnhancedDataExplorer()
22
+ print("βœ… Class initialization successful")
23
+
24
+ # Test status check method
25
+ explorer.check_status()
26
+ print("βœ… Status check method works")
27
+
28
+ # Test data loading check
29
+ if explorer.df is None:
30
+ print("βœ… Data loading detection works (no data loaded yet)")
31
+ else:
32
+ print("βœ… Data loaded successfully")
33
+
34
+ # Test AI agent check
35
+ if explorer.agent is None:
36
+ print("⚠️ AI agent not configured (expected for testing)")
37
+ else:
38
+ print("βœ… AI agent configured successfully")
39
+
40
+ print("\nπŸŽ‰ All fixes appear to be working!")
41
+ print("πŸ’‘ The main issues have been resolved:")
42
+ print(" βœ… Data loading check before AI analysis")
43
+ print(" βœ… Better error messages and user guidance")
44
+ print(" βœ… Pause after AI analysis results")
45
+ print(" βœ… Status checking functionality")
46
+ print(" βœ… Improved model setup with fallbacks")
47
+
48
+ except Exception as e:
49
+ print(f"❌ Test failed: {e}")
50
+ import traceback
51
+ traceback.print_exc()
52
+
53
+ if __name__ == "__main__":
54
+ test_upload_fixes()
upload.py CHANGED
@@ -10,6 +10,16 @@ warnings.filterwarnings('ignore')
10
  # Replace 'your_file.csv' with your CSV file path
11
  csv_file_path = "C:/Users/Cosmo/Desktop/NTU Peak Singtel/outsystems_sample_logs_6months.csv"
12
 
13
  class EnhancedDataExplorer:
14
  """Enhanced data explorer with SmoLagent AI capabilities"""
15
 
@@ -17,50 +27,150 @@ class EnhancedDataExplorer:
17
  self.csv_path = csv_path
18
  self.df = None
19
  self.agent = None
20
- self.setup_agent()
21
 
22
  def setup_agent(self):
23
  """Setup SmoLagent AI agent with simple configuration"""
24
  try:
25
- print("πŸ€– Setting up SmoLagent with basic tools...")
26
-
27
- # Use the exact setup specified by user
28
  try:
29
- # Try with Ollama model first
30
- from smolagents import OllamaModel
31
- model = OllamaModel(model_id="llama2", base_url="http://localhost:11434")
32
  self.agent = CodeAgent(
33
  tools=[DuckDuckGoSearchTool()],
34
  model=model
35
  )
36
- print("βœ… SmoLagent configured successfully with Ollama and search capabilities")
 
37
  return
38
  except Exception as e:
39
  print(f"⚠️ Ollama setup failed: {e}")
 
40
 
41
- # Fallback to Transformers model
42
  try:
43
  from smolagents import TransformersModel
44
- model = TransformersModel(model_id="microsoft/DialoGPT-medium")
45
  self.agent = CodeAgent(
46
  tools=[DuckDuckGoSearchTool()],
47
  model=model
48
  )
49
- print("βœ… SmoLagent configured successfully with Transformers model")
 
50
  return
51
  except Exception as e:
52
- print(f"⚠️ Transformers setup failed: {e}")
53
- print(" Make sure all required packages are installed")
54
 
55
- if self.agent is None:
56
- print("\n❌ No AI agent could be configured.")
57
- print("πŸ“‹ To fix this:")
58
- print(" 1. Check internet connection")
59
- print(" 2. Install missing packages from requirements.txt")
60
- print("\nβœ… You can still use all non-AI features!")
61
 
62
  except Exception as e:
63
  print(f"⚠️ Agent setup failed: {e}")
 
64
  self.agent = None
65
 
66
  def configure_model_helper(self):
@@ -104,18 +214,22 @@ class EnhancedDataExplorer:
104
 
105
  def load_data(self):
106
  """Load the CSV data (keeping your original functionality)"""
107
  try:
108
  # Check if file exists
109
  if not os.path.exists(self.csv_path):
110
- print(f"Error: File not found at {self.csv_path}")
 
111
  return None
112
 
113
  # Read the CSV file into a DataFrame
114
  self.df = pd.read_csv(self.csv_path)
115
 
116
  print("=== DATA LOADED SUCCESSFULLY ===")
117
- print(f"Dataset shape: {self.df.shape}")
118
- print(f"Columns: {list(self.df.columns)}")
 
119
  print("\n=== FIRST 5 ROWS ===")
120
  print(self.df.head())
121
 
@@ -219,48 +333,110 @@ class EnhancedDataExplorer:
219
 
220
  def ai_analysis(self, query):
221
  """Use SmoLagent for AI-powered analysis"""
222
  if self.agent is None:
223
  print("❌ AI agent not configured. Please set up SmoLagent first.")
224
  return
225
 
226
  if self.df is None:
227
- print("❌ No data loaded. Run load_data() first.")
 
228
  return
229
 
230
- # Prepare context about the dataset
231
- data_context = f"""
232
- Dataset Analysis Request:
233
- - Dataset Shape: {self.df.shape}
234
- - Columns: {list(self.df.columns)}
235
- - Data Types: {dict(self.df.dtypes)}
236
- - Missing Values: {dict(self.df.isnull().sum())}
237
-
238
- Sample Data:
239
- {self.df.head(3).to_string()}
240
-
241
- Statistical Summary:
242
- {self.df.describe().to_string()}
243
-
244
- User Question: {query}
245
- """
246
 
 
247
  try:
248
- print(f"\n=== AI ANALYSIS FOR: '{query}' ===")
249
- print("πŸ€– Processing with SmoLagent...")
250
 
251
  # Use the agent with the data context and query
252
  response = self.agent.run(data_context)
253
- print("βœ… AI Analysis Complete:")
254
  print(response)
 
255
  return response
256
 
257
  except Exception as e:
258
- print(f"❌ AI analysis failed: {e}")
259
- print("πŸ’‘ Try using the data visualization and quality analysis features instead!")
260
  return None
261
 
262
  def interactive_menu(self):
263
  """Interactive menu for data exploration"""
264
  while True:
265
  print("\n" + "="*50)
266
  print("πŸ€– ENHANCED DATA EXPLORER WITH AI")
@@ -270,10 +446,13 @@ class EnhancedDataExplorer:
270
  print("3. Analyze data quality")
271
  print("4. AI-powered analysis")
272
  print("5. Show data summary")
273
- print("6. Exit")
274
  print("="*50)
 
275
 
276
- choice = input("Enter your choice (1-6): ").strip()
277
 
278
  if choice == '1':
279
  self.load_data()
@@ -282,23 +461,38 @@ class EnhancedDataExplorer:
282
  elif choice == '3':
283
  self.analyze_data_quality()
284
  elif choice == '4':
285
- if self.agent is None:
286
- print("\n❌ AI features not available. Please configure a model first.")
287
- print("Edit the setup_agent() method to add your API keys.")
288
- self.configure_model_helper()
289
  else:
290
- print("\nπŸ€– AI Analysis - Ask me anything about your data!")
291
- print("Example queries:")
292
- print(" β€’ 'What are the main trends in this data?'")
293
- print(" β€’ 'Find any outliers or anomalies'")
294
- print(" β€’ 'Suggest data quality improvements'")
295
- print(" β€’ 'Perform correlation analysis'")
296
- print(" β€’ 'Identify seasonal patterns'")
297
- print(" β€’ 'Recommend preprocessing steps'")
298
 
299
- query = input("\nπŸ’¬ Your question: ").strip()
300
- if query:
301
- self.ai_analysis(query)
302
  elif choice == '5':
303
  if self.df is not None:
304
  print(f"\nπŸ“Š Dataset Summary:")
@@ -308,6 +502,10 @@ class EnhancedDataExplorer:
308
  else:
309
  print("❌ No data loaded.")
310
  elif choice == '6':
311
  print("πŸ‘‹ Goodbye!")
312
  break
313
  else:
@@ -315,18 +513,22 @@ class EnhancedDataExplorer:
315
 
316
  def load_and_explore_data():
317
  """Load and explore the CSV data (keeping your original function)"""
318
  try:
319
  # Check if file exists
320
  if not os.path.exists(csv_file_path):
321
- print(f"Error: File not found at {csv_file_path}")
 
322
  return None
323
 
324
  # Read the CSV file into a DataFrame
325
  df = pd.read_csv(csv_file_path)
326
 
327
  print("=== DATA LOADED SUCCESSFULLY ===")
328
- print(f"Dataset shape: {df.shape}")
329
- print(f"Columns: {list(df.columns)}")
 
330
  print("\n=== FIRST 5 ROWS ===")
331
  print(df.head())
332
 
 
10
  # Replace 'your_file.csv' with your CSV file path
11
  csv_file_path = "C:/Users/Cosmo/Desktop/NTU Peak Singtel/outsystems_sample_logs_6months.csv"
12
 
13
+ def set_csv_file_path(new_path):
14
+ """Update the CSV file path"""
15
+ global csv_file_path
16
+ csv_file_path = new_path
17
+ print(f"βœ… CSV file path updated to: {csv_file_path}")
18
+
19
+ def get_csv_file_path():
20
+ """Get the current CSV file path"""
21
+ return csv_file_path
22
+
23
  class EnhancedDataExplorer:
24
  """Enhanced data explorer with SmoLagent AI capabilities"""
25
 
 
27
  self.csv_path = csv_path
28
  self.df = None
29
  self.agent = None
30
+ print("πŸš€ Enhanced Data Explorer initialized!")
31
+ print("πŸ’‘ AI setup will be done when first needed (option 4)")
32
+ # Don't call setup_agent() here to avoid hanging
33
 
34
  def setup_agent(self):
35
  """Setup SmoLagent AI agent with simple configuration"""
36
+ print("πŸ€– Setting up SmoLagent AI agent...")
37
+ print("πŸ”„ Trying multiple model configurations...")
38
+
39
  try:
40
+ # Try with Ollama using direct ollama package (fast and local)
41
  try:
42
+ print("πŸ”„ Attempting Ollama setup...")
43
+ import ollama
44
+ # Quick test if Ollama is available (without generation test)
45
+ models = ollama.list()
46
+ if models and 'models' in models and len(models['models']) > 0:
47
+ print("βœ… Ollama is running and accessible!")
48
+ print(f"πŸ“¦ Found model: {models['models'][0].get('name', 'llama2')}")
49
+ else:
50
+ raise Exception("No models found")
51
+
52
+ # Create a custom model class for Ollama compatible with smolagents
53
+ class OllamaModel:
54
+ def __init__(self, model_name="llama2"):
55
+ self.model_name = model_name
56
+ import ollama
57
+ self.ollama = ollama
58
+
59
+ def __call__(self, messages, **kwargs):
60
+ try:
61
+ # Convert messages to Ollama format
62
+ if isinstance(messages, str):
63
+ prompt = messages
64
+ elif isinstance(messages, list):
65
+ # Handle different message formats
66
+ if len(messages) > 0 and isinstance(messages[0], dict):
67
+ # Extract content from message dictionaries
68
+ prompt = "\n".join([
69
+ msg.get('content', str(msg)) if isinstance(msg, dict) else str(msg)
70
+ for msg in messages
71
+ ])
72
+ else:
73
+ prompt = "\n".join([str(msg) for msg in messages])
74
+ else:
75
+ prompt = str(messages)
76
+
77
+ # Add timeout to prevent hanging
78
+ import signal
79
+ import time
80
+
81
+ def timeout_handler(signum, frame):
82
+ raise TimeoutError("Ollama response timeout")
83
+
84
+ # Set a 30-second timeout for Windows (using threading instead)
85
+ import threading
86
+ result = {'response': None, 'error': None}
87
+
88
+ def generate_with_timeout():
89
+ try:
90
+ response = self.ollama.generate(model=self.model_name, prompt=prompt)
91
+ result['response'] = response['response']
92
+ except Exception as e:
93
+ result['error'] = str(e)
94
+
95
+ thread = threading.Thread(target=generate_with_timeout)
96
+ thread.daemon = True
97
+ thread.start()
98
+ thread.join(timeout=30) # 30 second timeout
99
+
100
+ if thread.is_alive():
101
+ return "Error: Ollama response timed out after 30 seconds. Try a simpler query."
102
+ elif result['error']:
103
+ return f"Error generating response with Ollama: {result['error']}"
104
+ elif result['response']:
105
+ return result['response']
106
+ else:
107
+ return "Error: No response received from Ollama"
108
+
109
+ except Exception as e:
110
+ return f"Error generating response with Ollama: {e}"
111
+
112
+ def generate(self, messages, **kwargs):
113
+ """Alternative method name that might be expected"""
114
+ return self.__call__(messages, **kwargs)
115
+
116
+ model = OllamaModel("llama2")
117
  self.agent = CodeAgent(
118
  tools=[DuckDuckGoSearchTool()],
119
  model=model
120
  )
121
+ print("βœ… SmoLagent configured successfully with Ollama!")
122
+ print("πŸ’‘ Local AI model ready for analysis (with 30s timeout)")
123
  return
124
  except Exception as e:
125
  print(f"⚠️ Ollama setup failed: {e}")
126
+ print("πŸ’‘ Make sure Ollama is running: ollama serve")
127
 
128
+ # Try OpenAI if API key is available
129
  try:
130
+ print("πŸ”„ Checking for OpenAI API key...")
131
+ import os
132
+ from smolagents import OpenAIModel
133
+ if os.getenv('OPENAI_API_KEY'):
134
+ model = OpenAIModel(model_id="gpt-3.5-turbo")
135
+ self.agent = CodeAgent(
136
+ tools=[DuckDuckGoSearchTool()],
137
+ model=model
138
+ )
139
+ print("βœ… SmoLagent configured successfully with OpenAI!")
140
+ return
141
+ else:
142
+ print("⚠️ OpenAI API key not found")
143
+ except Exception as e:
144
+ print(f"⚠️ OpenAI setup failed: {e}")
145
+
146
+ # Fallback to Transformers model (smaller version)
147
+ try:
148
+ print("πŸ”„ Attempting HuggingFace Transformers model...")
149
  from smolagents import TransformersModel
150
+ model = TransformersModel(model_id="microsoft/DialoGPT-small") # Smaller model
151
  self.agent = CodeAgent(
152
  tools=[DuckDuckGoSearchTool()],
153
  model=model
154
  )
155
+ print("βœ… SmoLagent configured successfully with HuggingFace model!")
156
+ print("πŸ’‘ Note: First use may take time to download model")
157
  return
158
  except Exception as e:
159
+ print(f"⚠️ HuggingFace setup failed: {e}")
160
+ print(" Make sure transformers are installed: pip install 'smolagents[transformers]'")
161
 
162
+ # If all models fail
163
+ print("\n❌ No AI model could be configured.")
164
+ print("πŸ“‹ To fix this:")
165
+ print(" 1. For local AI: Install Ollama and run 'ollama serve'")
166
+ print(" 2. For OpenAI: Set OPENAI_API_KEY environment variable")
167
+ print(" 3. For basic use: pip install 'smolagents[transformers]'")
168
+ print("\nβœ… You can still use all non-AI features!")
169
+ self.agent = None
170
 
171
  except Exception as e:
172
  print(f"⚠️ Agent setup failed: {e}")
173
+ print("πŸ’‘ Try using: python fixed_upload.py")
174
  self.agent = None
175
 
176
  def configure_model_helper(self):
 
214
 
215
  def load_data(self):
216
  """Load the CSV data (keeping your original functionality)"""
217
+ print(f"\nπŸ“ Loading data from: {self.csv_path}")
218
+
219
  try:
220
  # Check if file exists
221
  if not os.path.exists(self.csv_path):
222
+ print(f"❌ Error: File not found at {self.csv_path}")
223
+ print("πŸ’‘ Use option 7 to change the file path")
224
  return None
225
 
226
  # Read the CSV file into a DataFrame
227
  self.df = pd.read_csv(self.csv_path)
228
 
229
  print("=== DATA LOADED SUCCESSFULLY ===")
230
+ print(f"πŸ“ File: {os.path.basename(self.csv_path)}")
231
+ print(f"πŸ“Š Dataset shape: {self.df.shape}")
232
+ print(f"πŸ“‹ Columns: {list(self.df.columns)}")
233
  print("\n=== FIRST 5 ROWS ===")
234
  print(self.df.head())
235
 
 
333
 
334
  def ai_analysis(self, query):
335
  """Use SmoLagent for AI-powered analysis"""
336
+ print(f"\nπŸ” Checking prerequisites for AI analysis...")
337
+
338
  if self.agent is None:
339
  print("❌ AI agent not configured. Please set up SmoLagent first.")
340
+ print("πŸ’‘ Try running one of these alternatives:")
341
+ print(" β€’ python fixed_upload.py")
342
+ print(" β€’ python quick_ai_demo.py")
343
  return
344
 
345
  if self.df is None:
346
+ print("❌ No data loaded. Please load data first!")
347
+ print("πŸ’‘ Choose option 1 in the main menu to load your data.")
348
  return
349
 
350
+ print("βœ… Data loaded successfully")
351
+ print("βœ… AI agent configured")
352
+ print(f"βœ… Processing query: '{query}'")
353
 
354
+ # Prepare context about the dataset
355
  try:
356
+ data_context = f"""
357
+ Dataset Analysis Request:
358
+ - Dataset Shape: {self.df.shape}
359
+ - Columns: {list(self.df.columns)}
360
+ - Data Types: {dict(self.df.dtypes)}
361
+ - Missing Values: {dict(self.df.isnull().sum())}
362
+
363
+ Sample Data:
364
+ {self.df.head(3).to_string()}
365
+
366
+ Statistical Summary:
367
+ {self.df.describe().to_string()}
368
+
369
+ User Question: {query}
370
+ """
371
+
372
+ print(f"\nπŸ€– SmoLagent is analyzing your data...")
373
+ print("⏳ This may take 5-15 seconds...")
374
 
375
  # Use the agent with the data context and query
376
  response = self.agent.run(data_context)
377
+
378
+ print("\n" + "="*60)
379
+ print("βœ… AI ANALYSIS COMPLETE")
380
+ print("="*60)
381
  print(response)
382
+ print("="*60)
383
  return response
384
 
385
  except Exception as e:
386
+ print(f"\n❌ AI analysis failed: {e}")
387
+ print("\nπŸ’‘ Troubleshooting suggestions:")
388
+ print(" β€’ Check your internet connection")
389
+ print(" β€’ Try: python fixed_upload.py")
390
+ print(" β€’ Use basic analysis features (options 2-3)")
391
  return None
392
 
393
+ def check_status(self):
394
+ """Check the status of data and AI setup"""
395
+ print("\nπŸ” SYSTEM STATUS CHECK")
396
+ print("="*40)
397
+
398
+ # Check file path
399
+ print(f"πŸ“ CSV File: {self.csv_path}")
400
+ if os.path.exists(self.csv_path):
401
+ print(f"βœ… File exists: {os.path.basename(self.csv_path)}")
402
+ else:
403
+ print(f"❌ File not found")
404
+
405
+ # Check data status
406
+ if self.df is not None:
407
+ print(f"βœ… Data loaded: {self.df.shape[0]} rows, {self.df.shape[1]} columns")
408
+ print(f"πŸ“‹ Columns: {list(self.df.columns)}")
409
+ else:
410
+ print("❌ No data loaded")
411
+
412
+ # Check AI agent status
413
+ if self.agent is not None:
414
+ print("βœ… AI agent configured and ready")
415
+ else:
416
+ print("❌ AI agent not configured")
417
+
418
+ print("="*40)
419
+
420
+ def change_csv_file(self, new_path=None):
421
+ """Change the CSV file path"""
422
+ if new_path is None:
423
+ print(f"\nπŸ“ Current file path: {self.csv_path}")
424
+ new_path = input("Enter new CSV file path: ").strip()
425
+
426
+ if os.path.exists(new_path):
427
+ self.csv_path = new_path
428
+ self.df = None # Clear current data
429
+ print(f"βœ… CSV file path updated to: {self.csv_path}")
430
+ print("πŸ’‘ Data cleared. Use option 1 to load the new file.")
431
+ else:
432
+ print(f"❌ File not found: {new_path}")
433
+ print("πŸ’‘ Please check the file path and try again.")
434
+
435
  def interactive_menu(self):
436
  """Interactive menu for data exploration"""
437
+ # Show initial status
438
+ self.check_status()
439
+
440
  while True:
441
  print("\n" + "="*50)
442
  print("πŸ€– ENHANCED DATA EXPLORER WITH AI")
 
446
  print("3. Analyze data quality")
447
  print("4. AI-powered analysis")
448
  print("5. Show data summary")
449
+ print("6. Check system status")
450
+ print("7. Change CSV file path")
451
+ print("8. Exit")
452
  print("="*50)
453
+ print(f"πŸ“ Current file: {os.path.basename(self.csv_path)}")
454
 
455
+ choice = input("Enter your choice (1-8): ").strip()
456
 
457
  if choice == '1':
458
  self.load_data()
 
461
  elif choice == '3':
462
  self.analyze_data_quality()
463
  elif choice == '4':
464
+ if self.df is None:
465
+ print("\n❌ No data loaded. Please load data first!")
466
+ print("πŸ’‘ Choose option 1 to load your data before using AI analysis.")
467
+ input("\nPress Enter to continue...")
468
  else:
469
+ # Setup AI on demand if not already done
470
+ if self.agent is None:
471
+ print("\nπŸ€– Setting up AI for first use...")
472
+ self.setup_agent()
473
 
474
+ if self.agent is None:
475
+ print("\n❌ AI features not available. Please configure a model first.")
476
+ print("Edit the setup_agent() method to add your API keys.")
477
+ self.configure_model_helper()
478
+ else:
479
+ print("\nπŸ€– AI Analysis - Ask me anything about your data!")
480
+ print("Example queries:")
481
+ print(" β€’ 'What are the main trends in this data?'")
482
+ print(" β€’ 'Find any outliers or anomalies'")
483
+ print(" β€’ 'Suggest data quality improvements'")
484
+ print(" β€’ 'Perform correlation analysis'")
485
+ print(" β€’ 'Identify seasonal patterns'")
486
+ print(" β€’ 'Recommend preprocessing steps'")
487
+
488
+ query = input("\nπŸ’¬ Your question: ").strip()
489
+ if query:
490
+ self.ai_analysis(query)
491
+ # Wait for user to read the results before returning to menu
492
+ input("\nπŸ“‹ Press Enter to return to main menu...")
493
+ else:
494
+ print("❌ No question entered.")
495
+ input("\nPress Enter to continue...")
496
  elif choice == '5':
497
  if self.df is not None:
498
  print(f"\nπŸ“Š Dataset Summary:")
 
502
  else:
503
  print("❌ No data loaded.")
504
  elif choice == '6':
505
+ self.check_status()
506
+ elif choice == '7':
507
+ self.change_csv_file()
508
+ elif choice == '8':
509
  print("πŸ‘‹ Goodbye!")
510
  break
511
  else:
 
513
 
514
  def load_and_explore_data():
515
  """Load and explore the CSV data (keeping your original function)"""
516
+ print(f"\nπŸ“ Loading data from: {csv_file_path}")
517
+
518
  try:
519
  # Check if file exists
520
  if not os.path.exists(csv_file_path):
521
+ print(f"❌ Error: File not found at {csv_file_path}")
522
+ print("πŸ’‘ Update the csv_file_path variable at the top of this file")
523
  return None
524
 
525
  # Read the CSV file into a DataFrame
526
  df = pd.read_csv(csv_file_path)
527
 
528
  print("=== DATA LOADED SUCCESSFULLY ===")
529
+ print(f"πŸ“ File: {os.path.basename(csv_file_path)}")
530
+ print(f"πŸ“Š Dataset shape: {df.shape}")
531
+ print(f"πŸ“‹ Columns: {list(df.columns)}")
532
  print("\n=== FIRST 5 ROWS ===")
533
  print(df.head())
534
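
The new `check_status()` and `change_csv_file()` helpers added above can also be driven from a script; a minimal sketch (the alternate CSV path is a placeholder, and `change_csv_file()` clears any loaded data when the new file exists):

```python
from upload import EnhancedDataExplorer

explorer = EnhancedDataExplorer()
explorer.check_status()                              # prints file, data, and agent status
explorer.change_csv_file("logs/other_export.csv")    # placeholder path
explorer.load_data()                                 # reload from the new location
```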