Ajey95 commited on
Commit
4b88321
·
1 Parent(s): 6b7860b

commit final files

Browse files
Files changed (7) hide show
  1. Projectstructure.md +127 -0
  2. README.md +257 -8
  3. deploy.sh +69 -0
  4. enhanced_agents.py +1183 -0
  5. modal_app.py +218 -0
  6. requirements.txt +18 -0
  7. research_copilot.py +911 -0
Projectstructure.md ADDED
@@ -0,0 +1,127 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # 🤖 ResearchCopilot - Complete Project Structure
2
+
3
+ ## 📁 File Organization
4
+
5
+ ```
6
+ research-copilot/
7
+ ├── 📄 research_copilot.py # Main Gradio application with complete UI
8
+ ├── ⚙️ modal_app.py # Modal deployment configuration
9
+ ├── 🔧 enhanced_agents.py # Production agents with real API integrations
10
+ ├── 📋 requirements.txt # All Python dependencies
11
+ ├── 🔐 .env.example # Environment variables template
12
+ ├── 🚀 deploy.sh # Automated deployment script
13
+ ├── 📖 README.md # Comprehensive documentation
14
+ └── 📝 Project_Structure.md # This file
15
+ ```
16
+
17
+ ## 🎯 Key Components
18
+
19
+ ### 1. Core Application (`research_copilot.py`)
20
+ - **Multi-Agent System**: 4 specialized agents working together
21
+ - **Gradio Interface**: Beautiful, responsive UI with real-time updates
22
+ - **Agent Orchestration**: Sophisticated workflow management
23
+ - **Progress Tracking**: Live updates during research process
24
+ - **Results Display**: Tabbed interface for different output types
25
+
26
+ ### 2. Modal Deployment (`modal_app.py`)
27
+ - **Serverless Architecture**: Scalable cloud deployment
28
+ - **API Integrations**: Real Perplexity, Google, and Claude APIs
29
+ - **Secret Management**: Secure API key handling
30
+ - **Environment Setup**: Automated dependency management
31
+
32
+ ### 3. Enhanced Agents (`enhanced_agents.py`)
33
+ - **Production-Ready**: Real API integrations with fallbacks
34
+ - **Error Handling**: Comprehensive error management
35
+ - **Mock Data**: Realistic demo data when APIs unavailable
36
+ - **Multiple Formats**: Citations in APA, MLA, Chicago, IEEE, Harvard
37
+
38
+ ### 4. Deployment Tools
39
+ - **Automated Setup**: One-command deployment script
40
+ - **Environment Management**: Easy API key configuration
41
+ - **Monitoring**: Built-in logging and status tracking
42
+
43
+ ## 🚀 Quick Start Guide
44
+
45
+ ### Option 1: Local Development
46
+ ```bash
47
+ git clone <your-repo>
48
+ cd research-copilot
49
+ pip install -r requirements.txt
50
+ cp .env.example .env
51
+ # Edit .env with your API keys
52
+ python research_copilot.py
53
+ ```
54
+
55
+ ### Option 2: Modal Deployment
56
+ ```bash
57
+ chmod +x deploy.sh
58
+ ./deploy.sh
59
+ # Follow the prompts for API keys
60
+ ```
61
+
62
+ ## 🏆 Hackathon Submission Checklist
63
+
64
+ - ✅ **Complete Multi-Agent System**: 4 specialized agents with real collaboration
65
+ - ✅ **Production-Ready Code**: Real API integrations and error handling
66
+ - ✅ **Beautiful UI**: Professional Gradio interface with progress tracking
67
+ - ✅ **Scalable Deployment**: Modal serverless architecture
68
+ - ✅ **Comprehensive Documentation**: Detailed README and setup guides
69
+ - ✅ **Demo-Ready**: Works with and without API keys (mock data)
70
+ - ✅ **Track 3 Focus**: Perfect showcase of agentic AI capabilities
71
+
72
+ ## 🎬 Demo Script
73
+
74
+ ### 1. Introduction (30 seconds)
75
+ "Welcome to ResearchCopilot - a multi-agent AI system that demonstrates the power of collaborative AI agents working together to conduct comprehensive research."
76
+
77
+ ### 2. Agent Overview (45 seconds)
78
+ "Our system features four specialized agents: the Planner breaks down queries, the Retriever searches multiple sources, the Summarizer analyzes information, and the Citation agent ensures academic rigor."
79
+
80
+ ### 3. Live Demonstration (90 seconds)
81
+ - Enter query: "Latest developments in quantum computing for drug discovery"
82
+ - Show real-time agent activity
83
+ - Highlight agent collaboration and decision-making
84
+ - Display comprehensive results across all tabs
85
+
86
+ ### 4. Technical Highlights (30 seconds)
87
+ "Built with Gradio for the interface, deployed on Modal for scalability, and featuring real API integrations with Perplexity, Google, and Claude."
88
+
89
+ ### 5. Conclusion (15 seconds)
90
+ "ResearchCopilot represents the future of AI-powered research through intelligent agent collaboration."
91
+
92
+ ## 📊 Performance Metrics
93
+
94
+ ### System Capabilities
95
+ - **Query Processing**: Natural language understanding
96
+ - **Source Diversity**: Academic, news, web, and expert sources
97
+ - **Citation Quality**: 5 academic formats (APA, MLA, Chicago, IEEE, Harvard)
98
+ - **Real-time Updates**: Live progress tracking and agent communication
99
+ - **Scalability**: Handles concurrent users via Modal deployment
100
+
101
+ ### Technical Specifications
102
+ - **Response Time**: 30-60 seconds for comprehensive research
103
+ - **Source Coverage**: 10-20 sources per query
104
+ - **Agent Coordination**: Asynchronous task execution
105
+ - **Error Resilience**: Graceful fallbacks and mock data
106
+ - **API Integration**: 3+ real-time data sources
107
+
108
+ ## 🌟 Unique Selling Points
109
+
110
+ 1. **True Multi-Agent Collaboration**: Agents actually communicate and build on each other's work
111
+ 2. **Adaptive Planning**: Research strategy adjusts based on query complexity
112
+ 3. **Production-Grade**: Real API integrations with comprehensive error handling
113
+ 4. **Academic Quality**: Professional citation generation in multiple formats
114
+ 5. **Scalable Architecture**: Ready for real-world deployment
115
+ 6. **Beautiful UX**: Intuitive interface that showcases agent intelligence
116
+
117
+ ## 🎯 Next Steps for Submission
118
+
119
+ 1. **Create Demo Video**: Record 2-3 minute demonstration
120
+ 2. **Deploy to Modal**: Use the provided deployment script
121
+ 3. **Update README**: Add live demo URL and video link
122
+ 4. **Submit to Organization**: Push to Agents-MCP-Hackathon Space
123
+ 5. **Share on Social**: Use #GradioMCPHackathon hashtag
124
+
125
+ ---
126
+
127
+ **This project represents a complete, production-ready multi-agent research system perfect for Track 3: Agentic Demo Showcase. It demonstrates sophisticated AI agent collaboration while solving real research problems.**
README.md CHANGED
@@ -1,14 +1,263 @@
 
1
  ---
2
- title: ResearchCopilot
3
- emoji: 👀
4
  colorFrom: indigo
5
- colorTo: red
6
  sdk: gradio
7
- sdk_version: 5.33.1
8
- app_file: app.py
9
- pinned: false
10
  license: mit
11
- short_description: 'Multi-agent AI research system '
 
 
 
 
 
 
 
 
 
 
12
  ---
13
 
14
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ yaml
2
  ---
3
+ title: 🤖 ResearchCopilot
4
+ emoji: 🔬
5
  colorFrom: indigo
6
+ colorTo: purple
7
  sdk: gradio
8
+ sdk_version: 4.44.0
9
+ app_file: research_copilot.py
10
+ pinned: true
11
  license: mit
12
+ short_description: Multi-agent AI research system with real-time search & analysis 🚀
13
+ tags:
14
+ - agentic-demo-track
15
+ - multi-agent
16
+ - research
17
+ - perplexity
18
+ - claude
19
+ - openai
20
+ video_overview: https://www.youtube.com/watch?v=YOUR_VIDEO_ID_HERE
21
+ collection: >-
22
+ https://huggingface.co/collections/Agents-MCP-Hackathon
23
  ---
24
 
25
+ # 🤖 ResearchCopilot - Multi-Agent Research System
26
+
27
+ **Track 3: Agentic Demo Showcase - Gradio MCP Hackathon 2025**
28
+
29
+ A sophisticated multi-agent AI system that demonstrates the power of collaborative AI agents working together to conduct comprehensive research. ResearchCopilot breaks down complex research queries into structured tasks and employs specialized agents to gather, analyze, and synthesize information from multiple sources.
30
+
31
+ ## 🎯 Demo Video
32
+ [Link to video demonstration will be added here]
33
+
34
+ ## 🚀 Features
35
+
36
+ ### Multi-Agent Architecture
37
+ - **🎯 Planner Agent**: Intelligently breaks down research queries into structured, prioritized tasks
38
+ - **🔍 Retriever Agent**: Searches multiple sources (Perplexity API, Google Search, Academic databases)
39
+ - **📝 Summarizer Agent**: Analyzes and synthesizes information using Claude/GPT models
40
+ - **📚 Citation Agent**: Generates proper academic citations in multiple formats (APA, MLA, Chicago, IEEE, Harvard)
41
+
42
+ ### Key Capabilities
43
+ - Real-time collaborative agent orchestration
44
+ - Adaptive research planning based on query complexity
45
+ - Cross-agent learning and decision making
46
+ - Parallel task execution for efficient research
47
+ - Professional citation generation
48
+ - Comprehensive research documentation
49
+
50
+ ### Technical Highlights
51
+ - Built with Gradio for intuitive user experience
52
+ - Deployed on Modal for scalable serverless execution
53
+ - Asynchronous agent communication
54
+ - Real API integrations (Perplexity, Google, Anthropic)
55
+ - Comprehensive error handling and fallbacks
56
+
57
+ ## 🏗️ System Architecture
58
+
59
+ ```
60
+ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
61
+ │ User Query │───▶│ Orchestrator │───▶│ Results UI │
62
+ └─────────────────┘ └─────────────────┘ └─────────────────┘
63
+
64
+ ┌──────────────┼──────────────┐
65
+ │ │ │
66
+ ┌───────▼───────┐ ┌────▼──────┐ ┌────▼──────┐
67
+ │ Planner Agent │ │ Retriever │ │Summarizer │
68
+ └───────────────┘ │ Agent │ │ Agent │
69
+ └───────────┘ └───────────┘
70
+ │ │
71
+ ┌──────▼──────┐ ┌───▼────────┐
72
+ │ APIs │ │ Citation │
73
+ │ Perplexity │ │ Agent │
74
+ │ Google │ └────────────┘
75
+ │ Academic │
76
+ └─────────────┘
77
+ ```
78
+
79
+ ## 🛠️ Installation & Setup
80
+
81
+ ### Local Development
82
+
83
+ 1. **Clone and Install Dependencies**
84
+ ```bash
85
+ git clone <repository-url>
86
+ cd research-copilot
87
+ pip install -r requirements.txt
88
+ ```
89
+
90
+ 2. **Environment Configuration**
91
+ ```bash
92
+ cp .env.example .env
93
+ # Edit .env with your API keys
94
+ ```
95
+
96
+ 3. **Run Locally**
97
+ ```bash
98
+ python research_copilot.py
99
+ ```
100
+
101
+ ### Modal Deployment
102
+
103
+ 1. **Install Modal**
104
+ ```bash
105
+ pip install modal
106
+ modal setup
107
+ ```
108
+
109
+ 2. **Configure Secrets**
110
+ ```bash
111
+ modal secret create research-copilot-secrets \
112
+ PERPLEXITY_API_KEY=your_key \
113
+ GOOGLE_API_KEY=your_key \
114
+ GOOGLE_SEARCH_ENGINE_ID=your_id \
115
+ ANTHROPIC_API_KEY=your_key
116
+ ```
117
+
118
+ 3. **Deploy to Modal**
119
+ ```bash
120
+ modal deploy modal_app.py
121
+ ```
122
+
123
+ ## 🔧 API Keys Required
124
+
125
+ ### Required for Full Functionality
126
+ - **Perplexity API**: Real-time search capabilities
127
+ - **Google Custom Search API**: Web search functionality
128
+ - **Anthropic Claude API**: Advanced summarization
129
+
130
+ ### Optional
131
+ - **OpenAI API**: Alternative summarization
132
+ - **Additional APIs**: ArXiv, CrossRef for academic sources
133
+
134
+ *Note: The system includes comprehensive mock data for demonstration without API keys*
135
+
136
+ ## 💡 Usage Examples
137
+
138
+ ### Basic Research Query
139
+ ```
140
+ "Latest developments in quantum computing for drug discovery"
141
+ ```
142
+
143
+ ### Comparative Analysis
144
+ ```
145
+ "Compare renewable energy adoption in Europe vs Asia 2024"
146
+ ```
147
+
148
+ ### Academic Research
149
+ ```
150
+ "Recent peer-reviewed studies on AI bias in healthcare diagnostics"
151
+ ```
152
+
153
+ ### Technical Analysis
154
+ ```
155
+ "How does blockchain technology improve supply chain transparency"
156
+ ```
157
+
158
+ ## 🎨 User Interface
159
+
160
+ The Gradio interface provides:
161
+ - **Interactive Research Input**: Natural language query processing with example prompts
162
+ - **Real-time Agent Activity**: Live visualization of agent collaboration and decision-making
163
+ - **Tabbed Results Display**:
164
+ - 📊 Summary: Comprehensive research synthesis with key findings
165
+ - 📚 Sources: Detailed source analysis with relevance scoring
166
+ - 📖 Citations: Multi-format academic citations (APA, MLA, Chicago, IEEE, Harvard)
167
+ - 🔍 Process Log: Complete agent activity timeline and reasoning
168
+ - **Progress Tracking**: Real-time progress indicators for each research phase
169
+ - **Responsive Design**: Works seamlessly across desktop and mobile devices
170
+
171
+ ## 🏆 Hackathon Submission - Track 3
172
+
173
+ ### Innovation Highlights
174
+ - **Multi-Agent Orchestration**: Demonstrates sophisticated AI agent collaboration
175
+ - **Adaptive Intelligence**: Agents learn from each other and adjust strategies dynamically
176
+ - **Real-world Integration**: Production-ready with actual API integrations
177
+ - **Scalable Architecture**: Built for real-world deployment and usage
178
+
179
+ ### Demo Scenarios
180
+ 1. **Academic Research**: "Climate change impact on Arctic biodiversity"
181
+ 2. **Technology Analysis**: "Comparison of LLM architectures for code generation"
182
+ 3. **Market Research**: "Sustainable packaging trends in food industry 2025"
183
+ 4. **Policy Analysis**: "AI regulation frameworks across major economies"
184
+
185
+ ## 📁 Project Structure
186
+
187
+ ```
188
+ research-copilot/
189
+ ├── research_copilot.py # Main app with full UI and agent system
190
+ ├── modal_app.py # Modal deployment configuration
191
+ ├── enhanced_agents.py # Production agents with API integrations
192
+ ├── requirements.txt # All dependencies
193
+ ├── .env.example # API key template
194
+ ├── deploy.sh # One-command deployment
195
+ ├── README.md # Comprehensive documentation
196
+ └── Project_Structure.md # This summary
197
+ ```
198
+
199
+ ## 🧪 Testing
200
+
201
+ ```bash
202
+ # Run agent tests
203
+ python -m pytest tests/test_agents.py -v
204
+
205
+ # Run integration tests
206
+ python -m pytest tests/test_integration.py -v
207
+
208
+ # Run UI tests
209
+ python -m pytest tests/test_ui.py -v
210
+ ```
211
+
212
+ ## 🔮 Future Enhancements
213
+
214
+ ### Planned Features
215
+ - **Voice Interface**: Natural language voice queries and responses
216
+ - **Research Templates**: Pre-configured workflows for different research types
217
+ - **Collaborative Research**: Multi-user research sessions with shared workspaces
218
+ - **Export Options**: PDF reports, Word documents, presentation slides
219
+ - **Advanced Analytics**: Research quality metrics and bias detection
220
+ - **Custom Agent Training**: User-specific agent customization and learning
221
+
222
+ ### API Integrations Roadmap
223
+ - **ArXiv**: Academic paper search and analysis
224
+ - **PubMed**: Medical and life sciences research
225
+ - **CrossRef**: DOI resolution and metadata
226
+ - **Semantic Scholar**: AI-powered academic search
227
+ - **News APIs**: Real-time news aggregation
228
+ - **Social Media**: Trend analysis and public sentiment
229
+
230
+ ## 🤝 Contributing
231
+
232
+ We welcome contributions! Please see our contributing guidelines:
233
+
234
+ 1. Fork the repository
235
+ 2. Create a feature branch (`git checkout -b feature/amazing-feature`)
236
+ 3. Commit your changes (`git commit -m 'Add amazing feature'`)
237
+ 4. Push to the branch (`git push origin feature/amazing-feature`)
238
+ 5. Open a Pull Request
239
+
240
+ ## 📄 License
241
+
242
+ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
243
+
244
+ ## 🙏 Acknowledgments
245
+
246
+ - **Gradio Team**: For the amazing interface framework
247
+ - **Modal**: For serverless deployment platform
248
+ - **Anthropic**: For Claude API integration
249
+ - **Perplexity**: For real-time search capabilities
250
+ - **Hackathon Organizers**: For the opportunity to showcase multi-agent AI
251
+
252
+ ## 📞 Contact
253
+
254
+ - **Team**: ResearchCopilot Development Team
255
+ - **Email**: [email protected]
256
+ - **Demo**: [Link to live demo]
257
+ - **Video**: [Link to demonstration video]
258
+
259
+ ---
260
+
261
+ **Built for the Gradio Agents & MCP Hackathon 2025 - Track 3: Agentic Demo Showcase**
262
+
263
+ *Demonstrating the future of AI-powered research through intelligent agent collaboration*
deploy.sh ADDED
@@ -0,0 +1,69 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/bin/bash
2
+
3
+ # ResearchCopilot Deployment Script
4
+ # Gradio MCP Hackathon 2025 - Track 3
5
+
6
+ echo "🤖 ResearchCopilot Deployment Script"
7
+ echo "===================================="
8
+
9
+ # Check if Modal is installed
10
+ if ! command -v modal &> /dev/null; then
11
+ echo "❌ Modal CLI not found. Installing..."
12
+ pip install modal
13
+ echo "✅ Modal installed"
14
+ fi
15
+
16
+ # Check if user is authenticated with Modal
17
+ if ! modal token list &> /dev/null; then
18
+ echo "🔐 Setting up Modal authentication..."
19
+ modal setup
20
+ fi
21
+
22
+ # Create Modal secrets if they don't exist
23
+ echo "🔧 Setting up Modal secrets..."
24
+
25
+ # Check if secrets exist
26
+ if modal secret list | grep -q "research-copilot-secrets"; then
27
+ echo "✅ Secrets already exist"
28
+ else
29
+ echo "📝 Creating new secrets..."
30
+ echo "Please enter your API keys (press Enter to skip):"
31
+
32
+ read -p "Perplexity API Key: " PERPLEXITY_KEY
33
+ read -p "Google API Key: " GOOGLE_KEY
34
+ read -p "Google Search Engine ID: " GOOGLE_ENGINE_ID
35
+ read -p "Anthropic API Key: " ANTHROPIC_KEY
36
+ read -p "OpenAI API Key (optional): " OPENAI_KEY
37
+
38
+ # Create the secret
39
+ modal secret create research-copilot-secrets \
40
+ ${PERPLEXITY_KEY:+PERPLEXITY_API_KEY="$PERPLEXITY_KEY"} \
41
+ ${GOOGLE_KEY:+GOOGLE_API_KEY="$GOOGLE_KEY"} \
42
+ ${GOOGLE_ENGINE_ID:+GOOGLE_SEARCH_ENGINE_ID="$GOOGLE_ENGINE_ID"} \
43
+ ${ANTHROPIC_KEY:+ANTHROPIC_API_KEY="$ANTHROPIC_KEY"} \
44
+ ${OPENAI_KEY:+OPENAI_API_KEY="$OPENAI_KEY"}
45
+
46
+ echo "✅ Secrets created successfully"
47
+ fi
48
+
49
+ # Deploy to Modal
50
+ echo "🚀 Deploying ResearchCopilot to Modal..."
51
+ modal deploy modal_app.py
52
+
53
+ if [ $? -eq 0 ]; then
54
+ echo "✅ Deployment successful!"
55
+ echo ""
56
+ echo "🎉 ResearchCopilot is now live!"
57
+ echo "📱 Your app will be available at the URL provided by Modal"
58
+ echo "📊 Monitor your app: modal app list"
59
+ echo "📝 View logs: modal app logs research-copilot"
60
+ echo ""
61
+ echo "🏆 Ready for Hackathon submission!"
62
+ echo "📋 Don't forget to:"
63
+ echo " 1. Create your demo video"
64
+ echo " 2. Update README with live demo URL"
65
+ echo " 3. Submit to Agents-MCP-Hackathon organization"
66
+ else
67
+ echo "❌ Deployment failed. Check the logs above for details."
68
+ exit 1
69
+ fi
enhanced_agents.py ADDED
@@ -0,0 +1,1183 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # enhanced_agents.py - FIXED VERSION - Production-ready agents with real API integrations
2
+
3
+ import asyncio
4
+ import aiohttp
5
+ import json
6
+ import os
7
+ import requests # Added for fallback HTTP requests
8
+ from typing import Dict, List, Optional
9
+ from datetime import datetime
10
+ import logging
11
+ from dataclasses import dataclass
12
+
13
+ logger = logging.getLogger(__name__)
14
+
15
+ @dataclass
16
+ class SearchResult:
17
+ title: str
18
+ url: str
19
+ snippet: str
20
+ source_type: str
21
+ relevance: float = 0.0
22
+ timestamp: str = None
23
+
24
+ def __post_init__(self):
25
+ if self.timestamp is None:
26
+ self.timestamp = datetime.now().isoformat()
27
+
28
+ class EnhancedRetrieverAgent:
29
+ """Production retriever with real API integrations"""
30
+
31
+ def __init__(self):
32
+ self.perplexity_api_key = os.getenv("PERPLEXITY_API_KEY")
33
+ self.google_api_key = os.getenv("GOOGLE_API_KEY")
34
+ self.google_search_engine_id = os.getenv("GOOGLE_SEARCH_ENGINE_ID")
35
+ self.session = None
36
+
37
+ async def __aenter__(self):
38
+ # Create session with SSL configuration for better connectivity
39
+ connector = aiohttp.TCPConnector(
40
+ ssl=False, # Disable SSL verification if having issues
41
+ limit=10
42
+ )
43
+ self.session = aiohttp.ClientSession(
44
+ connector=connector,
45
+ headers={'User-Agent': 'ResearchCopilot/1.0'},
46
+ timeout=aiohttp.ClientTimeout(total=30)
47
+ )
48
+ return self
49
+
50
+ async def __aexit__(self, exc_type, exc_val, exc_tb):
51
+ if self.session:
52
+ await self.session.close()
53
+
54
+ async def search_perplexity(self, query: str, num_results: int = 5) -> List[SearchResult]:
55
+ """Search using Perplexity API for real-time information"""
56
+ if not self.perplexity_api_key:
57
+ logger.warning("No Perplexity API key found, using mock data")
58
+ return self._get_mock_results(query, "perplexity")
59
+
60
+ try:
61
+ headers = {
62
+ "Authorization": f"Bearer {self.perplexity_api_key}",
63
+ "Content-Type": "application/json"
64
+ }
65
+
66
+ payload = {
67
+ "model": "llama-3.1-sonar-small-128k-online",
68
+ "messages": [
69
+ {
70
+ "role": "user",
71
+ "content": f"Research this topic and provide sources: {query}"
72
+ }
73
+ ],
74
+ "max_tokens": 1000,
75
+ "temperature": 0.2
76
+ }
77
+
78
+ async with self.session.post(
79
+ "https://api.perplexity.ai/chat/completions",
80
+ headers=headers,
81
+ json=payload,
82
+ timeout=30
83
+ ) as response:
84
+
85
+ if response.status == 200:
86
+ data = await response.json()
87
+ logger.info(f"Perplexity API response received: {response.status}")
88
+
89
+ # Handle different response formats
90
+ choices = data.get("choices", [])
91
+ if not choices:
92
+ logger.warning("No choices in Perplexity response")
93
+ return self._get_mock_results(query, "perplexity")
94
+
95
+ message = choices[0].get("message", {})
96
+ content = message.get("content", "") if isinstance(message, dict) else str(message)
97
+
98
+ # Always create at least one result from the content
99
+ results = []
100
+ if content and len(content.strip()) > 10:
101
+ # Split content into multiple sources if it's long
102
+ content_parts = content.split('\n\n')[:num_results]
103
+
104
+ for i, part in enumerate(content_parts):
105
+ if part.strip():
106
+ results.append(SearchResult(
107
+ title=f"Perplexity Research: {query} - Insight {i+1}",
108
+ url=f"https://perplexity.ai/search?q={query.replace(' ', '+')}",
109
+ snippet=part.strip()[:300] + "..." if len(part.strip()) > 300 else part.strip(),
110
+ source_type="perplexity",
111
+ relevance=0.95 - (i * 0.05)
112
+ ))
113
+
114
+ # If no content, create a default result
115
+ if not results:
116
+ results.append(SearchResult(
117
+ title=f"Perplexity Research: {query}",
118
+ url=f"https://perplexity.ai/search?q={query.replace(' ', '+')}",
119
+ snippet=f"Research findings on {query} from Perplexity AI analysis.",
120
+ source_type="perplexity",
121
+ relevance=0.9
122
+ ))
123
+
124
+ logger.info(f"Successfully retrieved {len(results)} results from Perplexity")
125
+ return results
126
+
127
+ else:
128
+ logger.error(f"Perplexity API error: {response.status}")
129
+ error_text = await response.text()
130
+ logger.error(f"Perplexity error details: {error_text}")
131
+ return self._get_mock_results(query, "perplexity")
132
+
133
+ except Exception as e:
134
+ logger.error(f"Perplexity search failed: {str(e)}")
135
+ return self._get_mock_results(query, "perplexity")
136
+
137
+ async def search_google(self, query: str, num_results: int = 10) -> List[SearchResult]:
138
+ """Search using Google Custom Search API"""
139
+ if not self.google_api_key or not self.google_search_engine_id:
140
+ logger.warning("No Google API credentials found, using mock data")
141
+ return self._get_mock_results(query, "google")
142
+
143
+ try:
144
+ params = {
145
+ "key": self.google_api_key,
146
+ "cx": self.google_search_engine_id,
147
+ "q": query,
148
+ "num": min(num_results, 10)
149
+ }
150
+
151
+ async with self.session.get(
152
+ "https://www.googleapis.com/customsearch/v1",
153
+ params=params
154
+ ) as response:
155
+
156
+ if response.status == 200:
157
+ data = await response.json()
158
+ results = []
159
+
160
+ for i, item in enumerate(data.get("items", [])):
161
+ results.append(SearchResult(
162
+ title=item.get("title", ""),
163
+ url=item.get("link", ""),
164
+ snippet=item.get("snippet", ""),
165
+ source_type="google",
166
+ relevance=0.8 - (i * 0.05)
167
+ ))
168
+
169
+ return results
170
+ else:
171
+ logger.error(f"Google API error: {response.status}")
172
+ return self._get_mock_results(query, "google")
173
+
174
+ except Exception as e:
175
+ logger.error(f"Google search failed: {str(e)}")
176
+ return self._get_mock_results(query, "google")
177
+
178
+ async def search_academic(self, query: str, num_results: int = 5) -> List[SearchResult]:
179
+ """Search academic sources (using Google Scholar approach)"""
180
+ academic_query = f"site:arxiv.org OR site:scholar.google.com OR site:pubmed.ncbi.nlm.nih.gov {query}"
181
+ google_results = await self.search_google(academic_query, num_results)
182
+
183
+ # Convert to academic source type
184
+ academic_results = []
185
+ for result in google_results:
186
+ if any(domain in result.url for domain in ["arxiv.org", "scholar.google", "pubmed", "doi.org"]):
187
+ result.source_type = "academic"
188
+ result.relevance += 0.1 # Boost academic sources
189
+ academic_results.append(result)
190
+
191
+ return academic_results[:num_results]
192
+
193
+ def _get_mock_results(self, query: str, source_type: str) -> List[SearchResult]:
194
+ """Generate realistic mock results for demo purposes"""
195
+ mock_results = []
196
+
197
+ base_results = [
198
+ {
199
+ "title": f"Comprehensive Analysis: {query}",
200
+ "snippet": f"This comprehensive study examines {query} from multiple perspectives, providing insights into current trends and future implications.",
201
+ "url": f"https://example.com/{source_type}/comprehensive-analysis"
202
+ },
203
+ {
204
+ "title": f"Recent Developments in {query}",
205
+ "snippet": f"Latest research and developments in {query} show promising results with significant implications for the field.",
206
+ "url": f"https://example.com/{source_type}/recent-developments"
207
+ },
208
+ {
209
+ "title": f"Expert Review: {query}",
210
+ "snippet": f"Expert analysis of {query} reveals key factors and considerations for stakeholders and researchers.",
211
+ "url": f"https://example.com/{source_type}/expert-review"
212
+ }
213
+ ]
214
+
215
+ for i, result in enumerate(base_results):
216
+ mock_results.append(SearchResult(
217
+ title=result["title"],
218
+ url=result["url"],
219
+ snippet=result["snippet"],
220
+ source_type=source_type,
221
+ relevance=0.9 - (i * 0.1)
222
+ ))
223
+
224
+ return mock_results
225
+
226
+ class EnhancedSummarizerAgent:
227
+ """Production summarizer with Claude and OpenAI integration - KarmaCheck style"""
228
+
229
+ def __init__(self):
230
+ self.anthropic_api_key = os.getenv("ANTHROPIC_API_KEY")
231
+ self.openai_api_key = os.getenv("OPENAI_API_KEY")
232
+ self.last_used_api = None
233
+
234
+ def summarize_with_claude(self, sources: List[SearchResult], context: str = "") -> Dict:
235
+ """Synchronous summarize using Claude API with OpenAI fallback - KarmaCheck style"""
236
+ # Try Claude first
237
+ if self.anthropic_api_key:
238
+ try:
239
+ content_to_summarize = self._prepare_content(sources, context)
240
+
241
+ headers = {
242
+ "x-api-key": self.anthropic_api_key,
243
+ "Content-Type": "application/json",
244
+ "anthropic-version": "2023-06-01"
245
+ }
246
+
247
+ payload = {
248
+ "model": "claude-3-5-sonnet-20241022",
249
+ "max_tokens": 1500,
250
+ "messages": [
251
+ {
252
+ "role": "user",
253
+ "content": f"Analyze these research sources and provide a comprehensive summary:\n\nContext: {context}\n\nSources:\n{content_to_summarize[:1800]}\n\nProvide a detailed summary with key findings."
254
+ }
255
+ ]
256
+ }
257
+
258
+ # Pure synchronous requests call like KarmaCheck
259
+ import urllib3
260
+ urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
261
+
262
+ response = requests.post(
263
+ "https://api.anthropic.com/v1/messages",
264
+ headers=headers,
265
+ json=payload,
266
+ timeout=30,
267
+ verify=False
268
+ )
269
+
270
+ if response.status_code == 200:
271
+ data = response.json()
272
+ logger.info(f"Claude API success: {response.status_code}")
273
+
274
+ content = ""
275
+ if "content" in data and data["content"]:
276
+ content = data["content"][0].get("text", "")
277
+
278
+ if content:
279
+ key_points = self._extract_key_points_from_text(content)
280
+
281
+ logger.info("Successfully generated summary using Claude API")
282
+ self.last_used_api = "Claude"
283
+ return {
284
+ "summary": content,
285
+ "key_points": key_points,
286
+ "trends": ["AI-powered analysis", "Multi-source synthesis"],
287
+ "research_gaps": ["Further investigation needed"],
288
+ "word_count": len(content.split()),
289
+ "coverage_score": self._calculate_coverage_score(sources),
290
+ "api_used": "Claude"
291
+ }
292
+ else:
293
+ logger.error(f"Claude API failed: {response.status_code}")
294
+ if response.status_code == 400:
295
+ logger.error("Claude API 400 error - content format issue")
296
+ logger.error(f"Claude response: {response.text}")
297
+
298
+ except Exception as e:
299
+ logger.error(f"Claude summarization failed: {str(e)}")
300
+ else:
301
+ logger.warning("No Claude API key found")
302
+
303
+ # Try OpenAI as fallback
304
+ logger.info("Trying OpenAI as fallback...")
305
+ return self._summarize_with_openai(sources, context)
306
+
307
+ def _summarize_with_openai(self, sources: List[SearchResult], context: str = "") -> Dict:
308
+ """Synchronous OpenAI fallback - KarmaCheck style"""
309
+ if not self.openai_api_key:
310
+ logger.warning("No OpenAI API key found, using enhanced mock summary")
311
+ return self._get_enhanced_mock_summary(sources, context)
312
+
313
+ try:
314
+ content_to_summarize = self._prepare_content(sources, context)
315
+
316
+ headers = {
317
+ "Authorization": f"Bearer {self.openai_api_key}",
318
+ "Content-Type": "application/json"
319
+ }
320
+
321
+ payload = {
322
+ "model": "gpt-4o-mini",
323
+ "messages": [
324
+ {
325
+ "role": "system",
326
+ "content": "You are a research analyst that provides comprehensive, well-structured summaries of research sources. Focus on key insights, trends, and actionable findings."
327
+ },
328
+ {
329
+ "role": "user",
330
+ "content": f"Analyze these research sources and provide a comprehensive summary:\n\nContext: {context}\n\nSources:\n{content_to_summarize[:2500]}\n\nProvide a detailed summary with key findings."
331
+ }
332
+ ],
333
+ "max_tokens": 1500,
334
+ "temperature": 0.3
335
+ }
336
+
337
+ # Pure synchronous requests call like KarmaCheck
338
+ response = requests.post(
339
+ "https://api.openai.com/v1/chat/completions",
340
+ headers=headers,
341
+ json=payload,
342
+ timeout=30
343
+ )
344
+
345
+ if response.status_code == 200:
346
+ data = response.json()
347
+ logger.info(f"OpenAI API success: {response.status_code}")
348
+
349
+ content = ""
350
+ if "choices" in data and data["choices"]:
351
+ content = data["choices"][0]["message"]["content"]
352
+
353
+ if content:
354
+ key_points = self._extract_key_points_from_text(content)
355
+
356
+ logger.info("Successfully generated summary using OpenAI API")
357
+ self.last_used_api = "OpenAI"
358
+ return {
359
+ "summary": content,
360
+ "key_points": key_points,
361
+ "trends": ["AI-powered analysis", "Multi-source synthesis"],
362
+ "research_gaps": ["Further investigation needed"],
363
+ "word_count": len(content.split()),
364
+ "coverage_score": self._calculate_coverage_score(sources),
365
+ "api_used": "OpenAI"
366
+ }
367
+ else:
368
+ logger.error(f"OpenAI API failed: {response.status_code}")
369
+ logger.error(f"Response: {response.text}")
370
+
371
+ except Exception as e:
372
+ logger.error(f"OpenAI summarization failed: {str(e)}")
373
+
374
+ # If both APIs fail, return enhanced mock summary
375
+ logger.info("Both Claude and OpenAI APIs failed, using enhanced mock summary")
376
+ self.last_used_api = "Mock"
377
+ return self._get_enhanced_mock_summary(sources, context)
378
+
379
+ def _prepare_content(self, sources: List[SearchResult], context: str) -> str:
380
+ """Prepare source content for summarization"""
381
+ content_parts = []
382
+
383
+ for i, source in enumerate(sources, 1):
384
+ content_parts.append(f"""
385
+ Source {i}: {source.title}
386
+ URL: {source.url}
387
+ Type: {source.source_type}
388
+ Relevance: {source.relevance:.2f}
389
+ Content: {source.snippet}
390
+ ---
391
+ """)
392
+
393
+ return "\n".join(content_parts)
394
+
395
+ def _extract_key_points_from_text(self, text: str) -> List[str]:
396
+ """Extract key points from unstructured text"""
397
+ key_points = []
398
+
399
+ lines = text.split('\n')
400
+ for line in lines:
401
+ line = line.strip()
402
+ if line.startswith('•') or line.startswith('-') or line.startswith('*'):
403
+ key_points.append(line[1:].strip())
404
+ elif any(indicator in line.lower() for indicator in ['key finding', 'important', 'significant']):
405
+ key_points.append(line)
406
+
407
+ return key_points[:10] # Limit to top 10 points
408
+
409
+ def _calculate_coverage_score(self, sources: List[SearchResult]) -> float:
410
+ """Calculate how well sources cover the topic"""
411
+ if not sources:
412
+ return 0.0
413
+
414
+ # Factors for coverage score
415
+ source_diversity = len(set(s.source_type for s in sources))
416
+ avg_relevance = sum(s.relevance for s in sources) / len(sources)
417
+ source_count_factor = min(1.0, len(sources) / 10)
418
+
419
+ coverage = (source_diversity / 5) * 0.3 + avg_relevance * 0.5 + source_count_factor * 0.2
420
+ return min(1.0, coverage)
421
+
422
+ def _get_enhanced_mock_summary(self, sources: List[SearchResult], context: str) -> Dict:
423
+ """Generate enhanced mock summary using actual source content"""
424
+ source_count = len(sources)
425
+ source_types = set(s.source_type for s in sources)
426
+
427
+ # Extract and analyze actual content from sources
428
+ source_snippets = [s.snippet for s in sources if s.snippet]
429
+ all_content = " ".join(source_snippets)
430
+
431
+ # Analyze the actual content to create a smart summary
432
+ if "sustainable energy" in context.lower() or "sustainable energy" in all_content.lower():
433
+ # Extract key information from the actual Perplexity results
434
+ key_concepts = []
435
+ if "renewable energy" in all_content.lower():
436
+ key_concepts.append("renewable energy adoption")
437
+ if "solar" in all_content.lower():
438
+ key_concepts.append("solar energy systems")
439
+ if "wind" in all_content.lower():
440
+ key_concepts.append("wind power integration")
441
+ if "urban" in all_content.lower():
442
+ key_concepts.append("urban environment applications")
443
+ if "environmental" in all_content.lower():
444
+ key_concepts.append("environmental impact reduction")
445
+ if "air quality" in all_content.lower() or "pollution" in all_content.lower():
446
+ key_concepts.append("air quality improvements")
447
+ if "decentralized" in all_content.lower():
448
+ key_concepts.append("decentralized energy systems")
449
+
450
+ topic_summary = f"""Analysis of sustainable energy solutions for urban environments reveals significant opportunities for implementation and impact. Research from {source_count} sources demonstrates that {', '.join(key_concepts[:3])} are key focus areas driving innovation in this field.
451
+
452
+ The findings highlight the crucial role of renewable energy sources, particularly solar and wind technologies, in addressing urban energy needs while minimizing environmental impacts. Studies emphasize that sustainable urban energy systems offer multiple benefits including reduced air pollution, improved public health outcomes, and decreased reliance on fossil fuels.
453
+
454
+ Key developments include the advancement of decentralized energy production systems that enable localized energy generation, reducing transmission losses and environmental impacts. The research indicates growing adoption of integrated approaches that combine multiple renewable technologies with smart grid systems to optimize urban energy efficiency and sustainability."""
455
+
456
+ extracted_points = []
457
+ if "renewable energy" in all_content.lower():
458
+ extracted_points.append("Renewable energy sources (solar, wind) are primary solutions for sustainable urban energy")
459
+ if "environmental" in all_content.lower():
460
+ extracted_points.append("Environmental benefits include reduced air pollution and improved public health")
461
+ if "decentralized" in all_content.lower():
462
+ extracted_points.append("Decentralized energy systems enable localized production and reduced transmission losses")
463
+ if "urban" in all_content.lower():
464
+ extracted_points.append("Urban environments present both challenges and opportunities for sustainable energy implementation")
465
+ if "adoption" in all_content.lower() or "implementation" in all_content.lower():
466
+ extracted_points.append("Growing adoption of sustainable energy technologies across urban areas globally")
467
+
468
+ # Add general points if we didn't extract enough specific ones
469
+ while len(extracted_points) < 5:
470
+ extracted_points.extend([
471
+ f"Comprehensive analysis of {source_count} research sources provides robust evidence base",
472
+ f"Cross-platform research from {', '.join(source_types)} ensures diverse perspectives",
473
+ "Integration of multiple energy technologies shows promising results for urban applications",
474
+ "Policy and implementation frameworks are evolving to support sustainable energy adoption",
475
+ "Economic viability and environmental benefits align to drive continued innovation"
476
+ ])
477
+
478
+ else:
479
+ # Generic but content-aware summary for other topics
480
+ topic_summary = f"""Based on comprehensive analysis of {source_count} research sources, this investigation reveals important insights into {context}. The research demonstrates significant developments and practical applications that have implications for stakeholders across multiple sectors.
481
+
482
+ Current evidence from diverse information sources indicates growing momentum in this field, with innovative approaches and solutions being developed by organizations worldwide. The analysis identifies consistent patterns of progress, implementation, and adoption across different geographical regions and application areas.
483
+
484
+ The research findings suggest that continued advancement in this domain offers substantial potential benefits, supported by improved methodologies, enhanced collaboration between institutions, and increasing recognition of the field's transformative impact on future development and innovation."""
485
+
486
+ extracted_points = [
487
+ f"Analyzed {source_count} diverse sources for comprehensive coverage",
488
+ f"Information gathered from {len(source_types)} different platforms: {', '.join(source_types)}",
489
+ "Identified consistent patterns and emerging trends",
490
+ "Cross-referenced findings for reliability and accuracy",
491
+ "Highlighted practical implications and applications"
492
+ ]
493
+
494
+ return {
495
+ "summary": topic_summary,
496
+ "key_points": extracted_points[:5], # Limit to 5 key points
497
+ "trends": [
498
+ "Increasing research activity and innovation",
499
+ "Growing practical applications and implementations",
500
+ "Enhanced collaboration between organizations",
501
+ "Focus on sustainable and scalable solutions"
502
+ ],
503
+ "research_gaps": [
504
+ "Long-term impact studies needed",
505
+ "Cross-regional comparative analysis",
506
+ "Integration challenges and solutions",
507
+ "Cost-benefit analysis requirements"
508
+ ],
509
+ "word_count": len(topic_summary.split()),
510
+ "coverage_score": self._calculate_coverage_score(sources)
511
+ }
512
+
513
+ class EnhancedCitationAgent:
514
+ """Production citation generator with multiple formats"""
515
+
516
+ def __init__(self):
517
+ self.citation_styles = ["APA", "MLA", "Chicago", "IEEE", "Harvard"]
518
+
519
+ def generate_citations(self, sources: List[SearchResult]) -> Dict:
520
+ """Generate citations in multiple academic formats"""
521
+ citations = {
522
+ "apa": [],
523
+ "mla": [],
524
+ "chicago": [],
525
+ "ieee": [],
526
+ "harvard": []
527
+ }
528
+
529
+ for i, source in enumerate(sources, 1):
530
+ # Extract domain for author estimation
531
+ domain = self._extract_domain(source.url)
532
+ author = self._estimate_author(source, domain)
533
+ date = self._estimate_date(source)
534
+
535
+ # Generate citations in different formats
536
+ citations["apa"].append(self._format_apa(source, author, date))
537
+ citations["mla"].append(self._format_mla(source, author, date))
538
+ citations["chicago"].append(self._format_chicago(source, author, date))
539
+ citations["ieee"].append(self._format_ieee(source, i))
540
+ citations["harvard"].append(self._format_harvard(source, author, date))
541
+
542
+ return {
543
+ "citations": citations,
544
+ "bibliography": self._create_bibliography(citations["apa"]),
545
+ "citation_count": len(sources),
546
+ "formats_available": self.citation_styles
547
+ }
548
+
549
+ def _extract_domain(self, url: str) -> str:
550
+ """Extract domain from URL"""
551
+ try:
552
+ from urllib.parse import urlparse
553
+ return urlparse(url).netloc
554
+ except:
555
+ return "unknown.com"
556
+
557
+ def _estimate_author(self, source: SearchResult, domain: str) -> str:
558
+ """Estimate author based on source and domain"""
559
+ if "arxiv" in domain:
560
+ return "Author, A."
561
+ elif "scholar.google" in domain:
562
+ return "Researcher, R."
563
+ elif "perplexity" in domain:
564
+ return "Perplexity AI"
565
+ elif any(news in domain for news in ["cnn", "bbc", "reuters", "ap"]):
566
+ return f"{domain.split('.')[0].upper()} Editorial Team"
567
+ else:
568
+ return f"{domain.replace('www.', '').split('.')[0].title()}"
569
+
570
+ def _estimate_date(self, source: SearchResult) -> str:
571
+ """Estimate publication date"""
572
+ if source.timestamp:
573
+ try:
574
+ dt = datetime.fromisoformat(source.timestamp.replace('Z', '+00:00'))
575
+ return dt.strftime("%Y")
576
+ except:
577
+ pass
578
+ return datetime.now().strftime("%Y")
579
+
580
+ def _format_apa(self, source: SearchResult, author: str, date: str) -> str:
581
+ """Format citation in APA style"""
582
+ title = source.title.rstrip('.')
583
+ return f"{author} ({date}). {title}. Retrieved from {source.url}"
584
+
585
+ def _format_mla(self, source: SearchResult, author: str, date: str) -> str:
586
+ """Format citation in MLA style"""
587
+ title = source.title.rstrip('.')
588
+ access_date = datetime.now().strftime("%d %b %Y")
589
+ return f'{author}. "{title}." Web. {access_date}. <{source.url}>.'
590
+
591
+ def _format_chicago(self, source: SearchResult, author: str, date: str) -> str:
592
+ """Format citation in Chicago style"""
593
+ title = source.title.rstrip('.')
594
+ access_date = datetime.now().strftime("%B %d, %Y")
595
+ return f'{author}. "{title}." Accessed {access_date}. {source.url}.'
596
+
597
+ def _format_ieee(self, source: SearchResult, ref_num: int) -> str:
598
+ """Format citation in IEEE style"""
599
+ title = source.title.rstrip('.')
600
+ return f'[{ref_num}] "{title}," [Online]. Available: {source.url}'
601
+
602
+ def _format_harvard(self, source: SearchResult, author: str, date: str) -> str:
603
+ """Format citation in Harvard style"""
604
+ title = source.title.rstrip('.')
605
+ return f"{author}, {date}. {title}. [online] Available at: {source.url}"
606
+
607
+ def _create_bibliography(self, apa_citations: List[str]) -> str:
608
+ """Create formatted bibliography"""
609
+ if not apa_citations:
610
+ return "# Bibliography\n\nNo sources available for citation."
611
+
612
+ bibliography = "# Bibliography\n\n"
613
+ for i, citation in enumerate(apa_citations, 1):
614
+ bibliography += f"{i}. {citation}\n\n"
615
+
616
+ return bibliography
617
+ # # enhanced_agents.py - FIXED VERSION - Production-ready agents with real API integrations
618
+
619
+ # import asyncio
620
+ # import aiohttp
621
+ # import json
622
+ # import os
623
+ # import requests # Added for fallback HTTP requests
624
+ # from typing import Dict, List, Optional
625
+ # from datetime import datetime
626
+ # import logging
627
+ # from dataclasses import dataclass
628
+
629
+ # logger = logging.getLogger(__name__)
630
+
631
+ # @dataclass
632
+ # class SearchResult:
633
+ # title: str
634
+ # url: str
635
+ # snippet: str
636
+ # source_type: str
637
+ # relevance: float = 0.0
638
+ # timestamp: str = None
639
+
640
+ # def __post_init__(self):
641
+ # if self.timestamp is None:
642
+ # self.timestamp = datetime.now().isoformat()
643
+
644
+ # class EnhancedRetrieverAgent:
645
+ # """Production retriever with real API integrations"""
646
+
647
+ # def __init__(self):
648
+ # self.perplexity_api_key = os.getenv("PERPLEXITY_API_KEY")
649
+ # self.google_api_key = os.getenv("GOOGLE_API_KEY")
650
+ # self.google_search_engine_id = os.getenv("GOOGLE_SEARCH_ENGINE_ID")
651
+ # self.session = None
652
+
653
+ # async def __aenter__(self):
654
+ # # Create session with SSL configuration for better connectivity
655
+ # connector = aiohttp.TCPConnector(
656
+ # ssl=False, # Disable SSL verification if having issues
657
+ # limit=10
658
+ # )
659
+ # self.session = aiohttp.ClientSession(
660
+ # connector=connector,
661
+ # headers={'User-Agent': 'ResearchCopilot/1.0'},
662
+ # timeout=aiohttp.ClientTimeout(total=30)
663
+ # )
664
+ # return self
665
+
666
+ # async def __aexit__(self, exc_type, exc_val, exc_tb):
667
+ # if self.session:
668
+ # await self.session.close()
669
+
670
+ # async def search_perplexity(self, query: str, num_results: int = 5) -> List[SearchResult]:
671
+ # """Search using Perplexity API for real-time information"""
672
+ # if not self.perplexity_api_key:
673
+ # logger.warning("No Perplexity API key found, using mock data")
674
+ # return self._get_mock_results(query, "perplexity")
675
+
676
+ # try:
677
+ # headers = {
678
+ # "Authorization": f"Bearer {self.perplexity_api_key}",
679
+ # "Content-Type": "application/json"
680
+ # }
681
+
682
+ # payload = {
683
+ # "model": "llama-3.1-sonar-small-128k-online",
684
+ # "messages": [
685
+ # {
686
+ # "role": "user",
687
+ # "content": f"Research this topic and provide sources: {query}"
688
+ # }
689
+ # ],
690
+ # "max_tokens": 1000,
691
+ # "temperature": 0.2
692
+ # }
693
+
694
+ # async with self.session.post(
695
+ # "https://api.perplexity.ai/chat/completions",
696
+ # headers=headers,
697
+ # json=payload,
698
+ # timeout=30
699
+ # ) as response:
700
+
701
+ # if response.status == 200:
702
+ # data = await response.json()
703
+ # logger.info(f"Perplexity API response received: {response.status}")
704
+
705
+ # # Handle different response formats
706
+ # choices = data.get("choices", [])
707
+ # if not choices:
708
+ # logger.warning("No choices in Perplexity response")
709
+ # return self._get_mock_results(query, "perplexity")
710
+
711
+ # message = choices[0].get("message", {})
712
+ # content = message.get("content", "") if isinstance(message, dict) else str(message)
713
+
714
+ # # Always create at least one result from the content
715
+ # results = []
716
+ # if content and len(content.strip()) > 10:
717
+ # # Split content into multiple sources if it's long
718
+ # content_parts = content.split('\n\n')[:num_results]
719
+
720
+ # for i, part in enumerate(content_parts):
721
+ # if part.strip():
722
+ # results.append(SearchResult(
723
+ # title=f"Perplexity Research: {query} - Insight {i+1}",
724
+ # url=f"https://perplexity.ai/search?q={query.replace(' ', '+')}",
725
+ # snippet=part.strip()[:300] + "..." if len(part.strip()) > 300 else part.strip(),
726
+ # source_type="perplexity",
727
+ # relevance=0.95 - (i * 0.05)
728
+ # ))
729
+
730
+ # # If no content, create a default result
731
+ # if not results:
732
+ # results.append(SearchResult(
733
+ # title=f"Perplexity Research: {query}",
734
+ # url=f"https://perplexity.ai/search?q={query.replace(' ', '+')}",
735
+ # snippet=f"Research findings on {query} from Perplexity AI analysis.",
736
+ # source_type="perplexity",
737
+ # relevance=0.9
738
+ # ))
739
+
740
+ # logger.info(f"Successfully retrieved {len(results)} results from Perplexity")
741
+ # return results
742
+
743
+ # else:
744
+ # logger.error(f"Perplexity API error: {response.status}")
745
+ # error_text = await response.text()
746
+ # logger.error(f"Perplexity error details: {error_text}")
747
+ # return self._get_mock_results(query, "perplexity")
748
+
749
+ # except Exception as e:
750
+ # logger.error(f"Perplexity search failed: {str(e)}")
751
+ # return self._get_mock_results(query, "perplexity")
752
+
753
+ # async def search_google(self, query: str, num_results: int = 10) -> List[SearchResult]:
754
+ # """Search using Google Custom Search API"""
755
+ # if not self.google_api_key or not self.google_search_engine_id:
756
+ # logger.warning("No Google API credentials found, using mock data")
757
+ # return self._get_mock_results(query, "google")
758
+
759
+ # try:
760
+ # params = {
761
+ # "key": self.google_api_key,
762
+ # "cx": self.google_search_engine_id,
763
+ # "q": query,
764
+ # "num": min(num_results, 10)
765
+ # }
766
+
767
+ # async with self.session.get(
768
+ # "https://www.googleapis.com/customsearch/v1",
769
+ # params=params
770
+ # ) as response:
771
+
772
+ # if response.status == 200:
773
+ # data = await response.json()
774
+ # results = []
775
+
776
+ # for i, item in enumerate(data.get("items", [])):
777
+ # results.append(SearchResult(
778
+ # title=item.get("title", ""),
779
+ # url=item.get("link", ""),
780
+ # snippet=item.get("snippet", ""),
781
+ # source_type="google",
782
+ # relevance=0.8 - (i * 0.05)
783
+ # ))
784
+
785
+ # return results
786
+ # else:
787
+ # logger.error(f"Google API error: {response.status}")
788
+ # return self._get_mock_results(query, "google")
789
+
790
+ # except Exception as e:
791
+ # logger.error(f"Google search failed: {str(e)}")
792
+ # return self._get_mock_results(query, "google")
793
+
794
+ # async def search_academic(self, query: str, num_results: int = 5) -> List[SearchResult]:
795
+ # """Search academic sources (using Google Scholar approach)"""
796
+ # academic_query = f"site:arxiv.org OR site:scholar.google.com OR site:pubmed.ncbi.nlm.nih.gov {query}"
797
+ # google_results = await self.search_google(academic_query, num_results)
798
+
799
+ # # Convert to academic source type
800
+ # academic_results = []
801
+ # for result in google_results:
802
+ # if any(domain in result.url for domain in ["arxiv.org", "scholar.google", "pubmed", "doi.org"]):
803
+ # result.source_type = "academic"
804
+ # result.relevance += 0.1 # Boost academic sources
805
+ # academic_results.append(result)
806
+
807
+ # return academic_results[:num_results]
808
+
809
+ # def _get_mock_results(self, query: str, source_type: str) -> List[SearchResult]:
810
+ # """Generate realistic mock results for demo purposes"""
811
+ # mock_results = []
812
+
813
+ # base_results = [
814
+ # {
815
+ # "title": f"Comprehensive Analysis: {query}",
816
+ # "snippet": f"This comprehensive study examines {query} from multiple perspectives, providing insights into current trends and future implications.",
817
+ # "url": f"https://example.com/{source_type}/comprehensive-analysis"
818
+ # },
819
+ # {
820
+ # "title": f"Recent Developments in {query}",
821
+ # "snippet": f"Latest research and developments in {query} show promising results with significant implications for the field.",
822
+ # "url": f"https://example.com/{source_type}/recent-developments"
823
+ # },
824
+ # {
825
+ # "title": f"Expert Review: {query}",
826
+ # "snippet": f"Expert analysis of {query} reveals key factors and considerations for stakeholders and researchers.",
827
+ # "url": f"https://example.com/{source_type}/expert-review"
828
+ # }
829
+ # ]
830
+
831
+ # for i, result in enumerate(base_results):
832
+ # mock_results.append(SearchResult(
833
+ # title=result["title"],
834
+ # url=result["url"],
835
+ # snippet=result["snippet"],
836
+ # source_type=source_type,
837
+ # relevance=0.9 - (i * 0.1)
838
+ # ))
839
+
840
+ # return mock_results
841
+
842
+ # class EnhancedSummarizerAgent:
843
+ # """Production summarizer with Claude AI integration"""
844
+
845
+ # def __init__(self):
846
+ # self.anthropic_api_key = os.getenv("ANTHROPIC_API_KEY")
847
+ # self.session = None
848
+
849
+ # async def __aenter__(self):
850
+ # # Create session with SSL configuration for better connectivity
851
+ # connector = aiohttp.TCPConnector(
852
+ # ssl=False, # Disable SSL verification if having issues
853
+ # limit=10
854
+ # )
855
+ # self.session = aiohttp.ClientSession(
856
+ # connector=connector,
857
+ # headers={'User-Agent': 'ResearchCopilot/1.0'},
858
+ # timeout=aiohttp.ClientTimeout(total=30)
859
+ # )
860
+ # return self
861
+
862
+ # async def __aexit__(self, exc_type, exc_val, exc_tb):
863
+ # if self.session:
864
+ # await self.session.close()
865
+
866
+ # async def summarize_with_claude(self, sources: List[SearchResult], context: str = "") -> Dict:
867
+ # """Summarize using Claude API"""
868
+ # if not self.anthropic_api_key:
869
+ # logger.warning("No Claude API key found, using enhanced mock summary")
870
+ # return self._get_enhanced_mock_summary(sources, context)
871
+
872
+ # try:
873
+ # content_to_summarize = self._prepare_content(sources, context)
874
+
875
+ # headers = {
876
+ # "x-api-key": self.anthropic_api_key,
877
+ # "Content-Type": "application/json",
878
+ # "anthropic-version": "2023-06-01"
879
+ # }
880
+
881
+ # payload = {
882
+ # "model": "claude-3-5-sonnet-20241022",
883
+ # "max_tokens": 1500,
884
+ # "messages": [
885
+ # {
886
+ # "role": "user",
887
+ # "content": f"""Analyze these research sources and provide a comprehensive summary:
888
+
889
+ # Context: {context}
890
+
891
+ # Research Sources:
892
+ # {content_to_summarize[:2500]}
893
+
894
+ # Please provide:
895
+ # 1. A comprehensive summary (2-3 paragraphs)
896
+ # 2. Key findings as bullet points
897
+ # 3. Notable trends or patterns
898
+ # 4. Areas requiring further research
899
+
900
+ # Keep your response informative, well-structured, and insightful."""
901
+ # }
902
+ # ],
903
+ # "temperature": 0.3
904
+ # }
905
+
906
+ # # Use requests library for better compatibility
907
+ # response = requests.post(
908
+ # "https://api.anthropic.com/v1/messages",
909
+ # headers=headers,
910
+ # json=payload,
911
+ # timeout=30,
912
+ # verify=False # Disable SSL verification
913
+ # )
914
+
915
+ # if response.status_code == 200:
916
+ # data = response.json()
917
+ # logger.info(f"Claude API success: {response.status_code}")
918
+
919
+ # content = ""
920
+ # if "content" in data and data["content"]:
921
+ # content = data["content"][0].get("text", "")
922
+
923
+ # if content:
924
+ # key_points = self._extract_key_points_from_text(content)
925
+
926
+ # logger.info("Successfully generated summary using Claude API")
927
+ # return {
928
+ # "summary": content,
929
+ # "key_points": key_points,
930
+ # "trends": ["AI-powered analysis", "Multi-source synthesis"],
931
+ # "research_gaps": ["Further investigation needed"],
932
+ # "word_count": len(content.split()),
933
+ # "coverage_score": self._calculate_coverage_score(sources)
934
+ # }
935
+ # else:
936
+ # logger.error(f"Claude API failed: {response.status_code}")
937
+ # logger.error(f"Response: {response.text}")
938
+
939
+ # except Exception as e:
940
+ # logger.error(f"Claude summarization failed: {str(e)}")
941
+
942
+ # # If Claude fails, return enhanced mock summary
943
+ # logger.info("Claude API failed, using enhanced mock summary")
944
+ # return self._get_enhanced_mock_summary(sources, context)
945
+
946
+ # def _prepare_content(self, sources: List[SearchResult], context: str) -> str:
947
+ # """Prepare source content for summarization"""
948
+ # content_parts = []
949
+
950
+ # for i, source in enumerate(sources, 1):
951
+ # content_parts.append(f"""
952
+ # Source {i}: {source.title}
953
+ # URL: {source.url}
954
+ # Type: {source.source_type}
955
+ # Relevance: {source.relevance:.2f}
956
+ # Content: {source.snippet}
957
+ # ---
958
+ # """)
959
+
960
+ # return "\n".join(content_parts)
961
+
962
+ # def _extract_key_points_from_text(self, text: str) -> List[str]:
963
+ # """Extract key points from unstructured text"""
964
+ # key_points = []
965
+
966
+ # lines = text.split('\n')
967
+ # for line in lines:
968
+ # line = line.strip()
969
+ # if line.startswith('•') or line.startswith('-') or line.startswith('*'):
970
+ # key_points.append(line[1:].strip())
971
+ # elif any(indicator in line.lower() for indicator in ['key finding', 'important', 'significant']):
972
+ # key_points.append(line)
973
+
974
+ # return key_points[:10] # Limit to top 10 points
975
+
976
+ # def _calculate_coverage_score(self, sources: List[SearchResult]) -> float:
977
+ # """Calculate how well sources cover the topic"""
978
+ # if not sources:
979
+ # return 0.0
980
+
981
+ # # Factors for coverage score
982
+ # source_diversity = len(set(s.source_type for s in sources))
983
+ # avg_relevance = sum(s.relevance for s in sources) / len(sources)
984
+ # source_count_factor = min(1.0, len(sources) / 10)
985
+
986
+ # coverage = (source_diversity / 5) * 0.3 + avg_relevance * 0.5 + source_count_factor * 0.2
987
+ # return min(1.0, coverage)
988
+
989
+ # def _get_enhanced_mock_summary(self, sources: List[SearchResult], context: str) -> Dict:
990
+ # """Generate enhanced mock summary using actual source content"""
991
+ # source_count = len(sources)
992
+ # source_types = set(s.source_type for s in sources)
993
+
994
+ # # Extract and analyze actual content from sources
995
+ # source_snippets = [s.snippet for s in sources if s.snippet]
996
+ # all_content = " ".join(source_snippets)
997
+
998
+ # # Analyze the actual content to create a smart summary
999
+ # if "sustainable energy" in context.lower() or "sustainable energy" in all_content.lower():
1000
+ # # Extract key information from the actual Perplexity results
1001
+ # key_concepts = []
1002
+ # if "renewable energy" in all_content.lower():
1003
+ # key_concepts.append("renewable energy adoption")
1004
+ # if "solar" in all_content.lower():
1005
+ # key_concepts.append("solar energy systems")
1006
+ # if "wind" in all_content.lower():
1007
+ # key_concepts.append("wind power integration")
1008
+ # if "urban" in all_content.lower():
1009
+ # key_concepts.append("urban environment applications")
1010
+ # if "environmental" in all_content.lower():
1011
+ # key_concepts.append("environmental impact reduction")
1012
+ # if "air quality" in all_content.lower() or "pollution" in all_content.lower():
1013
+ # key_concepts.append("air quality improvements")
1014
+ # if "decentralized" in all_content.lower():
1015
+ # key_concepts.append("decentralized energy systems")
1016
+
1017
+ # topic_summary = f"""Analysis of sustainable energy solutions for urban environments reveals significant opportunities for implementation and impact. Research from {source_count} sources demonstrates that {', '.join(key_concepts[:3])} are key focus areas driving innovation in this field.
1018
+
1019
+ # The findings highlight the crucial role of renewable energy sources, particularly solar and wind technologies, in addressing urban energy needs while minimizing environmental impacts. Studies emphasize that sustainable urban energy systems offer multiple benefits including reduced air pollution, improved public health outcomes, and decreased reliance on fossil fuels.
1020
+
1021
+ # Key developments include the advancement of decentralized energy production systems that enable localized energy generation, reducing transmission losses and environmental impacts. The research indicates growing adoption of integrated approaches that combine multiple renewable technologies with smart grid systems to optimize urban energy efficiency and sustainability."""
1022
+
1023
+ # extracted_points = []
1024
+ # if "renewable energy" in all_content.lower():
1025
+ # extracted_points.append("Renewable energy sources (solar, wind) are primary solutions for sustainable urban energy")
1026
+ # if "environmental" in all_content.lower():
1027
+ # extracted_points.append("Environmental benefits include reduced air pollution and improved public health")
1028
+ # if "decentralized" in all_content.lower():
1029
+ # extracted_points.append("Decentralized energy systems enable localized production and reduced transmission losses")
1030
+ # if "urban" in all_content.lower():
1031
+ # extracted_points.append("Urban environments present both challenges and opportunities for sustainable energy implementation")
1032
+ # if "adoption" in all_content.lower() or "implementation" in all_content.lower():
1033
+ # extracted_points.append("Growing adoption of sustainable energy technologies across urban areas globally")
1034
+
1035
+ # # Add general points if we didn't extract enough specific ones
1036
+ # while len(extracted_points) < 5:
1037
+ # extracted_points.extend([
1038
+ # f"Comprehensive analysis of {source_count} research sources provides robust evidence base",
1039
+ # f"Cross-platform research from {', '.join(source_types)} ensures diverse perspectives",
1040
+ # "Integration of multiple energy technologies shows promising results for urban applications",
1041
+ # "Policy and implementation frameworks are evolving to support sustainable energy adoption",
1042
+ # "Economic viability and environmental benefits align to drive continued innovation"
1043
+ # ])
1044
+
1045
+ # else:
1046
+ # # Generic but content-aware summary for other topics
1047
+ # topic_summary = f"""Based on comprehensive analysis of {source_count} research sources, this investigation reveals important insights into {context}. The research demonstrates significant developments and practical applications that have implications for stakeholders across multiple sectors.
1048
+
1049
+ # Current evidence from diverse information sources indicates growing momentum in this field, with innovative approaches and solutions being developed by organizations worldwide. The analysis identifies consistent patterns of progress, implementation, and adoption across different geographical regions and application areas.
1050
+
1051
+ # The research findings suggest that continued advancement in this domain offers substantial potential benefits, supported by improved methodologies, enhanced collaboration between institutions, and increasing recognition of the field's transformative impact on future development and innovation."""
1052
+
1053
+ # extracted_points = [
1054
+ # f"Analyzed {source_count} diverse sources for comprehensive coverage",
1055
+ # f"Information gathered from {len(source_types)} different platforms: {', '.join(source_types)}",
1056
+ # "Identified consistent patterns and emerging trends",
1057
+ # "Cross-referenced findings for reliability and accuracy",
1058
+ # "Highlighted practical implications and applications"
1059
+ # ]
1060
+
1061
+ # return {
1062
+ # "summary": topic_summary,
1063
+ # "key_points": extracted_points[:5], # Limit to 5 key points
1064
+ # "trends": [
1065
+ # "Increasing research activity and innovation",
1066
+ # "Growing practical applications and implementations",
1067
+ # "Enhanced collaboration between organizations",
1068
+ # "Focus on sustainable and scalable solutions"
1069
+ # ],
1070
+ # "research_gaps": [
1071
+ # "Long-term impact studies needed",
1072
+ # "Cross-regional comparative analysis",
1073
+ # "Integration challenges and solutions",
1074
+ # "Cost-benefit analysis requirements"
1075
+ # ],
1076
+ # "word_count": len(topic_summary.split()),
1077
+ # "coverage_score": self._calculate_coverage_score(sources)
1078
+ # }
1079
+
1080
+ # class EnhancedCitationAgent:
1081
+ # """Production citation generator with multiple formats"""
1082
+
1083
+ # def __init__(self):
1084
+ # self.citation_styles = ["APA", "MLA", "Chicago", "IEEE", "Harvard"]
1085
+
1086
+ # def generate_citations(self, sources: List[SearchResult]) -> Dict:
1087
+ # """Generate citations in multiple academic formats"""
1088
+ # citations = {
1089
+ # "apa": [],
1090
+ # "mla": [],
1091
+ # "chicago": [],
1092
+ # "ieee": [],
1093
+ # "harvard": []
1094
+ # }
1095
+
1096
+ # for i, source in enumerate(sources, 1):
1097
+ # # Extract domain for author estimation
1098
+ # domain = self._extract_domain(source.url)
1099
+ # author = self._estimate_author(source, domain)
1100
+ # date = self._estimate_date(source)
1101
+
1102
+ # # Generate citations in different formats
1103
+ # citations["apa"].append(self._format_apa(source, author, date))
1104
+ # citations["mla"].append(self._format_mla(source, author, date))
1105
+ # citations["chicago"].append(self._format_chicago(source, author, date))
1106
+ # citations["ieee"].append(self._format_ieee(source, i))
1107
+ # citations["harvard"].append(self._format_harvard(source, author, date))
1108
+
1109
+ # return {
1110
+ # "citations": citations,
1111
+ # "bibliography": self._create_bibliography(citations["apa"]),
1112
+ # "citation_count": len(sources),
1113
+ # "formats_available": self.citation_styles
1114
+ # }
1115
+
1116
+ # def _extract_domain(self, url: str) -> str:
1117
+ # """Extract domain from URL"""
1118
+ # try:
1119
+ # from urllib.parse import urlparse
1120
+ # return urlparse(url).netloc
1121
+ # except:
1122
+ # return "unknown.com"
1123
+
1124
+ # def _estimate_author(self, source: SearchResult, domain: str) -> str:
1125
+ # """Estimate author based on source and domain"""
1126
+ # if "arxiv" in domain:
1127
+ # return "Author, A."
1128
+ # elif "scholar.google" in domain:
1129
+ # return "Researcher, R."
1130
+ # elif "perplexity" in domain:
1131
+ # return "Perplexity AI"
1132
+ # elif any(news in domain for news in ["cnn", "bbc", "reuters", "ap"]):
1133
+ # return f"{domain.split('.')[0].upper()} Editorial Team"
1134
+ # else:
1135
+ # return f"{domain.replace('www.', '').split('.')[0].title()}"
1136
+
1137
+ # def _estimate_date(self, source: SearchResult) -> str:
1138
+ # """Estimate publication date"""
1139
+ # if source.timestamp:
1140
+ # try:
1141
+ # dt = datetime.fromisoformat(source.timestamp.replace('Z', '+00:00'))
1142
+ # return dt.strftime("%Y")
1143
+ # except:
1144
+ # pass
1145
+ # return datetime.now().strftime("%Y")
1146
+
1147
+ # def _format_apa(self, source: SearchResult, author: str, date: str) -> str:
1148
+ # """Format citation in APA style"""
1149
+ # title = source.title.rstrip('.')
1150
+ # return f"{author} ({date}). {title}. Retrieved from {source.url}"
1151
+
1152
+ # def _format_mla(self, source: SearchResult, author: str, date: str) -> str:
1153
+ # """Format citation in MLA style"""
1154
+ # title = source.title.rstrip('.')
1155
+ # access_date = datetime.now().strftime("%d %b %Y")
1156
+ # return f'{author}. "{title}." Web. {access_date}. <{source.url}>.'
1157
+
1158
+ # def _format_chicago(self, source: SearchResult, author: str, date: str) -> str:
1159
+ # """Format citation in Chicago style"""
1160
+ # title = source.title.rstrip('.')
1161
+ # access_date = datetime.now().strftime("%B %d, %Y")
1162
+ # return f'{author}. "{title}." Accessed {access_date}. {source.url}.'
1163
+
1164
+ # def _format_ieee(self, source: SearchResult, ref_num: int) -> str:
1165
+ # """Format citation in IEEE style"""
1166
+ # title = source.title.rstrip('.')
1167
+ # return f'[{ref_num}] "{title}," [Online]. Available: {source.url}'
1168
+
1169
+ # def _format_harvard(self, source: SearchResult, author: str, date: str) -> str:
1170
+ # """Format citation in Harvard style"""
1171
+ # title = source.title.rstrip('.')
1172
+ # return f"{author}, {date}. {title}. [online] Available at: {source.url}"
1173
+
1174
+ # def _create_bibliography(self, apa_citations: List[str]) -> str:
1175
+ # """Create formatted bibliography"""
1176
+ # if not apa_citations:
1177
+ # return "# Bibliography\n\nNo sources available for citation."
1178
+
1179
+ # bibliography = "# Bibliography\n\n"
1180
+ # for i, citation in enumerate(apa_citations, 1):
1181
+ # bibliography += f"{i}. {citation}\n\n"
1182
+
1183
+ # return bibliography
modal_app.py ADDED
@@ -0,0 +1,218 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # modal_app.py - Modal deployment for ResearchCopilot
2
+ import modal
3
+ import os
4
+ from pathlib import Path
5
+
6
+ # Create Modal app
7
+ app = modal.App("research-copilot")
8
+
9
+ # Define the environment with required packages
10
+ image = modal.Image.debian_slim(python_version="3.11").pip_install([
11
+ "gradio>=4.0.0",
12
+ "httpx",
13
+ "aiohttp",
14
+ "python-dotenv",
15
+ "requests",
16
+ "beautifulsoup4",
17
+ "openai", # For potential LLM integrations
18
+ "anthropic", # For Claude integration
19
+ ])
20
+
21
+ # Mount the application code
22
+ code_mount = modal.Mount.from_local_dir(
23
+ ".",
24
+ remote_path="/app",
25
+ condition=lambda path: path.suffix in [".py", ".txt", ".md"]
26
+ )
27
+
28
+ @app.function(
29
+ image=image,
30
+ mounts=[code_mount],
31
+ allow_concurrent_inputs=100,
32
+ timeout=3600, # 1 hour timeout for long research tasks
33
+ secrets=[
34
+ modal.Secret.from_name("research-copilot-secrets"), # API keys
35
+ ]
36
+ )
37
+ @modal.web_server(port=7860, startup_timeout=60)
38
+ def run_gradio_app():
39
+ """Run the ResearchCopilot Gradio application"""
40
+ import sys
41
+ sys.path.append("/app")
42
+
43
+ # Import and run the main application
44
+ from ResearchCopilot.research_copilot import create_interface
45
+
46
+ app = create_interface()
47
+ app.launch(
48
+ server_name="0.0.0.0",
49
+ server_port=7860,
50
+ share=False, # Modal handles the sharing
51
+ show_error=True,
52
+ enable_queue=True
53
+ )
54
+
55
+ # Enhanced retriever with real API integrations
56
+ @app.function(
57
+ image=image,
58
+ secrets=[modal.Secret.from_name("research-copilot-secrets")],
59
+ timeout=300
60
+ )
61
+ async def search_perplexity(query: str, num_results: int = 5):
62
+ """Search using Perplexity API"""
63
+ import httpx
64
+ import os
65
+
66
+ api_key = os.getenv("PERPLEXITY_API_KEY")
67
+ if not api_key:
68
+ # Return mock data if no API key
69
+ return {
70
+ "results": [
71
+ {
72
+ "title": f"Mock Result for: {query}",
73
+ "url": "https://example.com/mock",
74
+ "snippet": f"This is a mock result for the query: {query}",
75
+ "source_type": "web"
76
+ }
77
+ ]
78
+ }
79
+
80
+ async with httpx.AsyncClient() as client:
81
+ try:
82
+ response = await client.post(
83
+ "https://api.perplexity.ai/chat/completions",
84
+ headers={
85
+ "Authorization": f"Bearer {api_key}",
86
+ "Content-Type": "application/json"
87
+ },
88
+ json={
89
+ "model": "llama-3.1-sonar-small-128k-online",
90
+ "messages": [
91
+ {"role": "user", "content": f"Search for: {query}"}
92
+ ],
93
+ "max_tokens": 1000,
94
+ "temperature": 0.2,
95
+ "return_citations": True
96
+ }
97
+ )
98
+
99
+ if response.status_code == 200:
100
+ data = response.json()
101
+ return {"results": data.get("choices", [{}])[0].get("message", {}).get("content", "")}
102
+ else:
103
+ return {"error": f"API error: {response.status_code}"}
104
+
105
+ except Exception as e:
106
+ return {"error": str(e)}
107
+
108
+ @app.function(
109
+ image=image,
110
+ secrets=[modal.Secret.from_name("research-copilot-secrets")],
111
+ timeout=300
112
+ )
113
+ async def search_google(query: str, num_results: int = 10):
114
+ """Search using Google Custom Search API"""
115
+ import httpx
116
+ import os
117
+
118
+ api_key = os.getenv("GOOGLE_API_KEY")
119
+ search_engine_id = os.getenv("GOOGLE_SEARCH_ENGINE_ID")
120
+
121
+ if not api_key or not search_engine_id:
122
+ # Return mock data if no API keys
123
+ return {
124
+ "results": [
125
+ {
126
+ "title": f"Google Search: {query}",
127
+ "url": "https://example.com/google-mock",
128
+ "snippet": f"Mock Google search result for: {query}",
129
+ "source_type": "web"
130
+ }
131
+ ]
132
+ }
133
+
134
+ async with httpx.AsyncClient() as client:
135
+ try:
136
+ response = await client.get(
137
+ "https://www.googleapis.com/customsearch/v1",
138
+ params={
139
+ "key": api_key,
140
+ "cx": search_engine_id,
141
+ "q": query,
142
+ "num": min(num_results, 10)
143
+ }
144
+ )
145
+
146
+ if response.status_code == 200:
147
+ data = response.json()
148
+ results = []
149
+ for item in data.get("items", []):
150
+ results.append({
151
+ "title": item.get("title", ""),
152
+ "url": item.get("link", ""),
153
+ "snippet": item.get("snippet", ""),
154
+ "source_type": "web"
155
+ })
156
+ return {"results": results}
157
+ else:
158
+ return {"error": f"Google API error: {response.status_code}"}
159
+
160
+ except Exception as e:
161
+ return {"error": str(e)}
162
+
163
+ @app.function(
164
+ image=image,
165
+ secrets=[modal.Secret.from_name("research-copilot-secrets")],
166
+ timeout=600
167
+ )
168
+ async def summarize_with_claude(content: str, context: str = ""):
169
+ """Summarize content using Claude API"""
170
+ import httpx
171
+ import os
172
+
173
+ api_key = os.getenv("ANTHROPIC_API_KEY")
174
+ if not api_key:
175
+ # Return mock summary if no API key
176
+ return {
177
+ "summary": f"Mock summary of content: {content[:100]}...",
178
+ "key_points": ["Point 1", "Point 2", "Point 3"]
179
+ }
180
+
181
+ async with httpx.AsyncClient() as client:
182
+ try:
183
+ response = await client.post(
184
+ "https://api.anthropic.com/v1/messages",
185
+ headers={
186
+ "x-api-key": api_key,
187
+ "Content-Type": "application/json",
188
+ "anthropic-version": "2023-06-01"
189
+ },
190
+ json={
191
+ "model": "claude-3-sonnet-20240229",
192
+ "max_tokens": 1000,
193
+ "messages": [
194
+ {
195
+ "role": "user",
196
+ "content": f"Summarize this content and extract key points:\n\nContext: {context}\n\nContent: {content}"
197
+ }
198
+ ]
199
+ }
200
+ )
201
+
202
+ if response.status_code == 200:
203
+ data = response.json()
204
+ content_text = data.get("content", [{}])[0].get("text", "")
205
+ return {
206
+ "summary": content_text,
207
+ "key_points": ["AI-generated summary", "Professional analysis", "Comprehensive overview"]
208
+ }
209
+ else:
210
+ return {"error": f"Claude API error: {response.status_code}"}
211
+
212
+ except Exception as e:
213
+ return {"error": str(e)}
214
+
215
+ if __name__ == "__main__":
216
+ # For local development
217
+ import subprocess
218
+ subprocess.run(["python", "research_copilot.py"])
requirements.txt ADDED
@@ -0,0 +1,18 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # ResearchCopilot Dependencies
2
+ gradio>=4.0.0
3
+ modal>=0.60.0
4
+ aiohttp>=3.8.0
5
+ httpx>=0.24.0
6
+ asyncio-throttle>=1.0.0
7
+ python-dotenv>=1.0.0
8
+ beautifulsoup4>=4.12.0
9
+ lxml>=4.9.0
10
+ requests>=2.31.0
11
+ openai>=1.0.0
12
+ anthropic>=0.20.0
13
+ pydantic>=2.0.0
14
+ tenacity>=8.2.0
15
+ typing-extensions>=4.5.0
16
+ dataclasses-json>=0.6.0
17
+ urllib3>=2.0.0
18
+ certifi>=2023.7.22
research_copilot.py ADDED
@@ -0,0 +1,911 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # ResearchCopilot - Multi-Agent Research System
2
+ # Track 3: Agentic Demo Showcase - Gradio MCP Hackathon 2025
3
+
4
+ import gradio as gr
5
+ import asyncio
6
+ import json
7
+ import time
8
+ import os
9
+ from datetime import datetime
10
+ from typing import Dict, List, Optional, Tuple
11
+ from dataclasses import dataclass, asdict
12
+ from enum import Enum
13
+ import logging
14
+ import re
15
+ from abc import ABC, abstractmethod
16
+
17
+ # Load environment variables from .env file
18
+ try:
19
+ from dotenv import load_dotenv
20
+ load_dotenv()
21
+ print("✅ Environment variables loaded from .env file")
22
+ except ImportError:
23
+ print("⚠️ python-dotenv not installed. Install with: pip install python-dotenv")
24
+ except Exception as e:
25
+ print(f"⚠️ Could not load .env file: {e}")
26
+
27
+ # Import enhanced agents with real API integrations
28
+ try:
29
+ from ResearchCopilot.enhanced_agents import EnhancedRetrieverAgent, EnhancedSummarizerAgent, EnhancedCitationAgent, SearchResult
30
+ ENHANCED_AGENTS_AVAILABLE = True
31
+ print("✅ Enhanced agents loaded successfully")
32
+ except ImportError:
33
+ print("❌ Enhanced agents not found - using basic agents with mock data")
34
+ ENHANCED_AGENTS_AVAILABLE = False
35
+
36
+ # Configure logging
37
+ logging.basicConfig(level=logging.INFO)
38
+ logger = logging.getLogger(__name__)
39
+
40
+ # Debug: Check if API keys are loaded
41
+ print("\n🔑 API Key Status:")
42
+ print(f"Perplexity API: {'✅ Loaded' if os.getenv('PERPLEXITY_API_KEY') else '❌ Missing'}")
43
+ print(f"Google API: {'✅ Loaded' if os.getenv('GOOGLE_API_KEY') else '❌ Missing'}")
44
+ print(f"Google Search ID: {'✅ Loaded' if os.getenv('GOOGLE_SEARCH_ENGINE_ID') else '❌ Missing'}")
45
+ print(f"Claude API: {'✅ Loaded' if os.getenv('ANTHROPIC_API_KEY') else '❌ Missing'}")
46
+ print(f"OpenAI API: {'✅ Loaded (fallback)' if os.getenv('OPENAI_API_KEY') else '❌ Missing'}")
47
+ print("=" * 50)
48
+
49
+ class AgentStatus(Enum):
50
+ IDLE = "idle"
51
+ THINKING = "thinking"
52
+ WORKING = "working"
53
+ COMPLETED = "completed"
54
+ ERROR = "error"
55
+
56
+ @dataclass
57
+ class ResearchTask:
58
+ id: str
59
+ description: str
60
+ priority: int
61
+ dependencies: List[str]
62
+ status: str = "pending"
63
+ results: Optional[Dict] = None
64
+ created_at: str = None
65
+
66
+ def __post_init__(self):
67
+ if self.created_at is None:
68
+ self.created_at = datetime.now().isoformat()
69
+
70
+ @dataclass
71
+ class AgentMessage:
72
+ agent_id: str
73
+ message: str
74
+ timestamp: str
75
+ status: AgentStatus
76
+ data: Optional[Dict] = None
77
+
78
+ class BaseAgent(ABC):
79
+ def __init__(self, agent_id: str, name: str):
80
+ self.agent_id = agent_id
81
+ self.name = name
82
+ self.status = AgentStatus.IDLE
83
+ self.messages = []
84
+
85
+ def log_message(self, message: str, data: Optional[Dict] = None):
86
+ msg = AgentMessage(
87
+ agent_id=self.agent_id,
88
+ message=message,
89
+ timestamp=datetime.now().isoformat(),
90
+ status=self.status,
91
+ data=data
92
+ )
93
+ self.messages.append(msg)
94
+ logger.info(f"[{self.name}] {message}")
95
+ return msg
96
+
97
+ @abstractmethod
98
+ async def process(self, input_data: Dict) -> Dict:
99
+ pass
100
+
101
+ class PlannerAgent(BaseAgent):
102
+ def __init__(self):
103
+ super().__init__("planner", "Research Planner")
104
+
105
+ async def process(self, input_data: Dict) -> Dict:
106
+ self.status = AgentStatus.THINKING
107
+ query = input_data.get("query", "")
108
+
109
+ self.log_message(f"Analyzing research query: {query}")
110
+ await asyncio.sleep(1) # Simulate thinking time
111
+
112
+ self.status = AgentStatus.WORKING
113
+
114
+ # Simulate intelligent task breakdown
115
+ tasks = self._create_research_plan(query)
116
+
117
+ self.log_message(f"Created research plan with {len(tasks)} tasks")
118
+
119
+ self.status = AgentStatus.COMPLETED
120
+
121
+ return {
122
+ "tasks": tasks,
123
+ "strategy": self._generate_strategy(query),
124
+ "estimated_time": len(tasks) * 2,
125
+ "complexity": self._assess_complexity(query)
126
+ }
127
+
128
+ def _create_research_plan(self, query: str) -> List[ResearchTask]:
129
+ # Intelligent task decomposition based on query analysis
130
+ tasks = []
131
+
132
+ # Core research task
133
+ tasks.append(ResearchTask(
134
+ id="core_search",
135
+ description=f"Primary research on: {query}",
136
+ priority=1,
137
+ dependencies=[]
138
+ ))
139
+
140
+ # If query mentions specific domains, add specialized searches
141
+ if any(term in query.lower() for term in ["academic", "paper", "study", "research"]):
142
+ tasks.append(ResearchTask(
143
+ id="academic_search",
144
+ description="Search academic databases and papers",
145
+ priority=2,
146
+ dependencies=["core_search"]
147
+ ))
148
+
149
+ # If query is about recent events, add news search
150
+ if any(term in query.lower() for term in ["recent", "latest", "current", "2024", "2025"]):
151
+ tasks.append(ResearchTask(
152
+ id="news_search",
153
+ description="Search for recent news and updates",
154
+ priority=2,
155
+ dependencies=["core_search"]
156
+ ))
157
+
158
+ # Always add background context
159
+ tasks.append(ResearchTask(
160
+ id="context_search",
161
+ description="Gather background context and definitions",
162
+ priority=3,
163
+ dependencies=["core_search"]
164
+ ))
165
+
166
+ return tasks
167
+
168
+ def _generate_strategy(self, query: str) -> str:
169
+ if len(query.split()) < 5:
170
+ return "Focused search strategy for specific topic"
171
+ elif any(word in query.lower() for word in ["compare", "vs", "versus", "difference"]):
172
+ return "Comparative analysis strategy"
173
+ elif "how" in query.lower():
174
+ return "Process-oriented research strategy"
175
+ else:
176
+ return "Comprehensive exploratory strategy"
177
+
178
+ def _assess_complexity(self, query: str) -> str:
179
+ word_count = len(query.split())
180
+ if word_count < 5:
181
+ return "Low"
182
+ elif word_count < 10:
183
+ return "Medium"
184
+ else:
185
+ return "High"
186
+
187
+ class RetrieverAgent(BaseAgent):
188
+ def __init__(self):
189
+ super().__init__("retriever", "Information Retriever")
190
+ self.search_apis = ["perplexity", "google", "academic"]
191
+ # Use enhanced agent if available
192
+ if ENHANCED_AGENTS_AVAILABLE:
193
+ self.enhanced_agent = None
194
+
195
+ async def process(self, input_data: Dict) -> Dict:
196
+ self.status = AgentStatus.THINKING
197
+ task = input_data.get("task")
198
+
199
+ self.log_message(f"Processing retrieval task: {task.description}")
200
+
201
+ self.status = AgentStatus.WORKING
202
+
203
+ # Use enhanced agents with real APIs if available
204
+ if ENHANCED_AGENTS_AVAILABLE:
205
+ try:
206
+ async with EnhancedRetrieverAgent() as enhanced_retriever:
207
+ # Try real API search first
208
+ if "academic" in task.id:
209
+ sources = await enhanced_retriever.search_academic(task.description, 5)
210
+ elif "news" in task.id:
211
+ sources = await enhanced_retriever.search_google(f"recent news {task.description}", 5)
212
+ else:
213
+ # Use Perplexity for main searches
214
+ sources = await enhanced_retriever.search_perplexity(task.description, 5)
215
+ if not sources: # Fallback to Google
216
+ sources = await enhanced_retriever.search_google(task.description, 5)
217
+
218
+ if sources:
219
+ self.log_message(f"Retrieved {len(sources)} sources using real APIs")
220
+ self.status = AgentStatus.COMPLETED
221
+
222
+ # Convert SearchResult objects to dict format
223
+ results = []
224
+ for source in sources:
225
+ results.append({
226
+ "title": source.title,
227
+ "url": source.url,
228
+ "snippet": source.snippet,
229
+ "source_type": source.source_type,
230
+ "relevance": source.relevance
231
+ })
232
+
233
+ return {
234
+ "sources": results,
235
+ "search_strategy": self._get_search_strategy(task),
236
+ "confidence": self._calculate_confidence(results)
237
+ }
238
+ except Exception as e:
239
+ self.log_message(f"API search failed, using mock data: {str(e)}")
240
+
241
+ # Fallback to mock data
242
+ results = await self._perform_searches(task)
243
+
244
+ self.log_message(f"Retrieved {len(results)} sources (mock data)")
245
+
246
+ self.status = AgentStatus.COMPLETED
247
+
248
+ return {
249
+ "sources": results,
250
+ "search_strategy": self._get_search_strategy(task),
251
+ "confidence": self._calculate_confidence(results)
252
+ }
253
+
254
+ async def _perform_searches(self, task: ResearchTask) -> List[Dict]:
255
+ # Simulate different search strategies based on task type
256
+ await asyncio.sleep(2) # Simulate API call time
257
+
258
+ # Mock search results with realistic structure
259
+ results = []
260
+
261
+ if "academic" in task.id:
262
+ results.extend([
263
+ {
264
+ "title": "Academic Paper on Topic",
265
+ "url": "https://arxiv.org/paper/123",
266
+ "snippet": "Comprehensive study showing key findings...",
267
+ "source_type": "academic",
268
+ "relevance": 0.95
269
+ },
270
+ {
271
+ "title": "Research Publication",
272
+ "url": "https://journals.example.com/article/456",
273
+ "snippet": "Peer-reviewed research demonstrating...",
274
+ "source_type": "academic",
275
+ "relevance": 0.88
276
+ }
277
+ ])
278
+
279
+ if "news" in task.id:
280
+ results.extend([
281
+ {
282
+ "title": "Recent Development in Field",
283
+ "url": "https://news.example.com/article/789",
284
+ "snippet": "Latest updates show significant progress...",
285
+ "source_type": "news",
286
+ "relevance": 0.82
287
+ }
288
+ ])
289
+
290
+ # Always add some general results
291
+ results.extend([
292
+ {
293
+ "title": "Comprehensive Overview",
294
+ "url": "https://example.com/overview",
295
+ "snippet": "Detailed analysis covering multiple aspects...",
296
+ "source_type": "general",
297
+ "relevance": 0.79
298
+ },
299
+ {
300
+ "title": "Expert Analysis",
301
+ "url": "https://expert.example.com/analysis",
302
+ "snippet": "Professional insights and recommendations...",
303
+ "source_type": "expert",
304
+ "relevance": 0.85
305
+ }
306
+ ])
307
+
308
+ return results
309
+
310
+ def _get_search_strategy(self, task: ResearchTask) -> str:
311
+ if "academic" in task.id:
312
+ return "Academic database search with peer-review filter"
313
+ elif "news" in task.id:
314
+ return "Recent news aggregation with date filtering"
315
+ else:
316
+ return "Multi-source comprehensive search"
317
+
318
+ def _calculate_confidence(self, results: List[Dict]) -> float:
319
+ if not results:
320
+ return 0.0
321
+
322
+ avg_relevance = sum(r.get("relevance", 0) for r in results) / len(results)
323
+ source_diversity = len(set(r.get("source_type") for r in results))
324
+
325
+ return min(1.0, avg_relevance * 0.7 + (source_diversity / 5) * 0.3)
326
+
327
+ class SummarizerAgent(BaseAgent):
328
+ def __init__(self):
329
+ super().__init__("summarizer", "Content Summarizer")
330
+
331
+ async def process(self, input_data: Dict) -> Dict:
332
+ self.status = AgentStatus.THINKING
333
+ sources = input_data.get("sources", [])
334
+
335
+ self.log_message(f"Summarizing {len(sources)} sources")
336
+
337
+ self.status = AgentStatus.WORKING
338
+
339
+ # Use enhanced agents with real APIs if available
340
+ if ENHANCED_AGENTS_AVAILABLE:
341
+ try:
342
+ # Create enhanced summarizer (no async context manager needed)
343
+ enhanced_summarizer = EnhancedSummarizerAgent()
344
+
345
+ # Convert dict sources to SearchResult objects
346
+ search_results = []
347
+ for source in sources:
348
+ search_results.append(SearchResult(
349
+ title=source.get("title", ""),
350
+ url=source.get("url", ""),
351
+ snippet=source.get("snippet", ""),
352
+ source_type=source.get("source_type", "web"),
353
+ relevance=source.get("relevance", 0.5)
354
+ ))
355
+
356
+ # Use synchronous call (KarmaCheck style)
357
+ summary_result = enhanced_summarizer.summarize_with_claude(
358
+ search_results,
359
+ "Research query analysis"
360
+ )
361
+
362
+ if summary_result and "summary" in summary_result:
363
+ # Get the actual API used from the result
364
+ api_used = summary_result.get("api_used", "AI API")
365
+ self.log_message(f"Summary generated using {api_used}")
366
+ self.status = AgentStatus.COMPLETED
367
+ return summary_result
368
+
369
+ except Exception as e:
370
+ self.log_message(f"API summarization failed, using mock summary: {str(e)}")
371
+
372
+ # Fallback to mock summary
373
+ await asyncio.sleep(2) # Simulate processing time
374
+
375
+ summary = self._generate_summary(sources)
376
+ key_points = self._extract_key_points(sources)
377
+
378
+ self.log_message("Summary generation completed (mock data)")
379
+
380
+ self.status = AgentStatus.COMPLETED
381
+
382
+ return {
383
+ "summary": summary,
384
+ "key_points": key_points,
385
+ "word_count": len(summary.split()),
386
+ "coverage_score": self._calculate_coverage(sources)
387
+ }
388
+
389
+ def _generate_summary(self, sources: List[Dict]) -> str:
390
+ # Simulate intelligent summarization
391
+ if not sources:
392
+ return "No sources available for summarization."
393
+
394
+ summary_parts = []
395
+
396
+ # Group sources by type
397
+ academic_sources = [s for s in sources if s.get("source_type") == "academic"]
398
+ news_sources = [s for s in sources if s.get("source_type") == "news"]
399
+ general_sources = [s for s in sources if s.get("source_type") == "general"]
400
+
401
+ if academic_sources:
402
+ summary_parts.append(
403
+ "Academic research indicates significant developments in this field. "
404
+ "Peer-reviewed studies demonstrate consistent findings across multiple "
405
+ "research groups, with high confidence in the methodological approaches used."
406
+ )
407
+
408
+ if news_sources:
409
+ summary_parts.append(
410
+ "Recent developments show ongoing progress and public interest. "
411
+ "Current trends suggest continued evolution in this area with "
412
+ "practical implications for stakeholders."
413
+ )
414
+
415
+ if general_sources:
416
+ summary_parts.append(
417
+ "Comprehensive analysis reveals multiple perspectives and approaches. "
418
+ "Expert opinions converge on key principles while acknowledging "
419
+ "areas that require further investigation."
420
+ )
421
+
422
+ return " ".join(summary_parts)
423
+
424
+ def _extract_key_points(self, sources: List[Dict]) -> List[str]:
425
+ key_points = []
426
+
427
+ if any(s.get("source_type") == "academic" for s in sources):
428
+ key_points.append("Peer-reviewed research supports main conclusions")
429
+
430
+ if any(s.get("relevance", 0) > 0.9 for s in sources):
431
+ key_points.append("High-relevance sources identified")
432
+
433
+ if len(sources) > 3:
434
+ key_points.append("Multiple independent sources confirm findings")
435
+
436
+ key_points.extend([
437
+ "Cross-referenced information for accuracy",
438
+ "Balanced perspective from diverse sources",
439
+ "Current information reflects latest developments"
440
+ ])
441
+
442
+ return key_points
443
+
444
+ def _calculate_coverage(self, sources: List[Dict]) -> float:
445
+ if not sources:
446
+ return 0.0
447
+
448
+ source_types = set(s.get("source_type") for s in sources)
449
+ high_relevance = sum(1 for s in sources if s.get("relevance", 0) > 0.8)
450
+
451
+ coverage = (len(source_types) / 4) * 0.5 + (high_relevance / len(sources)) * 0.5
452
+ return min(1.0, coverage)
453
+
454
+ class CitationAgent(BaseAgent):
455
+ def __init__(self):
456
+ super().__init__("citation", "Citation Generator")
457
+
458
+ async def process(self, input_data: Dict) -> Dict:
459
+ self.status = AgentStatus.THINKING
460
+ sources = input_data.get("sources", [])
461
+
462
+ self.log_message(f"Generating citations for {len(sources)} sources")
463
+
464
+ self.status = AgentStatus.WORKING
465
+
466
+ # Use enhanced citation agent if available
467
+ if ENHANCED_AGENTS_AVAILABLE:
468
+ try:
469
+ enhanced_citation = EnhancedCitationAgent()
470
+
471
+ # Convert dict sources to SearchResult objects
472
+ search_results = []
473
+ for source in sources:
474
+ search_results.append(SearchResult(
475
+ title=source.get("title", ""),
476
+ url=source.get("url", ""),
477
+ snippet=source.get("snippet", ""),
478
+ source_type=source.get("source_type", "web"),
479
+ relevance=source.get("relevance", 0.5)
480
+ ))
481
+
482
+ citation_result = enhanced_citation.generate_citations(search_results)
483
+
484
+ if citation_result:
485
+ self.log_message("Citations generated with multiple formats")
486
+ self.status = AgentStatus.COMPLETED
487
+ return citation_result
488
+
489
+ except Exception as e:
490
+ self.log_message(f"Enhanced citation failed, using basic: {str(e)}")
491
+
492
+ # Fallback to basic citation
493
+ await asyncio.sleep(1) # Simulate processing time
494
+
495
+ citations = self._generate_citations(sources)
496
+ bibliography = self._create_bibliography(sources)
497
+
498
+ self.log_message("Citation generation completed")
499
+
500
+ self.status = AgentStatus.COMPLETED
501
+
502
+ return {
503
+ "citations": citations,
504
+ "bibliography": bibliography,
505
+ "citation_count": len(citations),
506
+ "formats": ["APA", "MLA", "Chicago"]
507
+ }
508
+
509
+ def _generate_citations(self, sources: List[Dict]) -> List[Dict]:
510
+ citations = []
511
+
512
+ for i, source in enumerate(sources, 1):
513
+ citation = {
514
+ "id": i,
515
+ "apa": self._format_apa(source),
516
+ "mla": self._format_mla(source),
517
+ "chicago": self._format_chicago(source),
518
+ "source": source
519
+ }
520
+ citations.append(citation)
521
+
522
+ return citations
523
+
524
+ def _format_apa(self, source: Dict) -> str:
525
+ title = source.get("title", "Unknown Title")
526
+ url = source.get("url", "")
527
+ return f"{title}. Retrieved from {url}"
528
+
529
+ def _format_mla(self, source: Dict) -> str:
530
+ title = source.get("title", "Unknown Title")
531
+ url = source.get("url", "")
532
+ return f'"{title}." Web. {datetime.now().strftime("%d %b %Y")}. <{url}>'
533
+
534
+ def _format_chicago(self, source: Dict) -> str:
535
+ title = source.get("title", "Unknown Title")
536
+ url = source.get("url", "")
537
+ return f'"{title}." Accessed {datetime.now().strftime("%B %d, %Y")}. {url}.'
538
+
539
+ def _create_bibliography(self, sources: List[Dict]) -> str:
540
+ if not sources:
541
+ return "No sources to cite."
542
+
543
+ bib_entries = []
544
+ for source in sources:
545
+ bib_entries.append(self._format_apa(source))
546
+
547
+ return "\n\n".join(bib_entries)
548
+
549
+ class ResearchOrchestrator:
550
+ def __init__(self):
551
+ self.planner = PlannerAgent()
552
+ self.retriever = RetrieverAgent()
553
+ self.summarizer = SummarizerAgent()
554
+ self.citation_gen = CitationAgent()
555
+ self.research_state = {}
556
+ self.activity_log = []
557
+
558
+ async def conduct_research(self, query: str, progress_callback=None) -> Dict:
559
+ """Main research orchestration method"""
560
+
561
+ self.activity_log = []
562
+ self.research_state = {"query": query, "start_time": datetime.now().isoformat()}
563
+
564
+ try:
565
+ # Step 1: Planning
566
+ if progress_callback:
567
+ progress_callback("🎯 Planning research approach...", 10)
568
+
569
+ plan_result = await self.planner.process({"query": query})
570
+ self.research_state["plan"] = plan_result
571
+ self._log_activity("Planning completed", self.planner.messages[-1])
572
+
573
+ # Step 2: Information Retrieval
574
+ if progress_callback:
575
+ progress_callback("🔍 Gathering information...", 30)
576
+
577
+ all_sources = []
578
+ tasks = plan_result["tasks"]
579
+
580
+ for i, task in enumerate(tasks):
581
+ if progress_callback:
582
+ progress_callback(f"🔍 Processing: {task.description}", 30 + (i * 20))
583
+
584
+ retrieval_result = await self.retriever.process({"task": task})
585
+ all_sources.extend(retrieval_result["sources"])
586
+ self._log_activity(f"Retrieved sources for: {task.description}",
587
+ self.retriever.messages[-1])
588
+
589
+ self.research_state["sources"] = all_sources
590
+
591
+ # Step 3: Summarization
592
+ if progress_callback:
593
+ progress_callback("📝 Analyzing and summarizing...", 70)
594
+
595
+ summary_result = await self.summarizer.process({"sources": all_sources})
596
+ self.research_state["summary"] = summary_result
597
+ self._log_activity("Summarization completed", self.summarizer.messages[-1])
598
+
599
+ # Step 4: Citation Generation
600
+ if progress_callback:
601
+ progress_callback("📚 Generating citations...", 90)
602
+
603
+ citation_result = await self.citation_gen.process({"sources": all_sources})
604
+ self.research_state["citations"] = citation_result
605
+ self._log_activity("Citations generated", self.citation_gen.messages[-1])
606
+
607
+ if progress_callback:
608
+ progress_callback("✅ Research completed!", 100)
609
+
610
+ self.research_state["completion_time"] = datetime.now().isoformat()
611
+ self.research_state["status"] = "completed"
612
+
613
+ return self.research_state
614
+
615
+ except Exception as e:
616
+ logger.error(f"Research failed: {str(e)}")
617
+ self.research_state["status"] = "error"
618
+ self.research_state["error"] = str(e)
619
+ return self.research_state
620
+
621
+ def _log_activity(self, description: str, agent_message: AgentMessage):
622
+ activity = {
623
+ "timestamp": datetime.now().isoformat(),
624
+ "description": description,
625
+ "agent": agent_message.agent_id,
626
+ "details": agent_message.message
627
+ }
628
+ self.activity_log.append(activity)
629
+
630
+ def get_activity_log(self) -> List[Dict]:
631
+ return self.activity_log
632
+
633
+ # Global orchestrator instance
634
+ orchestrator = ResearchOrchestrator()
635
+
636
+ def format_research_results(research_state: Dict) -> Tuple[str, str, str, str]:
637
+ """Format research results for Gradio display"""
638
+
639
+ if research_state.get("status") == "error":
640
+ error_msg = f"❌ Research failed: {research_state.get('error', 'Unknown error')}"
641
+ return error_msg, "", "", ""
642
+
643
+ if research_state.get("status") != "completed":
644
+ return "Research in progress...", "", "", ""
645
+
646
+ # Format summary
647
+ summary_data = research_state.get("summary", {})
648
+ summary_text = f"""# Research Summary
649
+
650
+ {summary_data.get('summary', 'No summary available')}
651
+
652
+ ## Key Findings
653
+ """
654
+
655
+ for point in summary_data.get('key_points', []):
656
+ summary_text += f"• {point}\n"
657
+
658
+ summary_text += f"""
659
+ ## Research Metrics
660
+ - Sources analyzed: {len(research_state.get('sources', []))}
661
+ - Summary length: {summary_data.get('word_count', 0)} words
662
+ - Coverage score: {summary_data.get('coverage_score', 0):.2f}
663
+ """
664
+
665
+ # Format sources
666
+ sources = research_state.get("sources", [])
667
+ sources_text = "# Sources Found\n\n"
668
+
669
+ for i, source in enumerate(sources, 1):
670
+ sources_text += f"""## {i}. {source.get('title', 'Unknown Title')}
671
+ **URL:** {source.get('url', 'N/A')}
672
+ **Type:** {source.get('source_type', 'Unknown')}
673
+ **Relevance:** {source.get('relevance', 0):.2f}
674
+ **Summary:** {source.get('snippet', 'No summary available')}
675
+
676
+ ---
677
+
678
+ """
679
+
680
+ # Format citations
681
+ citations_data = research_state.get("citations", {})
682
+ citations_text = ""
683
+
684
+ # Check if we have citations data
685
+ if citations_data and isinstance(citations_data, dict):
686
+ bibliography = citations_data.get('bibliography')
687
+ if bibliography and bibliography.strip():
688
+ citations_text += bibliography
689
+ else:
690
+ # Fallback: create bibliography from sources if citations failed
691
+ sources = research_state.get("sources", [])
692
+ if sources:
693
+ citations_text += "## Sources Referenced:\n\n"
694
+ for i, source in enumerate(sources, 1):
695
+ title = source.get("title", "Unknown Title")
696
+ url = source.get("url", "")
697
+ source_type = source.get("source_type", "web")
698
+
699
+ citations_text += f"**[{i}]** {title} \n"
700
+ citations_text += f"*Source:* {source_type.title()} \n"
701
+ citations_text += f"*URL:* {url} \n\n"
702
+ else:
703
+ citations_text += "No sources available for citation."
704
+ else:
705
+ # Create citations from sources directly
706
+ sources = research_state.get("sources", [])
707
+ if sources:
708
+ citations_text += "## Research Sources:\n\n"
709
+ for i, source in enumerate(sources, 1):
710
+ title = source.get("title", "Unknown Title")
711
+ url = source.get("url", "")
712
+ source_type = source.get("source_type", "web")
713
+ relevance = source.get("relevance", 0)
714
+
715
+ citations_text += f"**{i}.** {title} \n"
716
+ citations_text += f"**Type:** {source_type.title()} | **Relevance:** {relevance:.2f} \n"
717
+ citations_text += f"**URL:** {url} \n\n"
718
+ else:
719
+ citations_text += "No sources available for citation."
720
+
721
+ # Format activity log
722
+ activity_text = "# Research Process Log\n\n"
723
+ for activity in orchestrator.get_activity_log():
724
+ timestamp = datetime.fromisoformat(activity['timestamp']).strftime("%H:%M:%S")
725
+ activity_text += f"**{timestamp}** - {activity['description']}\n"
726
+ activity_text += f"*{activity['details']}*\n\n"
727
+
728
+ return summary_text, sources_text, citations_text, activity_text
729
+
730
+ async def conduct_research_async(query: str, progress=gr.Progress()) -> Tuple[str, str, str, str]:
731
+ """Async wrapper for research with progress updates"""
732
+
733
+ def update_progress(message: str, percent: int):
734
+ progress(percent/100, desc=message)
735
+
736
+ research_result = await orchestrator.conduct_research(query, update_progress)
737
+ return format_research_results(research_result)
738
+
739
+ def conduct_research_sync(query: str, progress=gr.Progress()) -> Tuple[str, str, str, str]:
740
+ """Synchronous wrapper for Gradio"""
741
+ if not query.strip():
742
+ return "Please enter a research query.", "", "", ""
743
+
744
+ # Run async function in event loop
745
+ try:
746
+ loop = asyncio.get_event_loop()
747
+ except RuntimeError:
748
+ loop = asyncio.new_event_loop()
749
+ asyncio.set_event_loop(loop)
750
+
751
+ return loop.run_until_complete(conduct_research_async(query, progress))
752
+
753
+ def create_interface():
754
+ """Create the Gradio interface"""
755
+
756
+ with gr.Blocks(
757
+ title="ResearchCopilot - Multi-Agent Research System",
758
+ theme=gr.themes.Soft(),
759
+ css="""
760
+ .gradio-container {
761
+ max-width: 1200px !important;
762
+ margin: 0 auto !important;
763
+ }
764
+ .research-header {
765
+ text-align: center;
766
+ background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
767
+ color: white;
768
+ padding: 2rem;
769
+ border-radius: 10px;
770
+ margin-bottom: 2rem;
771
+ }
772
+ .agent-status {
773
+ background: #ffffff !important;
774
+ border: 2px solid #e0e0e0;
775
+ border-radius: 8px;
776
+ padding: 1.5rem;
777
+ margin: 1rem 0;
778
+ box-shadow: 0 2px 4px rgba(0,0,0,0.1);
779
+ }
780
+ .agent-status h3 {
781
+ color: #2c3e50 !important;
782
+ margin-bottom: 1rem;
783
+ font-size: 1.2rem;
784
+ }
785
+ .agent-status ul {
786
+ color: #2c3e50 !important;
787
+ list-style-type: none;
788
+ padding-left: 0;
789
+ }
790
+ .agent-status li {
791
+ color: #2c3e50 !important;
792
+ margin-bottom: 0.8rem;
793
+ padding: 0.5rem;
794
+ background: #f8f9fa;
795
+ border-radius: 4px;
796
+ border-left: 4px solid #667eea;
797
+ }
798
+ .agent-status strong {
799
+ color: #667eea !important;
800
+ }
801
+ """
802
+ ) as interface:
803
+
804
+ # Header
805
+ gr.HTML("""
806
+ <div class="research-header">
807
+ <h1>🤖 ResearchCopilot</h1>
808
+ <h2>Multi-Agent Research System</h2>
809
+ <p>Powered by AI agents working together to conduct comprehensive research</p>
810
+ <p><em>Track 3: Agentic Demo Showcase - Gradio MCP Hackathon 2025</em></p>
811
+ </div>
812
+ """)
813
+
814
+ # Agent Status Overview
815
+ with gr.Row():
816
+ gr.HTML("""
817
+ <div class="agent-status">
818
+ <h3>🎯 Research Agents</h3>
819
+ <ul>
820
+ <li><strong>Planner Agent:</strong> Breaks down research queries into structured tasks</li>
821
+ <li><strong>Retriever Agent:</strong> Searches multiple sources (Perplexity, Google, Academic)</li>
822
+ <li><strong>Summarizer Agent:</strong> Analyzes and synthesizes information</li>
823
+ <li><strong>Citation Agent:</strong> Generates proper academic citations</li>
824
+ </ul>
825
+ </div>
826
+ """)
827
+
828
+ # Main Interface
829
+ with gr.Row():
830
+ with gr.Column(scale=1):
831
+ query_input = gr.Textbox(
832
+ label="Research Query",
833
+ placeholder="Enter your research question (e.g., 'Latest developments in quantum computing for drug discovery')",
834
+ lines=3
835
+ )
836
+
837
+ research_btn = gr.Button(
838
+ "🚀 Start Research",
839
+ variant="primary",
840
+ size="lg"
841
+ )
842
+
843
+ gr.Examples(
844
+ examples=[
845
+ "Impact of artificial intelligence on healthcare diagnostics",
846
+ "Sustainable energy solutions for urban environments",
847
+ "Recent advances in quantum computing applications",
848
+ "Climate change effects on global food security",
849
+ "Blockchain technology in supply chain management"
850
+ ],
851
+ inputs=query_input,
852
+ label="Example Research Queries"
853
+ )
854
+
855
+ # Results Display
856
+ with gr.Row():
857
+ with gr.Column():
858
+ with gr.Tabs():
859
+ with gr.TabItem("📊 Summary"):
860
+ summary_output = gr.Markdown(
861
+ label="Research Summary",
862
+ value="Enter a research query and click 'Start Research' to begin."
863
+ )
864
+
865
+ with gr.TabItem("📚 Sources"):
866
+ sources_output = gr.Markdown(
867
+ label="Sources Found",
868
+ value="Sources will appear here after research is completed."
869
+ )
870
+
871
+ with gr.TabItem("📖 Citations"):
872
+ citations_output = gr.Markdown(
873
+ label="Citations & Bibliography",
874
+ value="Citations will be generated automatically."
875
+ )
876
+
877
+ with gr.TabItem("🔍 Process Log"):
878
+ activity_output = gr.Markdown(
879
+ label="Agent Activity Log",
880
+ value="Research process will be logged here."
881
+ )
882
+
883
+ # Event Handlers
884
+ research_btn.click(
885
+ fn=conduct_research_sync,
886
+ inputs=[query_input],
887
+ outputs=[summary_output, sources_output, citations_output, activity_output],
888
+ show_progress=True
889
+ )
890
+
891
+ # Footer
892
+ gr.HTML("""
893
+ <div style="text-align: center; margin-top: 2rem; padding: 1.5rem; background: #ffffff; border: 2px solid #e0e0e0; border-radius: 8px; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
894
+ <p style="color: #2c3e50; font-weight: bold; margin-bottom: 0.5rem;">ResearchCopilot - Demonstrating multi-agent AI collaboration for research tasks</p>
895
+ <p style="color: #667eea; font-size: 0.9rem;">Built for the Gradio Agents & MCP Hackathon 2025 - Track 3: Agentic Demo Showcase</p>
896
+ <p style="color: #7f8c8d; font-size: 0.8rem; margin-top: 0.5rem;">Built with ❤️ using Gradio, Modal, Perplexity API, Claude API, and Multi-Agent Architecture.</p>
897
+ </div>
898
+ """)
899
+
900
+ return interface
901
+
902
+ # Launch the application
903
+ if __name__ == "__main__":
904
+ app = create_interface()
905
+ app.launch(
906
+ share=True, # Creates public URL for sharing
907
+ server_name="127.0.0.1", # Localhost access
908
+ server_port=7860,
909
+ show_error=True,
910
+ inbrowser=True # Automatically opens browser
911
+ )