metadata
title: Web Research Agent
emoji: 🔍
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 5.26.0
app_file: app.py
pinned: false
python_version: '3.11'

Web Research Agent

An AI research assistant built with CrewAI. It uses a multi-agent pipeline to research a topic on the web and returns a factual response with citations.

Overview

This application uses specialized AI agents working together to:

  1. Refine search queries for optimal results
  2. Search the web across multiple search engines
  3. Analyze and verify content
  4. Produce well-structured, factual responses with proper citations

Setup Instructions

Prerequisites

  • Python 3.9+ (recommended: Python 3.11)
  • API keys for:
    • OpenAI (required)
    • Brave Search (recommended)
    • Tavily Search (optional)

Installation

  1. Clone the repository and navigate to the project directory:

    git clone https://github.com/yourusername/web-research-agent.git
    cd web-research-agent
    
  2. Install required dependencies:

    pip install -r requirements.txt
    
  3. Create a .env file in the root directory with your API keys:

    OPENAI_API_KEY=your_openai_api_key
    BRAVE_API_KEY=your_brave_api_key
    TAVILY_API_KEY=your_tavily_api_key
    VERBOSE=False  # Set to True for detailed logging
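
As an illustration of how a file in this format can be read, here is a minimal sketch of a parser for the `KEY=value` lines shown above, including the inline `# ...` comment on the `VERBOSE` line. The function name `parse_env_text` is hypothetical, and this README does not say how the application itself loads `.env` (the `python-dotenv` package and its `load_dotenv()` function are a common choice), so treat this as an explanation of the file format rather than the project's code.

```python
def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop an inline comment such as "False  # Set to True for detailed logging".
        # Simplification: a value that itself contains " #" would be truncated.
        value = value.split(" #", 1)[0].strip().strip("'\"")
        values[key.strip()] = value
    return values


if __name__ == "__main__":
    sample = "OPENAI_API_KEY=your_openai_api_key\nVERBOSE=False  # Set to True for detailed logging\n"
    print(parse_env_text(sample))
```

Note that every value comes back as a string, so `VERBOSE=False` yields the text `"False"`, which the application still has to interpret as a boolean.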
    

Running the Application

Start the web interface:

    python app.py

The application will be available at http://localhost:7860

Common Issues & Troubleshooting

Pydantic/CrewAI Compatibility Issues

If you encounter errors like:

    AttributeError: 'property' object has no attribute 'model_fields'

Try the following fixes:

  1. Update to the latest CrewAI version:

    pip install -U crewai crewai-tools
    
  2. If issues persist, temporarily modify the tools/rate_limited_tool.py file to fix compatibility with Pydantic.

Search API Rate Limits

  • The Brave Search API free tier is limited to 1 request per second and 2,000 requests per month
  • The application implements rate limiting to prevent API throttling
  • Research queries may take several minutes to complete due to these limitations
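
To make the idea of client-side rate limiting concrete, the sketch below enforces a minimum interval between consecutive calls. It is a generic illustration, not the code in `tools/rate_limited_tool.py`; the class name `MinIntervalLimiter` and its parameters are hypothetical. The clock and sleep functions are injectable so the behaviour can be checked without real delays.

```python
import time
from typing import Callable, Optional


class MinIntervalLimiter:
    """Ensure at least `min_interval` seconds separate consecutive calls."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call: Optional[float] = None

    def wait(self) -> float:
        """Sleep if the previous call was too recent; return seconds waited."""
        waited = 0.0
        if self._last_call is not None:
            remaining = self.min_interval - (self._clock() - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                waited = remaining
        # Record the time of this call after any sleep has finished.
        self._last_call = self._clock()
        return waited
```

A search tool would call `limiter.wait()` immediately before each API request. With several agents issuing many searches per question, these waits add up, which is one reason a full research run can take minutes.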

Gradio Interface Issues

If the interface fails to load or throws errors:

  1. Try installing the Gradio version pinned in this Space's metadata (sdk_version):

    pip install gradio==5.26.0
    
  2. Clear your browser cache to remove cached JavaScript files

  3. Run the headless test script as an alternative:

    python test.py "Your research question"
    

Advanced Usage

Command Line Operation

Test the research engine without the web interface:

    python test.py "Your research query here"

Environment Variables

  • OPENAI_API_KEY: Required for language model access
  • BRAVE_API_KEY: Recommended for web search functionality
  • TAVILY_API_KEY: Optional alternative search engine
  • VERBOSE: Set to True/False to control logging detail
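
A startup check along the following lines can surface a missing key early instead of failing in the middle of a research run. This is a sketch under assumptions: `load_settings` and the returned dictionary keys are hypothetical names, and the set of strings accepted as "true" for `VERBOSE` is an assumed convention, not something this README specifies.

```python
import os
from collections.abc import Mapping


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Validate the variables listed above and return them as a dict."""
    openai_key = env.get("OPENAI_API_KEY", "").strip()
    if not openai_key:
        raise RuntimeError("OPENAI_API_KEY is required but not set")
    return {
        "openai_api_key": openai_key,
        # Optional keys: None means that search provider is not configured.
        "brave_api_key": env.get("BRAVE_API_KEY") or None,
        "tavily_api_key": env.get("TAVILY_API_KEY") or None,
        # Assumed convention: "true", "1" and "yes" (any case) enable verbose logging.
        "verbose": env.get("VERBOSE", "False").strip().lower() in {"true", "1", "yes"},
    }
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise in tests, for example `load_settings({"OPENAI_API_KEY": "test", "VERBOSE": "True"})`.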

Deployment

This project can be deployed to Hugging Face Spaces for web access.

Hugging Face Spaces Deployment

  1. Create a new Space on Hugging Face

    • Go to Hugging Face Spaces
    • Click "Create new Space"
    • Choose a name and select "Gradio" as the SDK
    • Set visibility as needed
  2. Configure Environment Variables

    • In Space settings, add required API keys as secrets
  3. Deploy Code

    git clone https://huggingface.co/spaces/your-username/your-space-name
    cd your-space-name
    cp -r /path/to/web-research-agent/* .
    git add .
    git commit -m "Initial deployment"
    git push
    

Security Notes

  • Never commit your .env file or expose API keys
  • Use repository secrets in Hugging Face Spaces
  • Keep sensitive deployments private
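
Keys can also leak through logs, for instance when detailed logging is enabled and configuration values are printed. One common precaution is to mask a secret before it is written anywhere. The helper below is a hypothetical sketch (the name `mask_secret` and the choice to reveal the last four characters are assumptions, not part of this project).

```python
def mask_secret(secret: str, visible: int = 4) -> str:
    """Return the secret with all but the last `visible` characters replaced by '*'."""
    if visible <= 0 or len(secret) <= visible:
        # Too short to reveal anything safely: mask every character.
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]


if __name__ == "__main__":
    # Log a recognisable suffix instead of the full key.
    print("Using OpenAI key:", mask_secret("your_openai_api_key"))
```

The visible suffix is enough to tell which key is in use while keeping the full value out of log files and terminal output.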

Development Structure

  • app.py: Web interface and session management
  • research_engine.py: Core research orchestration logic
  • agents.py: Agent definitions and configurations
  • tools/: Search and analysis tools
  • test.py: Command-line testing utility