metadata

title: Web Search MCP
emoji: 🔎
colorFrom: red
colorTo: green
sdk: gradio
sdk_version: 5.36.2
app_file: app.py
pinned: false
short_description: Search and extract web content for LLM ingestion

Web Search MCP Server

A Model Context Protocol (MCP) server that provides web search capabilities to LLMs, allowing them to fetch and extract content from web pages and news articles.

Features

Dual search modes:
- General Search: Get diverse results from blogs, documentation, articles, and more
- News Search: Find fresh news articles and breaking stories from news sources
Real-time web search: Search for any topic with up-to-date results
Content extraction: Automatically extracts main article content, removing ads and boilerplate
Rate limiting: Capped at 200 requests/hour to prevent API abuse
Structured output: Returns formatted content with metadata (title, source, date, URL)
Flexible results: Control the number of results (1-20)
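
The two search modes can be pictured as two endpoints on the search backend. The sketch below only builds a request object and does not send it. The endpoint URLs, the `X-API-KEY` header, and the `q`/`num` payload fields are assumptions about the Serper API and should be checked against the Serper documentation; `build_serper_request` is a hypothetical helper, not a function from this repository.

```python
import json
import urllib.request

# Assumed endpoint layout; verify against the Serper documentation.
SERPER_ENDPOINTS = {
    "search": "https://google.serper.dev/search",
    "news": "https://google.serper.dev/news",
}


def build_serper_request(query, api_key, num_results=4, search_type="search"):
    """Build (but do not send) a POST request for the chosen search mode."""
    body = json.dumps({"q": query, "num": num_results}).encode("utf-8")
    return urllib.request.Request(
        SERPER_ENDPOINTS[search_type],
        data=body,
        headers={"X-API-KEY": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Switching modes changes only the URL; the payload shape stays the same in this sketch.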

Prerequisites

Serper API Key: Sign up at serper.dev to get your API key
Python 3.10+: Gradio 5.x (the SDK version pinned in the metadata) requires Python 3.10 or newer
MCP-compatible LLM client: Such as Claude Desktop, Cursor, or any MCP-enabled application

Installation

Clone or download this repository

Install dependencies:

pip install -r requirements.txt

Or install manually:

pip install "gradio[mcp]" httpx trafilatura python-dateutil limits

Set your Serper API key:

export SERPER_API_KEY="your-api-key-here"
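
To confirm the key is visible to the Python process before starting the server, a small check like the following may help. It assumes only that the key is read from the `SERPER_API_KEY` environment variable as described above; `require_api_key` is a hypothetical helper name, not part of this repository.

```python
import os


def require_api_key(env=None):
    """Return the Serper API key, or raise with a setup hint if it is missing."""
    # Accept a mapping for testing; fall back to the real environment.
    env = os.environ if env is None else env
    key = env.get("SERPER_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            'SERPER_API_KEY is not set. Run: export SERPER_API_KEY="your-api-key-here"'
        )
    return key
```

Calling `require_api_key()` in the same shell that exported the variable should return the key; in a fresh shell it raises with the setup hint.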

Usage

Starting the MCP Server

python app_mcp.py

The server will start on http://localhost:7860 with the MCP endpoint at:

http://localhost:7860/gradio_api/mcp/sse

Connecting to LLM Clients

Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "web-search": {
      "command": "python",
      "args": ["/path/to/app_mcp.py"],
      "env": {
        "SERPER_API_KEY": "your-api-key-here"
      }
    }
  }
}
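
If claude_desktop_config.json already lists other servers, the entry above has to be merged in rather than pasted over the file. A minimal sketch of that merge on an in-memory dict follows; `add_web_search_server` is a hypothetical helper, and reading or writing the actual file is left out because its location depends on the platform.

```python
import json


def add_web_search_server(config, script_path, api_key):
    """Return a copy of a Claude Desktop config dict with the web-search entry added."""
    merged = dict(config)
    # Copy the server map so the caller's dict is not modified.
    servers = dict(merged.get("mcpServers", {}))
    servers["web-search"] = {
        "command": "python",
        "args": [script_path],
        "env": {"SERPER_API_KEY": api_key},
    }
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    existing = {"mcpServers": {"other-tool": {"command": "node", "args": ["server.js"]}}}
    updated = add_web_search_server(existing, "/path/to/app_mcp.py", "your-api-key-here")
    print(json.dumps(updated, indent=2))
```

Existing entries such as the illustrative `other-tool` are preserved; only the `web-search` key is added or replaced.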

Direct URL Connection

For clients that support URL-based MCP servers:

Start the server: python app_mcp.py
Connect to: http://localhost:7860/gradio_api/mcp/sse

Tool Documentation

`search_web` Function

Purpose: Search the web for information or fresh news and extract content.

Parameters:

query (str, REQUIRED): The search query
- Examples: "OpenAI news", "climate change 2024", "python tutorial"
num_results (int, OPTIONAL): Number of results to fetch
- Default: 4
- Range: 1-20
- More results provide more context but take longer
search_type (str, OPTIONAL): Type of search to perform
- Default: "search" (general web search)
- Options: "search" or "news"
- Use "news" for fresh, time-sensitive news articles
- Use "search" for general information, documentation, tutorials
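
The parameter rules above can be expressed as a small validation step. This is a sketch under the documented defaults and ranges, not the tool's actual code: `validate_search_args` is a hypothetical name, and the real implementation may clamp out-of-range values instead of rejecting them.

```python
VALID_SEARCH_TYPES = ("search", "news")


def validate_search_args(query, num_results=4, search_type="search"):
    """Check arguments against the documented ranges; return them normalized."""
    if not isinstance(query, str) or not query.strip():
        raise ValueError("query is required and must be a non-empty string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(num_results, bool) or not isinstance(num_results, int):
        raise ValueError("num_results must be an integer")
    if not 1 <= num_results <= 20:
        raise ValueError("num_results must be between 1 and 20")
    if search_type not in VALID_SEARCH_TYPES:
        raise ValueError('search_type must be "search" or "news"')
    return query.strip(), num_results, search_type
```

For example, `validate_search_args("OpenAI news")` falls back to 4 results in "search" mode, while `num_results=21` raises a `ValueError`.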

Returns: Formatted text containing:

Summary of extraction results
For each article:
- Title
- Source and date
- URL
- Extracted main content
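
As an illustration of that structure, the sketch below renders a summary line followed by one block per article. The dict keys (`title`, `source`, `date`, `url`, `content`) and the exact line layout are assumptions made for this example; the server's real output format may differ.

```python
def format_results(articles, failed_count=0):
    """Render extracted articles as plain text with a summary line first."""
    lines = [f"Extracted {len(articles)} article(s); {failed_count} failed."]
    for article in articles:
        lines.append("")
        lines.append(f"Title: {article['title']}")
        lines.append(f"Source: {article['source']} | Date: {article['date']}")
        lines.append(f"URL: {article['url']}")
        lines.append("")
        lines.append(article["content"])
    return "\n".join(lines)
```

Keeping the metadata on labeled lines ahead of the body makes it easy for an LLM to cite the source and date of each article.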

When to use each search type:

Use "news" mode for:
- Breaking news or very recent events
- Time-sensitive information ("today", "this week")
- Current affairs and latest developments
- Press releases and announcements
Use "search" mode for:
- General information and research
- Technical documentation or tutorials
- Historical information
- Diverse perspectives from various sources
- How-to guides and explanations
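
In practice the LLM client decides which mode to pass. For callers that want a fallback rule, the guidance above can be approximated with a keyword check. This is a rough heuristic under assumed hint words, not how the server or any client actually chooses; `pick_search_type` and `NEWS_HINTS` are hypothetical names.

```python
# Assumed hint words drawn from the "news" guidance; adjust as needed.
NEWS_HINTS = ("breaking", "today", "this week", "latest", "press release", "announcement")


def pick_search_type(request):
    """Return "news" if the request contains a time-sensitive hint, else "search"."""
    text = request.lower()
    return "news" if any(hint in text for hint in NEWS_HINTS) else "search"
```

A request with no time-sensitive wording falls through to "search", which matches the tool's documented default.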

Example prompts and the mode each one maps to:

# News mode examples
"Search for breaking news about OpenAI" -> uses news mode
"Find today's stock market updates" -> uses news mode
"Get latest climate change developments" -> uses news mode

# Search mode examples (default)
"Search for Python programming tutorials" -> uses search mode
"Find information about machine learning algorithms" -> uses search mode
"Research historical data about climate change" -> uses search mode

Error Handling

The tool handles various error scenarios:

Missing API key: Returns a clear error message with setup instructions
Rate limiting: Reports when the 200 requests/hour limit is exceeded
Failed extractions: Reports which articles couldn't be extracted
Network errors: Reported with a readable error message
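
One way to picture this behavior is a wrapper that turns known failures into messages instead of letting exceptions reach the client. The sketch below is an assumption about the general pattern, not the tool's code: `run_search_safely`, `RateLimitExceeded`, and the message wording are all hypothetical.

```python
import os


class RateLimitExceeded(Exception):
    """Hypothetical exception raised when the hourly quota is used up."""


def run_search_safely(search_fn, query, env=None):
    """Call search_fn(query) and turn known failures into readable messages."""
    env = os.environ if env is None else env
    if not env.get("SERPER_API_KEY"):
        return "Error: SERPER_API_KEY is not set. Export it before starting the server."
    try:
        return search_fn(query)
    except RateLimitExceeded:
        return "Error: rate limit exceeded (200 requests/hour). Try again later."
    except (ConnectionError, TimeoutError) as exc:
        return f"Error: network problem while searching: {exc}"
```

Returning text rather than raising keeps the failure visible to the LLM, which can then relay the setup hint or retry later.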

Testing

You can test the server manually:

Open http://localhost:7860 in your browser
Enter a search query
Adjust the number of results
Click "Search" to see the extracted content

Tips for LLM Usage

Choose the right search type: Use "news" for fresh, breaking news; use "search" for general information
Be specific with queries: More specific queries yield better results
Adjust result count: Use fewer results for quick searches, more for comprehensive research
Check dates: The tool shows article dates for temporal context
Follow up: Use the extracted content to ask follow-up questions

Limitations

Rate limited to 200 requests per hour
Extraction quality depends on website structure
Some websites may block automated access
News mode focuses on recent articles from news sources
Search mode provides diverse results but may include older content
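
To make the 200 requests per hour limit concrete, here is a standard-library sliding-window limiter. It is a stand-in for illustration only: the project lists the `limits` package as a dependency, and whether the server uses a fixed or moving window is not stated here. `SlidingWindowLimiter` is a hypothetical class.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls within any `window` seconds."""

    def __init__(self, limit=200, window=3600.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.calls = deque()  # timestamps of accepted calls, oldest first

    def allow(self):
        """Record a call and return True, or return False if the quota is used up."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) >= self.limit:
            return False
        self.calls.append(now)
        return True
```

The injectable `clock` argument exists so the limiter can be exercised in tests without waiting an hour.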

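Troubleshooting

"SERPER_API_KEY is not set": Ensure the environment variable is exported in the same shell that starts the server, or set in the "env" block of your client configuration
Rate limit errors: The limit is 200 requests per hour; wait before sending more requests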
No content extracted: Some websites block scrapers; try different queries
Connection errors: Check your internet connection and firewall settings
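
For transient connection errors, a caller can retry with exponential backoff. The sketch below assumes the failure surfaces as a `ConnectionError` or `TimeoutError` in the calling code; `retry_with_backoff` is a hypothetical helper. It is not a remedy for rate limit errors, which clear only as the hourly window moves on.

```python
import time


def retry_with_backoff(fn, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on ConnectionError or TimeoutError wait base_delay * 2**n and retry."""
    for attempt in range(attempts):
        try:
            return fn()
        except (ConnectionError, TimeoutError):
            # Re-raise once the final attempt has failed.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

With the defaults, the waits between attempts are 1, 2, and 4 seconds before the last error is re-raised.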