How To Build a News Agent with GPT-OSS, Hugging Face Inference & Gradio

Community Article Published August 14, 2025

OpenAI recently released GPT-OSS 20B and GPT-OSS 120B, two open-weight language models built for strong reasoning, tool use, and developer flexibility. With Hugging Face Inference Providers, you can run them instantly — no GPU setup or local hosting required — and quickly spin up a demo with Gradio.

This cookbook walks you through building a fully functional news agent powered by GPT-OSS via Hugging Face. Your agent:

Searches top headlines
Queries specific topics and sites
Optionally fetches full articles for deeper analysis
Synthesizes answers with sources
Runs behind a simple Gradio chat UI
Logs traces to Langfuse for debugging and iteration

You can see it in action here.

What you'll build
Prerequisites
Quickstart
Configure models and routing
Tools: search, site-search, fetch
Agent loop: tool use and synthesis
UI: Gradio chat app
Observability with Langfuse

1) What you'll build

You're making a news-focused research agent that runs GPT-OSS models via Hugging Face's inference router.

It:

Picks the right tool for the query (RSS, Serper News, site-specific search, or article fetch)
Stops tool use when enough data is gathered
Always returns a final summary with clickable source links

💡 This is designed as a proof of concept for responsible use: the system prompt explicitly limits scraping to user-requested analysis.

2) Prerequisites

You'll need:

Python 3.10+
An HF token with inference access
A Serper API key for Google-style search
Optional: Langfuse keys for tracing

Install dependencies:

pip install gradio python-dotenv requests trafilatura openai langfuse

📝 Why .env? It keeps tokens out of your code, so you can commit safely as long as .env itself is listed in .gitignore.

3) Quickstart

You can clone the repo (git clone https://huggingface.co/spaces/fdaudens/gpt-oss-news-agent) or start from scratch by creating the following file structure:

news-agent-gpt-oss/
│
├── app.py                 # Main application code (agent logic, tools, Gradio UI)
├── requirements.txt       # Python dependencies to install on HF Spaces
├── .env                   # Environment variables (HF token, Serper API key, Langfuse keys)
└── README.md              # (Optional) Documentation / instructions

Fill .env with valid keys:

HF_TOKEN=hf_************************
SERPER_API_KEY=************************
LANGFUSE_PUBLIC_KEY=...
LANGFUSE_SECRET_KEY=...
LANGFUSE_HOST=https://cloud.langfuse.com
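
A minimal sketch of how app.py might read these values at startup. It assumes python-dotenv's load_dotenv(); the missing_keys and load_settings helpers are hypothetical names used for illustration, not part of the original app. The idea is to fail fast with a clear message when a required key is unset, rather than hitting an authentication error mid-conversation:

```python
import os
from typing import Iterable, List, Mapping

# Keys the agent cannot run without; the Langfuse keys are optional.
REQUIRED_KEYS = ("HF_TOKEN", "SERPER_API_KEY")


def missing_keys(env: Mapping[str, str], required: Iterable[str] = REQUIRED_KEYS) -> List[str]:
    """Return the required keys that are absent or blank in `env`."""
    return [key for key in required if not env.get(key, "").strip()]


def load_settings() -> Mapping[str, str]:
    """Load .env into the process environment and fail fast on missing keys."""
    from dotenv import load_dotenv  # provided by the python-dotenv package

    load_dotenv()  # looks for a .env file and loads it into os.environ
    missing = missing_keys(os.environ)
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return os.environ
```

Calling load_settings() once at the top of app.py, before the client is created, would surface a missing token immediately.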

Make sure your requirements.txt file contains:

gradio
openai
python-dotenv
requests
trafilatura
langfuse

If you want to run the code locally to test it:
```
python app.py
```
Open the URL printed in your terminal to start chatting.

4) Under the Hood: Configure models and routing

We use Hugging Face’s Router endpoint to call the GPT-OSS models served by Fireworks:

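import os

from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()  # read HF_TOKEN and the other keys from .env
HF_TOKEN = os.getenv("HF_TOKEN")

AVAILABLE_MODELS = [
    "openai/gpt-oss-120b:fireworks-ai",
    "openai/gpt-oss-20b:fireworks-ai"
]

# Default model
DEFAULT_MODEL = "openai/gpt-oss-120b:fireworks-ai"

# OpenAI-compatible client pointed at the Hugging Face router
client = OpenAI(
    base_url="https://router.huggingface.co/v1",
    api_key=HF_TOKEN
)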

Two models are available:

openai/gpt-oss-120b:fireworks-ai (default, stronger reasoning)
openai/gpt-oss-20b:fireworks-ai (faster and cheaper)

⚡ Switch to 20B if you’re aiming for faster responses or need to optimize for cost. In my initial tests, it performed impressively well, so I encourage you to experiment with both models and compare their speed, accuracy, and overall feel.
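
Before wiring in any tools, it can help to confirm that the token and routing work with a plain chat completion. The sketch below is one way to do that; build_request and smoke_test are hypothetical helper names, and the model list is repeated only so the block stands on its own. It assumes `client` is the OpenAI-compatible client configured above:

```python
from typing import Any, Dict

AVAILABLE_MODELS = [
    "openai/gpt-oss-120b:fireworks-ai",
    "openai/gpt-oss-20b:fireworks-ai",
]


def build_request(prompt: str, model: str = AVAILABLE_MODELS[0]) -> Dict[str, Any]:
    """Build keyword arguments for a plain chat completion (no tools)."""
    if model not in AVAILABLE_MODELS:
        raise ValueError(f"Unknown model: {model}")
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def smoke_test(client: Any, prompt: str = "Say hello in one sentence.", **kwargs: Any) -> str:
    """Send one request through an OpenAI-compatible client and return the reply text."""
    resp = client.chat.completions.create(**build_request(prompt, **kwargs))
    return resp.choices[0].message.content
```

For example, `smoke_test(client, model="openai/gpt-oss-20b:fireworks-ai")` would exercise the smaller model, which makes it easy to compare the two side by side.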

5) Tools: search, site-search, fetch

Here, we build the tools the agent can access. We declare them in the OpenAI function-calling format and map each one to its corresponding Python function.

a) `fetch_google_news_rss(num=10)`

Gets top headlines from Google News RSS
Use for: "what's happening today?"
Returns title, link, pub date, and source

b) `serper_news_search(query, num=5)`

Searches news for a specific topic
Use for: "AI regulation", "climate change"
Returns title, link, snippet, date, and source

c) `serper_site_search(query, site, num=5)`

Restricts to a specific domain
Use for: "site:nytimes.com AI chips"
Returns title, link, snippet, and favicons

d) `fetch_article(url, max_chars=12000)`

Fetches and extracts full article text
Only used for deep analysis / quotes
Uses Trafilatura for clean text extraction

⚠️ The system prompt tells the model to fetch full articles only when the user asks for article-level detail, which keeps scraping to a minimum.
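
To make the shape of the headline results concrete without touching the network, here is a small offline sketch of the RSS parsing step. The SAMPLE_RSS feed and the parse_rss_items helper are made up for illustration; the field names and fallback values mirror what fetch_google_news_rss returns:

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# A tiny hand-written feed standing in for the real Google News response.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <item>
      <title>Example headline</title>
      <link>https://example.com/story</link>
      <pubDate>Thu, 14 Aug 2025 08:00:00 GMT</pubDate>
      <source url="https://example.com">Example News</source>
    </item>
    <item>
      <title>Headline without metadata</title>
    </item>
  </channel>
</rss>"""


def parse_rss_items(xml_text: str, num: int = 10) -> List[Dict[str, str]]:
    """Turn RSS <item> elements into plain dicts, filling gaps with defaults."""
    root = ET.fromstring(xml_text)
    results = []
    for item in root.findall(".//item")[:num]:
        results.append({
            "title": item.findtext("title", default="No title"),
            "link": item.findtext("link", default=""),
            "pub_date": item.findtext("pubDate", default="No date"),
            "source": item.findtext("source", default="Google News"),
        })
    return results


for entry in parse_rss_items(SAMPLE_RSS):
    print(entry["title"], "|", entry["source"])
```

Each entry is a flat dictionary, so the whole list can be serialized with json.dumps and handed back to the model as a tool result.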

Full code for building the tools:

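import os
import xml.etree.ElementTree as ET
from typing import Any, Dict, List

import requests
import trafilatura

SERPER_API_KEY = os.getenv("SERPER_API_KEY")  # assumes .env has already been loaded

def fetch_google_news_rss(num: int = 10) -> List[Dict[str, Any]]: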
    """Fetch general news from Google News RSS feed."""
    try:
        url = "https://news.google.com/rss"
        r = requests.get(url, timeout=30)
        r.raise_for_status()
        
        # Parse RSS XML
        root = ET.fromstring(r.content)
        items = root.findall('.//item')
        
        results = []
        for item in items[:num]:
            title = item.find('title')
            link = item.find('link')
            pub_date = item.find('pubDate')
            source = item.find('source')
            
            results.append({
                "title": title.text if title is not None else "No title",
                "link": link.text if link is not None else "",
                "pub_date": pub_date.text if pub_date is not None else "No date",
                "source": source.text if source is not None else "Google News"
            })
        
        return results
    except Exception as e:
        return {"ok": False, "error": repr(e)}

def serper_news_search(query: str, num: int = 5) -> List[Dict[str, Any]]:
    """Fetch news for a specific topic or query."""
    url = "https://google.serper.dev/news"
    headers = {"X-API-KEY": SERPER_API_KEY, "Content-Type": "application/json"}
    payload = {"q": query, "gl": "us", "hl": "en", "tbs": "qdr:d"}
    r = requests.post(url, headers=headers, json=payload, timeout=30)
    r.raise_for_status()
    data = r.json()
    results = []
    for item in data.get("news", [])[:num]:
        results.append({
            "title": item.get("title"),
            "link": item.get("link"),
            "snippet": item.get("snippet"),
            "date": item.get("date"),  # ISO8601 when available
            "source": item.get("source")
        })
    return results

def serper_site_search(query: str, site: str, num: int = 5) -> List[Dict[str, Any]]:
    """Site restricted web search."""
    url = "https://google.serper.dev/search"
    headers = {"X-API-KEY": SERPER_API_KEY, "Content-Type": "application/json"}
    payload = {"q": f"site:{site} {query}", "gl": "us", "hl": "en"}
    r = requests.post(url, headers=headers, json=payload, timeout=30)
    r.raise_for_status()
    data = r.json()
    results = []
    for item in data.get("organic", [])[:num]:
        results.append({
            "title": item.get("title"),
            "link": item.get("link"),
            "snippet": item.get("snippet"),
            "favicons": item.get("favicons", {})
        })
    return results

def fetch_article(url: str, max_chars: int = 12000) -> Dict[str, Any]:
    """Fetch and extract clean article text with trafilatura."""
    try:
        downloaded = trafilatura.fetch_url(url)  # fetch_url takes no timeout argument; trafilatura's own download settings apply
        text = trafilatura.extract(downloaded, include_comments=False) if downloaded else None
        if not text:
            return {"ok": False, "error": "could_not_extract"}
        text = text.strip()
        if len(text) > max_chars:
            text = text[:max_chars] + " ..."
        return {"ok": True, "text": text}
    except Exception as e:
        return {"ok": False, "error": repr(e)}

# OpenAI-style tool specs for function calling
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "fetch_google_news_rss",
            "description": "Fetch general top headlines from Google News RSS feed. Use this when you want to see what's happening in the world today without a specific topic focus.",
            "parameters": {
                "type": "object",
                "properties": {
                    "num": {"type": "integer", "minimum": 1, "maximum": 20, "description": "Number of news items to fetch"}
                },
                "required": []
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "serper_news_search",
            "description": "Search Google News for articles about a specific topic or query. Use this when you need news about particular subjects, companies, or events.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"},
                    "num": {"type": "integer", "minimum": 1, "maximum": 20}
                },
                "required": ["query"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "serper_site_search",
            "description": "Search a specific news domain for relevant articles.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"},
                    "site": {"type": "string", "description": "Domain like ft.com or nytimes.com"},
                    "num": {"type": "integer", "minimum": 1, "maximum": 10}
                },
                "required": ["query", "site"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "fetch_article",
            "description": "Download and extract the main text of an article from a URL. ONLY use this when the user asks specific questions about article content, details, or wants to analyze/quote from particular articles. Do NOT use this for general news summaries or overviews.",
            "parameters": {
                "type": "object",
                "properties": {
                    "url": {"type": "string"},
                    "max_chars": {"type": "integer", "minimum": 1000, "maximum": 60000}
                },
                "required": ["url"]
            }
        }
    }
]

FUNCTION_MAP = {
    "fetch_google_news_rss": fetch_google_news_rss,
    "serper_news_search": serper_news_search,
    "serper_site_search": serper_site_search,
    "fetch_article": fetch_article,
}
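
Because the specs in TOOLS and the entries in FUNCTION_MAP are maintained by hand, they can drift apart as tools are added or renamed. One optional safeguard is a startup check that compares the two. The check_tool_registry helper and the toy echo tool below are illustrative only and not part of the original app:

```python
import inspect
from typing import Any, Callable, Dict, List


def check_tool_registry(tools: List[Dict[str, Any]], function_map: Dict[str, Callable]) -> List[str]:
    """Return human-readable mismatches between tool specs and Python functions."""
    problems = []
    for spec in tools:
        fn_spec = spec["function"]
        name = fn_spec["name"]
        fn = function_map.get(name)
        if fn is None:
            problems.append(f"{name}: no Python function registered")
            continue
        accepted = set(inspect.signature(fn).parameters)
        declared = set(fn_spec["parameters"].get("properties", {}))
        for param in sorted(declared - accepted):
            problems.append(f"{name}: spec declares '{param}' but the function does not accept it")
    spec_names = {spec["function"]["name"] for spec in tools}
    for name in sorted(set(function_map) - spec_names):
        problems.append(f"{name}: function has no tool spec")
    return problems


def echo(text: str, times: int = 1) -> str:
    """Toy tool used only to demonstrate the check."""
    return " ".join([text] * times)


TOY_TOOLS = [{
    "type": "function",
    "function": {
        "name": "echo",
        "description": "Repeat a piece of text.",
        "parameters": {
            "type": "object",
            "properties": {"text": {"type": "string"}, "loud": {"type": "boolean"}},
            "required": ["text"],
        },
    },
}]

problems = check_tool_registry(TOY_TOOLS, {"echo": echo, "unused": lambda: None})
print(problems)
```

Run against the real TOOLS and FUNCTION_MAP, an empty list would indicate that every declared parameter has a matching function argument.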

6) Agent loop: tool use and synthesis

Goal: Let the model choose tools, run them, and synthesize results into a final answer.

This is the heart of the agent — the decision-making loop where GPT-OSS decides when and how to use tools, processes their outputs, and turns them into a final, well-sourced answer.

The loop works by:

Giving the model a clear system prompt that acts like a playbook for tool selection.
Letting the model autonomously call tools in sequence (via OpenAI function-calling).
Passing the results of each tool call back into the conversation.
Nudging the model to stop calling tools and synthesize once enough data has been gathered.

This design keeps responses efficient, prevents unnecessary tool calls, and ensures every answer includes sources.

Flow:

System message = playbook: clear rules for when to use each tool
On each turn, call chat.completions.create(...) with tool_choice="auto"
If tools are requested, run them and append the results to messages
From the third tool-calling round onward, nudge the model: "You now have sufficient information. Please provide your final answer with sources."
Cap at 6 steps to avoid loops
Return the answer, or a friendly error if it fails
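
The flow above can be tried offline with a scripted stand-in for the model. This is a stripped-down sketch, not the app's implementation: run_loop, scripted_model, and fake_search are hypothetical names, and tool calls use a simplified dictionary shape (id, name, arguments) instead of the OpenAI response objects:

```python
import json
from typing import Any, Callable, Dict, List

NUDGE = "You now have sufficient information. Please provide your final answer with sources."


def run_loop(model_step: Callable[[List[Dict[str, Any]]], Dict[str, Any]],
             function_map: Dict[str, Callable],
             user_prompt: str,
             max_steps: int = 6,
             nudge_from_step: int = 2) -> str:
    """Alternate between model turns and tool execution until a final answer arrives."""
    messages: List[Dict[str, Any]] = [{"role": "user", "content": user_prompt}]
    for step in range(max_steps):
        reply = model_step(messages)
        tool_calls = reply.get("tool_calls") or []
        if not tool_calls:
            if reply.get("content"):
                return reply["content"]  # final answer, no more tools requested
            continue
        messages.append({"role": "assistant", "content": reply.get("content") or "",
                         "tool_calls": tool_calls})
        for call in tool_calls:
            fn = function_map.get(call["name"])
            result = fn(**call["arguments"]) if fn else {"ok": False, "error": "unknown_tool"}
            messages.append({"role": "tool", "tool_call_id": call["id"],
                             "content": json.dumps(result)})
        if step >= nudge_from_step:
            messages.append({"role": "user", "content": NUDGE})
    return "I could not complete the task within the step limit."


def scripted_model(messages: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Stand-in for the LLM: request one search, then answer from its result."""
    tool_messages = [m for m in messages if m["role"] == "tool"]
    if not tool_messages:
        return {"content": None,
                "tool_calls": [{"id": "call_1", "name": "search", "arguments": {"query": "ai"}}]}
    headlines = json.loads(tool_messages[-1]["content"])
    top = headlines[0]
    return {"content": f"Top story: {top['title']} ({top['link']})", "tool_calls": None}


def fake_search(query: str) -> List[Dict[str, str]]:
    """Stand-in for a search tool; returns one canned result."""
    return [{"title": f"Headline about {query}", "link": "https://example.com/1"}]


answer = run_loop(scripted_model, {"search": fake_search}, "What's new in AI?")
print(answer)
```

Swapping scripted_model for a function that always requests a tool shows the step cap in action: the loop gives up after six rounds instead of running forever.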

Full code for building the agent:

import json
import time
from typing import Dict, List, Optional

def call_model(messages: List[Dict[str, str]], tools=TOOLS, temperature: float = 0.3, model: str = DEFAULT_MODEL):
    """One step with tool calling support."""
    try:
        return client.chat.completions.create(
            model=model,
            temperature=temperature,
            messages=messages,
            tools=tools,
            tool_choice="auto"
        )
    except Exception as e:
        print(f"Error calling model: {e}")
        raise

def run_agent(user_prompt: str, site_limit: Optional[str] = None, model: str = DEFAULT_MODEL) -> str:
    """
    High level prompt for a news agent.
    It may search, read links, then synthesize and cite URLs.
    """
    system = {
        "role": "system",
        "content": (
            "You are a careful news agent. Follow these steps:\n"
            "1. For general news requests: Use fetch_google_news_rss to get top headlines\n"
            "2. For specific topic requests: Use serper_news_search with the topic\n"
            "3. ONLY use fetch_article when the user asks specific questions about article content, details, or wants to analyze/quote from particular articles\n"
            "4. For general news summaries, provide information based on headlines and snippets without fetching full articles\n"
            "5. STOP calling tools and provide your final answer\n"
            "6. Always include a bullet list of sources with URLs\n"
            "IMPORTANT: After reading articles (if any), you must provide your final answer without calling more tools.\n\n"
            "TOOL SELECTION GUIDE:\n"
            "- fetch_google_news_rss: Use for 'what's happening today' or 'top news' requests\n"
            "- serper_news_search: Use for specific topics like 'AI chips', 'Nvidia', 'climate change'\n"
            "- serper_site_search: Use when restricted to specific news sources\n"
            "- fetch_article: ONLY use when user asks about specific article content, details, or wants to analyze particular articles\n"
            "PRIORITY: For general news requests, provide summaries based on headlines and snippets. Only fetch full articles when specifically needed for detailed analysis.\n"
        ),
    }

    messages: List[Dict[str, str]] = [system, {"role": "user", "content": user_prompt}]
    if site_limit:
        messages.append({"role": "user", "content": f"Restrict searches to {site_limit} when appropriate."})

    for step in range(6):  # small safety cap
        try:
            resp = call_model(messages, model=model)
            msg = resp.choices[0].message

            # If the model wants to call tools
            if getattr(msg, "tool_calls", None) and msg.tool_calls:
                # Add the assistant message with tool calls to the conversation
                assistant_message = {
                    "role": "assistant",
                    "content": msg.content or "",
                    "tool_calls": [
                        {
                            "id": tool_call.id,
                            "type": "function",
                            "function": {
                                "name": tool_call.function.name,
                                "arguments": tool_call.function.arguments
                            }
                        }
                        for tool_call in msg.tool_calls
                    ]
                }
                messages.append(assistant_message)
                
                # Process each tool call
                for tool_call in msg.tool_calls:
                    name = tool_call.function.name
                    args = {}
                    try:
                        args = json.loads(tool_call.function.arguments or "{}")
                    except json.JSONDecodeError:
                        args = {}

                    fn = FUNCTION_MAP.get(name)
                    if not fn:
                        messages.append({
                            "role": "tool",
                            "tool_call_id": tool_call.id,
                            "name": name,
                            "content": json.dumps({"ok": False, "error": "unknown_tool"})
                        })
                        continue

                    try:
                        result = fn(**args)
                    except TypeError as e:
                        result = {"ok": False, "error": f"bad_args: {e}"}
                    except Exception as e:
                        result = {"ok": False, "error": repr(e)}

                    tool_response = {
                        "role": "tool",
                        "tool_call_id": tool_call.id,
                        "name": name,
                        "content": json.dumps(result),
                    }
                    messages.append(tool_response)
                
                # After processing tools, add a reminder to synthesize
                if step >= 2:  # From the third loop iteration onward, encourage synthesis
                    messages.append({
                        "role": "user",
                        "content": "You now have sufficient information. Please provide your final answer with sources."
                    })
                
                # Continue loop so the model can see tool outputs
                continue

            # If we have a final assistant message without tool calls
            if msg.content:
                return msg.content

            # Fallback tiny sleep then continue
            time.sleep(0.2)
            
        except Exception as e:
            # If there's an error, try to continue or return error message
            if step == 5:  # Last step
                return f"Error occurred during processing: {e}"
            time.sleep(0.5)
            continue

    return "I could not complete the task within the step limit. Try refining your query."

7) UI: Gradio chat app

The Gradio interface turns your backend logic into an interactive web app with almost no extra code. It gives users:

A simple chat window for questions and answers
A model selector for switching between GPT-OSS variants
Example prompts to guide usage

This makes your agent easy to demo, share, and iterate on without building a custom frontend.
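
With type="messages", the Chatbot component holds the conversation as a list of role/content dictionaries, which is the shape the chat handler appends to. A minimal sketch of that bookkeeping follows. The append_turn helper is a hypothetical name, and `respond` is a stand-in for run_agent; unlike the app's handler, this version returns a new list instead of mutating the one it was given:

```python
from typing import Callable, Dict, List

History = List[Dict[str, str]]


def append_turn(history: History, message: str, respond: Callable[[str], str]) -> History:
    """Return the history extended by one user/assistant exchange."""
    if not message.strip():
        return history  # ignore empty submissions
    try:
        reply = respond(message)
    except Exception as e:
        # Surface the failure in the chat instead of crashing the UI.
        reply = f"Sorry, I encountered an error: {e}"
    return history + [
        {"role": "user", "content": message},
        {"role": "assistant", "content": reply},
    ]


history = append_turn([], "Top stories today?", lambda m: f"(answer to: {m})")
print(history)
```

Keeping the error path inside the same function means the user always sees a reply, even when the model or a tool call fails.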

Every interaction is wrapped with Langfuse tracing, so you can inspect inputs, outputs, tool usage, and errors — making it easy to debug, fine-tune prompts, and monitor performance in real time.

chat_with_agent is wrapped in @observe()
It logs inputs, model choice, and history length
It logs outputs plus metadata (length, success), and errors with success=False

Code:

import gradio as gr
from langfuse import get_client, observe

@observe()
def chat_with_agent(message, history, model):
    """Handle chat messages and return agent responses."""
    if not message.strip():
        return history
    
    lf = get_client()
    lf.update_current_trace(
        input={"user_message": message, "model": model, "history_length": len(history)}
    )

    try:
        response = run_agent(message, None, model)

        lf.update_current_trace(
            output={"agent_response": response},
            metadata={
                "model": model,
                "message_length": len(message),
                "response_length": len(response),
                "success": True,
            },
        )

        history.append({"role": "user", "content": message})
        history.append({"role": "assistant", "content": response})
        return history
        
    except Exception as e:
        lf.update_current_trace(
            output={"error": str(e)},
            metadata={"success": False, "error": str(e)},
        )
        error_msg = f"Sorry, I encountered an error: {str(e)}"
        history.append({"role": "user", "content": message})
        history.append({"role": "assistant", "content": error_msg})
        return history

def clear_chat():
    """Clear the chat history."""
    return [], ""

# Create the Gradio interface
with gr.Blocks(
    title="Chat with the News",
    theme=gr.themes.Monochrome()
) as demo:
    
    # Header using Gradio markdown
    gr.Markdown("""
    # 📰 Chat with the News
    
    Your AI-powered news research assistant with real-time search capabilities, based on [GPT-OSS models](https://huggingface.co/collections/openai/gpt-oss-68911959590a1634ba11c7a4) and running on inference providers.
    """)
    
    # Examples section using Gradio markdown
    gr.Markdown("""
    ### 💡 Try these examples:
    
    - **General:** "What are the top news stories today?"
    - **Specific topic:** "What's the latest on artificial intelligence?"
    - **Site-specific:** "What's the latest climate change news on the BBC?"
    """)

    # Model selector
    model_selector = gr.Dropdown(
        choices=AVAILABLE_MODELS,
        value=DEFAULT_MODEL,
        label="🤖 Select Model",
        info="Choose between GPT-OSS 120B and 20B models"
    )

    # Message input
    msg = gr.Textbox(
        label="Ask me about the news",
        placeholder="What would you like to know about today?",
        lines=2
    )
    
    # Buttons in a row
    with gr.Row():
        submit_btn = gr.Button("🚀 Send", variant="primary", size="lg")
        clear_btn = gr.Button("🗑️ Clear Chat", variant="secondary", size="lg")
        
    # Chat interface
    chatbot = gr.Chatbot(
        label="News Agent",
        height=500,
        show_label=False,
        container=True,
        type="messages"
    )
    
    # Event handlers
    submit_btn.click(
        chat_with_agent,
        inputs=[msg, chatbot, model_selector],
        outputs=[chatbot],
        show_progress=True
    )
    
    msg.submit(
        chat_with_agent,
        inputs=[msg, chatbot, model_selector],
        outputs=[chatbot],
        show_progress=True
    )
    
    clear_btn.click(
        clear_chat,
        outputs=[chatbot, msg]
    )
    
    # Instructions using Gradio markdown
    gr.Markdown("""
    ---
    
    ### ℹ️ How it works
    
    This AI agent can search Google News, fetch articles from specific sources, and provide comprehensive news summaries with proper citations. It uses real-time data and can restrict searches to specific news domains when requested.
    
    **Model Selection:**
    - **GPT-OSS 120B**: Larger, more capable model for complex reasoning tasks
    - **GPT-OSS 20B**: Faster, more efficient model for quick responses
    """)

# Launch the app
if __name__ == "__main__":
    demo.launch(
        server_name="0.0.0.0",
        server_port=7860,
        share=False,
        show_error=True
    )

Recap

You now have:

A lightweight, controlled news agent
GPT-OSS via Hugging Face inference
Tool selection rules
Langfuse tracing for observability
A ready-to-use Gradio UI
