title: Template Final Assignment
emoji: 🕵🏻‍♂️
colorFrom: indigo
colorTo: indigo
sdk: gradio
sdk_version: 5.25.2
app_file: app.py
pinned: false
hf_oauth: true
hf_oauth_expiration_minutes: 480
short_description: Gemini Agent for GAIA Evaluation

🧠 Gemini Agent for GAIA Evaluation

This project contains a Gemini-powered CodeAgent built with smolagents for use in the GAIA Unit 4 Evaluation on Hugging Face Spaces.

🚀 Features

  • Uses the Gemini 2.0 Flash model via LiteLLMModel

  • Equipped with essential tools:

    • DuckDuckGoSearchTool for quick lookups
    • RunPythonFileTool for executing .py scripts
    • ReverseTextTool for decoding reversed questions
    • download_server for fetching files from URLs
    • Base tools (math, string manipulation, etc.)
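To illustrate what a tool such as ReverseTextTool does, here is a minimal sketch of the core idea in plain Python. The function name `reverse_text` is hypothetical and is not taken from this repository; in the Space itself the logic would presumably be wrapped as a smolagents tool.

```python
def reverse_text(text: str) -> str:
    """Return the input with its characters in reverse order.

    Hypothetical helper: it illustrates the idea behind ReverseTextTool
    and is not the repository's actual implementation.
    """
    return text[::-1]


if __name__ == "__main__":
    # A question written backwards becomes readable again.
    print(reverse_text(".tfel fo etisoppo eht etirW"))
```

Keeping the logic in a small pure function like this makes it easy to test without calling the model.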

📋 Evaluation Strategy

The agent reads questions from the GAIA evaluation endpoint, applies reasoning using a system prompt with strict guidelines, and submits answers back for scoring.
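The fetch, solve, and submit loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in app.py: the field names (`task_id`, `question`, `submitted_answer`, `username`, `agent_code`) and both function names are assumptions, and the HTTP calls to the evaluation endpoint are left out so that the sketch stays self-contained.

```python
from typing import Callable, Dict, List


def answer_questions(
    questions: List[Dict[str, str]],
    agent: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Run the agent on each question and collect the answers.

    Assumes each question is a dict with "task_id" and "question" keys.
    """
    answers = []
    for item in questions:
        task_id = item.get("task_id")
        question = item.get("question")
        if not task_id or question is None:
            continue  # skip malformed entries
        try:
            answer = agent(question)
        except Exception as exc:
            # Record the failure so one bad task does not stop the run.
            answer = f"AGENT ERROR: {exc}"
        answers.append({"task_id": task_id, "submitted_answer": str(answer)})
    return answers


def build_submission(username: str, agent_code: str, answers: List[Dict[str, str]]) -> dict:
    """Assemble the payload to post back for scoring (assumed shape)."""
    return {"username": username, "agent_code": agent_code, "answers": answers}
```

Any callable that maps a question string to an answer string can be passed as `agent`, which makes the loop testable with a stub before the Gemini-backed agent is plugged in.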

πŸ› οΈ Setup

  1. Clone this repository or Space

  2. Set your environment variables:

    GEMINI_API_KEY=your_api_key_here
    SPACE_ID=your_hf_space_id
    
  3. Install dependencies:

    pip install -r requirements.txt
    
  4. Run locally:

    python app.py
    

Or launch directly via Hugging Face Spaces.
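For a local run, a startup check along these lines can catch a missing variable from step 2 before any question is processed. The helper name `load_settings` is hypothetical, and treating both variables as required is an assumption based on the setup steps above.

```python
import os


def load_settings(env=None):
    """Read the required settings, raising a clear error if any is missing.

    Hypothetical helper; `env` defaults to the process environment and
    can be replaced with a plain dict for testing.
    """
    env = os.environ if env is None else env
    required = ("GEMINI_API_KEY", "SPACE_ID")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in required}
```

On Hugging Face Spaces the same variables would typically be set as Space secrets or variables rather than in a local shell.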

🧪 Evaluation Flow

  1. Log in to Hugging Face through the UI
  2. Click “Run Evaluation & Submit All Answers”
  3. The agent will fetch tasks, solve them, and submit results

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference