Update README.md
README.md
CHANGED
@@ -8,8 +8,75 @@ sdk_version: 5.25.2
 app_file: app.py
 pinned: false
 hf_oauth: true
-# optional, default duration is 8 hours/480 minutes. Max duration is 30 days/43200 minutes.
 hf_oauth_expiration_minutes: 480
+short_description: Gemini Agent for GAIA Evaluation
 ---

# 🧠 Gemini Agent for GAIA Evaluation

This project contains a Gemini-powered `CodeAgent` built with [smolagents](https://github.com/huggingface/smolagents) for use
in the **GAIA Unit 4 Evaluation** of the [Hugging Face Agents Course](https://hf.co/learn/agents-course/unit0/introduction).

## 🚀 Features

* Uses the **Gemini 2.0 Flash** model via `LiteLLMModel`
* Equipped with essential tools:

  * `DuckDuckGoSearchTool` for quick lookups
  * `RunPythonFileTool` for executing `.py` scripts
  * `ReverseTextTool` for decoding reversed questions
  * `download_server` for fetching files from URLs
  * Base tools (math, string manipulation, etc.)
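
To make the tool list more concrete, here is a minimal sketch of the logic a tool like `ReverseTextTool` might wrap. This is not the project's implementation: the function name `reverse_text` and the sample question are made up for illustration, and the smolagents tool wrapper is left out so the snippet runs on its own.

```python
def reverse_text(text: str) -> str:
    """Return the characters of `text` in reverse order."""
    return text[::-1]


# Made-up example of a question that arrives written backwards.
scrambled = "?ecnarF fo latipac eht si tahW"
decoded = reverse_text(scrambled)
print(decoded)
```

Reversing twice gives back the original string, so the same helper can be used to check whether a question was really reversed before handing the decoded text to the agent.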

## 📋 Evaluation Strategy

The agent fetches questions from the GAIA evaluation endpoint, reasons over each one under a system prompt with strict
guidelines, and submits its answers for scoring.
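
The fetch, solve, and submit cycle described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the field names (`task_id`, `question`, `submitted_answer`, `username`, `agent_code`) and the helper `build_submission` are hypothetical, and the network calls to the evaluation endpoint are left out so the example runs offline.

```python
from typing import Callable, Dict, List


def build_submission(
    questions: List[Dict[str, str]],
    agent: Callable[[str], str],
    username: str,
    agent_code: str,
) -> Dict[str, object]:
    """Run `agent` on every question and collect the answers in one payload."""
    answers = []
    for item in questions:
        try:
            answer = str(agent(item["question"])).strip()
        except Exception as exc:  # keep going if a single task fails
            answer = f"AGENT ERROR: {exc}"
        # Field names here are assumed, not taken from the README.
        answers.append({"task_id": item["task_id"], "submitted_answer": answer})
    return {"username": username, "agent_code": agent_code, "answers": answers}


# Stand-in agent and questions, for illustration only.
def echo_agent(question: str) -> str:
    return question.upper()


sample_questions = [
    {"task_id": "t1", "question": "first"},
    {"task_id": "t2", "question": "second"},
]
payload = build_submission(
    sample_questions, echo_agent, "some-user", "https://example.invalid/code"
)
print(payload["answers"])
```

Catching exceptions per question means one failing task does not prevent the remaining answers from being submitted.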

## 🛠️ Setup

1. Clone this repository or Space
2. Set your environment variables:

   ```
   GEMINI_API_KEY=your_api_key_here
   SPACE_ID=your_hf_space_id
   ```
3. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```
4. Run locally:

   ```bash
   python app.py
   ```

Or launch directly via [Hugging Face Spaces](https://huggingface.co/spaces/).
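
As a rough sketch of how the two variables from step 2 might be read at startup, the snippet below validates them before the agent is built. The `Settings` type and `load_settings` helper are hypothetical, and treating `SPACE_ID` as optional is an assumption rather than something the README states.

```python
import os
from typing import Mapping, NamedTuple, Optional


class Settings(NamedTuple):
    gemini_api_key: str
    space_id: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the two variables from step 2, failing early if the key is missing."""
    api_key = env.get("GEMINI_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("GEMINI_API_KEY is not set")
    # Assumption: SPACE_ID may be absent when running outside a Space.
    space_id = env.get("SPACE_ID", "").strip() or None
    return Settings(gemini_api_key=api_key, space_id=space_id)


settings = load_settings(
    {"GEMINI_API_KEY": "your_api_key_here", "SPACE_ID": "your_hf_space_id"}
)
print(settings.space_id)
```

Passing the environment as a mapping keeps the helper easy to test; calling `load_settings()` with no argument reads from the real process environment.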

## 🧪 Evaluation Flow

1. Log in to Hugging Face through the UI
2. Click “Run Evaluation & Submit All Answers”
3. The agent will fetch tasks, solve them, and submit results

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference