dlaima committed on
Commit 2b4218b · verified · 1 Parent(s): 9715e6d

Update README.md

  1. README.md +68 -1
README.md CHANGED
@@ -8,8 +8,75 @@ sdk_version: 5.25.2
8
  app_file: app.py
9
  pinned: false
10
  hf_oauth: true
11
- # optional, default duration is 8 hours/480 minutes. Max duration is 30 days/43200 minutes.
12
  hf_oauth_expiration_minutes: 480
 
13
  ---
14
 
15
  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
8
  app_file: app.py
9
  pinned: false
10
  hf_oauth: true
 
11
  hf_oauth_expiration_minutes: 480
12
+ short_description: Gemini Agent for GAIA Evaluation
13
  ---
14
 
15
+ # 🧠 Gemini Agent for GAIA Evaluation
16
+
17
+ This project contains a Gemini-powered `CodeAgent` built with [smolagents](https://github.com/huggingface/smolagents) for use
18
+ in the **GAIA Unit 4 Evaluation** of the [Hugging Face Agents Course](https://hf.co/learn/agents-course/unit0/introduction), run as a Hugging Face Space.
19
+
20
+ ## 🚀 Features
21
+
22
+ * Uses the **Gemini 2.0 Flash** model via `LiteLLMModel`
23
+ * Equipped with essential tools:
24
+
25
+   * `DuckDuckGoSearchTool` for quick lookups
26
+   * `RunPythonFileTool` for executing `.py` scripts
27
+   * `ReverseTextTool` for decoding reversed questions
28
+   * `download_server` for fetching files from URLs
29
+   * Base tools (math, string manipulation, etc.)
30
+
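As an illustration of the simplest tool in the list above, decoding a reversed question can be a one-line string reversal. The sketch below is a plain Python function, not the repository's actual `ReverseTextTool`; the name `reverse_text` is hypothetical, and how such a function is registered with smolagents is deliberately left out.

```python
def reverse_text(text: str) -> str:
    """Return the input with its characters in reverse order."""
    return text[::-1]


if __name__ == "__main__":
    # A question written backwards becomes readable again after reversal.
    print(reverse_text(".rewsna eht sa tfel etirw"))
```

Keeping the logic in a pure function like this makes it easy to test separately from the agent.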
31
+ ## 📋 Evaluation Strategy
32
+
33
+ The agent reads questions from the GAIA evaluation endpoint, applies reasoning using a system prompt with strict
34
+ guidelines, and submits answers back for scoring.
35
+
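The read, reason, and submit cycle described above might be organized roughly as follows. This is a minimal sketch and not the code in `app.py`: the function name `build_answers` is hypothetical, the field names `task_id`, `question`, and `submitted_answer` are assumptions about the evaluation API's payload, and the agent is passed in as a plain callable so the example needs no network access.

```python
from typing import Callable, Iterable


def build_answers(
    questions: Iterable[dict],
    agent: Callable[[str], str],
) -> list[dict]:
    """Run the agent on each question and collect answers for submission."""
    answers = []
    for item in questions:
        try:
            answer = agent(item["question"])
        except Exception as exc:
            # Record the failure and continue, so one bad task
            # does not abort the whole evaluation run.
            answer = f"AGENT ERROR: {exc}"
        # Field names here are assumed, not taken from the source.
        answers.append({"task_id": item["task_id"], "submitted_answer": answer})
    return answers


if __name__ == "__main__":
    sample = [{"task_id": "t1", "question": "What is 2 + 2?"}]
    print(build_answers(sample, lambda question: "4"))
```

Separating answer collection from the HTTP calls keeps the loop testable with a stub agent before any real submission is made.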
36
+
37
+ ## 🛠️ Setup
38
+
39
+ 1. Clone this repository or Space
40
+ 2. Set your environment variables:
41
+
42
+ ```
43
+ GEMINI_API_KEY=your_api_key_here
44
+ SPACE_ID=your_hf_space_id
45
+ ```
46
+ 3. Install dependencies:
47
+
48
+ ```bash
49
+ pip install -r requirements.txt
50
+ ```
51
+ 4. Run locally:
52
+
53
+ ```bash
54
+ python app.py
55
+ ```
56
+
57
+ Or launch directly via [Hugging Face Spaces](https://huggingface.co/spaces/).
58
+
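The two environment variables from the setup steps can be checked once at startup so that a missing key fails early with a clear message. The following is a sketch under the assumption that both variables are required; the helper name `load_settings` and the constant `REQUIRED_VARS` are hypothetical and do not come from the repository.

```python
import os
from typing import Mapping

# Variable names taken from the setup steps; treating both as required is an assumption.
REQUIRED_VARS = ("GEMINI_API_KEY", "SPACE_ID")


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Return the required settings, or raise if any are missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling `load_settings()` with no arguments reads from the process environment; passing a dictionary instead makes the check easy to exercise in tests.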
59
+ ## 🧪 Evaluation Flow
60
+
61
+ 1. Log in to Hugging Face through the UI
62
+ 2. Click “Run Evaluation & Submit All Answers”
63
+ 3. The agent will fetch tasks, solve them, and submit results
64
+
65
+
66
+
67
+
68
+
69
+
70
+
71
+
72
+
73
+
74
+
75
+
76
+
77
+
78
+
79
+
80
+
81
+
82
  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference