Yago Bolivar committed on
Commit
a3c3cd5
·
1 Parent(s): 13efa1c

chore: remove outdated evaluation script documentation

Files changed (1)
  1. docs/evaluate_local_commands.md +0 -17
docs/evaluate_local_commands.md DELETED
@@ -1,17 +0,0 @@
- **Run the Evaluation Script:** Open your terminal, navigate to the `utilities` directory, and run the script:
-
- * **Evaluate all levels:**
- ```bash
- cd /Users/yagoairm2/Desktop/agents/final\ projectHF_Agents_Final_Project/utilities
- python evaluate_local.py --answers_file ../agent_answers.json
- ```
- * **Evaluate only Level 1:**
- ```bash
- python evaluate_local.py --answers_file ../agent_answers.json --level 1
- ```
- * **Evaluate Level 1 and show incorrect answers:**
- ```bash
- python evaluate_local.py --answers_file ../agent_answers.json --level 1 --verbose
- ```
-
- This script will calculate and print the accuracy based on the exact match criterion used by GAIA, without submitting anything to the official leaderboard.
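
The removed documentation says the script scores answers by exact match. As a rough illustration only, a minimal exact-match accuracy check might look like the sketch below. The function names, the dictionary layout keyed by task ID, and the whitespace/case normalization are all assumptions made for this example. They are not taken from `evaluate_local.py`, and GAIA's official scorer may normalize answers differently.

```python
# Hypothetical sketch of exact-match scoring; not the actual evaluate_local.py logic.

def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace (an assumed normalization, not GAIA's)."""
    return " ".join(answer.strip().lower().split())


def exact_match_accuracy(predictions: dict, ground_truth: dict) -> float:
    """Return the fraction of ground-truth tasks whose prediction matches exactly.

    Tasks missing from `predictions` count as incorrect.
    """
    if not ground_truth:
        return 0.0
    correct = sum(
        1
        for task_id, truth in ground_truth.items()
        if task_id in predictions
        and normalize(predictions[task_id]) == normalize(truth)
    )
    return correct / len(ground_truth)


# Made-up sample data for illustration.
predictions = {"t1": "Paris", "t2": " 42 ", "t3": "blue"}
ground_truth = {"t1": "paris", "t2": "42", "t3": "red"}

print(f"Accuracy: {exact_match_accuracy(predictions, ground_truth):.2%}")
```

Dividing by the number of ground-truth tasks rather than the number of submitted answers means skipped questions lower the score, which is one plausible reading of "accuracy" here.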