Luong Huu Thanh committed on
Commit 490f257 · 1 Parent(s): 31d9176

Update README.md

  1. README.md +84 -1
README.md CHANGED
@@ -1 +1,84 @@
- # gaia-agent
+ # **GAIA Agent**
+
+ ## **Introduction**
+
+ **GAIA Agent** is an automated system that solves tasks from the GAIA benchmark and submits its answers. GAIA tests general-purpose AI agents on diverse, challenging tasks that combine reasoning, code execution, information retrieval, data interpretation, and multimodal understanding. The agent is powered by language models served through providers such as Hugging Face and Groq, and it is equipped with browser tools, code interpreter tools, mathematical tools, document processing tools, and image processing and generation tools. A Gradio interface handles evaluation, submission, and result display automatically.
+
+ ## **Tools Implementation**
+
+ ### **Browser tools**
+ - **Wikipedia Search:** Search Wikipedia for a query and return at most 2 results.
+ - **Web Search:** Search the web for a query and return at most 2 results.
+ - **arXiv Search:** Search arXiv for a query and return at most 2 results.
+
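The three search tools above share one behavior: each caps its output at 2 results. Below is a minimal sketch of that cap, assuming each backend returns an iterable of dictionaries. The function name `format_search_results`, the `source` and `content` keys, and the output layout are hypothetical and not taken from the repository.

```python
from typing import Iterable

MAX_RESULTS = 2  # the cap stated in the tool descriptions


def format_search_results(hits: Iterable[dict], max_results: int = MAX_RESULTS) -> str:
    """Keep at most `max_results` hits and join them into one text block.

    The "source" and "content" keys are assumptions made for this sketch.
    """
    blocks = []
    for hit in list(hits)[:max_results]:
        source = hit.get("source", "unknown")
        content = hit.get("content", "")
        blocks.append(f'<Document source="{source}">\n{content}\n</Document>')
    # An empty hit list yields an empty string.
    return "\n\n---\n\n".join(blocks)
```

Truncating before formatting keeps the text handed back to the language model short, which is presumably the reason for the small cap.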
+ ### **Code interpreter tools**
+ - **Execute Code in Multiple Languages:** Execute code in Python, Bash, SQL, C, or Java and return the results.
+
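One way such a tool could work is to write the snippet to a temporary file and run it in a subprocess with a timeout. The sketch below covers only Python and Bash; `execute_code`, the `RUNNERS` table, and the returned dictionary shape are hypothetical names chosen for illustration, and the repository's own interpreter may be structured differently.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical mapping: language name -> (file suffix, command prefix).
RUNNERS = {
    "python": (".py", [sys.executable]),
    "bash": (".sh", ["bash"]),
}


def execute_code(code: str, language: str = "python", timeout: float = 10.0) -> dict:
    """Run `code` in a subprocess and report success, stdout, and stderr."""
    language = language.lower()
    if language not in RUNNERS:
        return {"ok": False, "stdout": "", "stderr": f"Unsupported language: {language}"}
    suffix, command = RUNNERS[language]
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / f"snippet{suffix}"
        script.write_text(code, encoding="utf-8")
        try:
            proc = subprocess.run(
                command + [str(script)],
                capture_output=True,
                text=True,
                timeout=timeout,
                cwd=workdir,
            )
        except subprocess.TimeoutExpired:
            return {"ok": False, "stdout": "", "stderr": f"Timed out after {timeout} seconds"}
        except FileNotFoundError:
            return {"ok": False, "stdout": "", "stderr": f"Interpreter not found for {language}"}
    return {"ok": proc.returncode == 0, "stdout": proc.stdout, "stderr": proc.stderr}
```

A subprocess with a timeout is not a sandbox: the snippet runs with the same permissions as the agent, so untrusted code would still need container-level or OS-level isolation.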
+ ### **Mathematical tools**
+ - **Multiplication:** Multiplies 2 numbers
+ - **Addition:** Adds 2 numbers
+ - **Subtraction:** Subtracts 2 numbers
+ - **Division:** Divides one number by another
+ - **Modulus:** Returns the remainder of dividing one number by another
+ - **Power:** Raises one number to the power of another
+ - **Square root:** Returns the square root of a number
+
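Most of these operations map directly onto Python operators; the edge cases are division by zero and the square root of a negative number. The sketch below shows one possible handling of both. The function names and the choice to return a complex number for negative inputs are assumptions, not confirmed behavior of the repository's tools.

```python
import cmath
import math


def divide(a: float, b: float) -> float:
    """Divide a by b, rejecting a zero divisor with a readable error."""
    if b == 0:
        raise ValueError("Cannot divide by zero.")
    return a / b


def modulus(a: int, b: int) -> int:
    """Return the remainder of a divided by b (sign follows the divisor in Python)."""
    return a % b


def power(a: float, b: float) -> float:
    """Raise a to the power of b."""
    return a ** b


def square_root(a: float):
    """Return the real square root for a >= 0, otherwise a complex root."""
    if a >= 0:
        return math.sqrt(a)
    return cmath.sqrt(a)
```

For example, `modulus(-7, 3)` returns `2` rather than `-1`, because Python's `%` takes the sign of the divisor.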
+ ### **Document processing tools**
+ - **Save and Read File:** Save content to a file and return the path
+ - **Download a File from URL:** Download a file from a URL and save it to a temporary location
+ - **Extract Text from Image:** Extract text from an image using the pytesseract OCR library (if available)
+ - **Analyze CSV File:** Analyze a CSV file using pandas and answer a question about it
+ - **Analyze Excel File:** Analyze an Excel file using pandas and answer a question about it
+
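As a rough illustration of the save-then-analyze flow, the sketch below writes content to the system temporary directory and then summarizes a CSV file. It deliberately swaps in the standard-library `csv` module for pandas so that it runs without extra dependencies; `save_file` and `summarize_csv` are hypothetical names, not the repository's functions.

```python
import csv
import tempfile
from pathlib import Path
from typing import Optional


def save_file(content: str, filename: Optional[str] = None) -> str:
    """Write `content` to a file in the temp directory and return its path."""
    if filename is None:
        # Reserve a unique file name, then release the handle before writing.
        handle = tempfile.NamedTemporaryFile(delete=False, suffix=".txt")
        handle.close()
        path = Path(handle.name)
    else:
        path = Path(tempfile.gettempdir()) / filename
    path.write_text(content, encoding="utf-8")
    return str(path)


def summarize_csv(path: str) -> dict:
    """Return the row count and column names of a CSV file (stdlib stand-in for pandas)."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        rows = list(reader)
        columns = list(reader.fieldnames or [])
    return {"rows": len(rows), "columns": columns}
```

Returning a path instead of the content lets later tool calls, such as the CSV or Excel analysis, operate on the same file.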
+ ### **Image processing and generation tools**
+ - **Analyze Image:** Analyze basic properties of an image (size, mode, color analysis, thumbnail preview)
+ - **Transform Image:** Apply transformations: resize, rotate, crop, flip, brightness, contrast, blur, sharpen, grayscale
+ - **Draw on Image:** Draw shapes (rectangle, circle, line) or text onto an image
+ - **Generate Simple Image:** Generate a simple image (gradient, noise, pattern, chart)
+ - **Combine Images:** Combine multiple images (collage, stack, blend)
+
+
+ ## **Installation**
+ Clone the repository, then change into its root folder:
+
+ ```
+ git clone https://github.com/fisherman611/gaia-agent.git
+ ```
+ ```
+ cd gaia-agent
+ ```
+
+ Install the dependencies listed in `requirements.txt` (replace `3.11` with your installed Python version; `py` is the Windows Python launcher, so on Linux or macOS use `python3` instead):
+
+ ```
+ py -3.11 -m pip install -r requirements.txt
+ ```
+
+ ## **Environment Variables**
+ Store the following API keys in a `.env` file and load them in your code with `load_dotenv`:
+
+ ```
+ SUPABASE_URL=...
+ SUPABASE_SERVICE_ROLE_KEY=...
+ SUPABASE_SERVICE_KEY=...
+ HUGGINGFACEHUB_API_TOKEN=...
+ GROQ_API_KEY=...
+ ```
+
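A missing key usually surfaces only later as an authentication error, so it can help to check the environment at startup. The sketch below loads `.env` through `load_dotenv` from the `python-dotenv` package when it is installed and reports which of the variables listed above are unset. Treating all five as required is an assumption, and `load_env_file` and `missing_keys` are hypothetical helper names.

```python
import os
from typing import List, Mapping

# Variable names copied from the .env listing; "required" is an assumption.
REQUIRED_KEYS = [
    "SUPABASE_URL",
    "SUPABASE_SERVICE_ROLE_KEY",
    "SUPABASE_SERVICE_KEY",
    "HUGGINGFACEHUB_API_TOKEN",
    "GROQ_API_KEY",
]


def load_env_file() -> bool:
    """Load .env into os.environ if python-dotenv is installed."""
    try:
        from dotenv import load_dotenv  # provided by the python-dotenv package
    except ImportError:
        return False
    return bool(load_dotenv())


def missing_keys(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or blank in `env`."""
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]
```

Calling `load_env_file()` and then `missing_keys()` before the agent starts gives one clear message listing everything that still needs to be configured.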
+ ## **Demo**
+ To run the application from the command line (replace `3.11` with your installed Python version):
+ ```
+ py -3.11 app.py
+ ```
+
+ ## **Resources**
+ - [GAIA Benchmark](https://huggingface.co/spaces/gaia-benchmark/leaderboard)
+ - [Hugging Face Agents Course](https://huggingface.co/agents-course)
+ - [LangGraph Agents](https://langchain-ai.github.io/langgraph/)
+
+
+ ## **Contributing**
+ Contributions are welcome! If you find any issues or have suggestions for improvements, please open an issue or submit a pull request.
+
+ ## **License**
+ This project is licensed under the [MIT License](https://mit-license.org/).