Ahmud's picture
agents
9c9b3ff
|
raw
history blame
3.28 kB

Hugging Face AI Agents Course - Final Exam Agent

This project contains an AI agent developed for the final exam of the Hugging Face AI Agents Course. The agent is designed to answer a variety of questions by leveraging a suite of powerful tools and a language model.

Overview

This agent is built using the LangGraph library to create a robust and stateful agent. It can perform a variety of tasks, including web searches, calculations, code execution, and processing different types of media like audio, images, and documents. The project includes a Gradio application for evaluating the agent's performance on a set of questions provided by the course.

Features

  • Multi-tool Integration: The agent can use a wide range of tools to solve complex problems.
  • Conversational AI: Powered by a capable language model from OpenRouter.
  • Stateful Execution: Uses LangGraph to manage the conversation flow and tool execution in a structured manner.
  • Web Interface: A Gradio app (app.py) is provided to test and evaluate the agent.
  • Extensible: New tools can be easily added to enhance the agent's capabilities.

Tools

The agent has access to the following tools:

Community Tools

  • Brave Search: Performs web searches to find up-to-date information.
  • Python REPL: Executes Python code to solve logic and math problems.

Custom Tools

  • Calculator:
    • add(a, b): Adds two numbers.
    • subtract(a, b): Subtracts two numbers.
    • multiply(a, b): Multiplies two numbers.
    • divide(a, b): Divides two numbers.
    • power(a, b): Calculates a to the power of b.
  • Date & Time:
    • current_date(): Returns the current date.
    • day_of_week(): Returns the current day of the week.
    • days_until(date_str): Calculates the number of days until a given date.
  • Media Processing:
    • transcribe_audio(audio_file, file_extension): Transcribes audio files.
    • transcribe_youtube(youtube_url): Transcribes YouTube videos.
    • query_image(query, image_url): Answers questions about an image.
  • Web & Document Content:
    • webpage_content(url): Extracts text from webpages and PDF files.
    • read_excel(file_path, sheet_name, query): Reads data from an Excel file and answers a query about it.

How It Works

The agent's logic is defined in agent.py. It uses a StateGraph from the LangGraph library to manage its execution flow. The graph has two main nodes:

  1. llm_call: This node calls the language model with the current conversation history and a system prompt (prompt.py). The LLM decides whether to respond directly to the user or to use one of the available tools.
  2. environment: If the LLM decides to use a tool, this node executes the tool with the arguments provided by the LLM.

The agent alternates between these two nodes until the LLM generates a final answer for the user.

Usage

1. Installation

Clone the repository and install the required dependencies:

git clone https://huggingface.co/spaces/YOUR_SPACE_HERE
cd YOUR_REPO
pip install -r requirements.txt

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference