Duibonduil committed
Commit 68e0793 · verified · 1 Parent(s): 987d77e

Upload 7 files
examples/open_deep_research/README.md ADDED
@@ -0,0 +1,64 @@
1
+ # Open Deep Research
2
+
3
+ Welcome to this open replication of [OpenAI's Deep Research](https://openai.com/index/introducing-deep-research/)! This agent aims to reproduce OpenAI's system and reach similar performance on research tasks.
4
+
5
+ Read more about this implementation's goal and methods in our [blog post](https://huggingface.co/blog/open-deep-research).
6
+
7
+
8
+ This agent achieves **55% pass@1** on the GAIA validation set, compared to **67%** for the original Deep Research.
9
+
10
+ ## Setup
11
+
12
+ To get started, follow the steps below:
13
+
14
+ ### Clone the repository
15
+
16
+ ```bash
17
+ git clone https://github.com/huggingface/smolagents.git
18
+ cd smolagents/examples/open_deep_research
19
+ ```
20
+
21
+ ### Install dependencies
22
+
23
+ Run the following command to install the required dependencies from the `requirements.txt` file:
24
+
25
+ ```bash
26
+ pip install -r requirements.txt
27
+ ```
28
+
29
+ ### Install the development version of `smolagents`
30
+
31
+ ```bash
32
+ pip install -e ../../.[dev]
33
+ ```
34
+
35
+ ### Set up environment variables
36
+
37
+ The agent uses the `GoogleSearchTool` for web search, which requires an environment variable with the corresponding API key, based on the selected provider:
38
+ - `SERPAPI_API_KEY` for SerpApi: [Sign up here to get a key](https://serpapi.com/users/sign_up)
39
+ - `SERPER_API_KEY` for Serper: [Sign up here to get a key](https://serper.dev/signup)
40
+
41
+ Depending on the model you want to use, you may also need to set an API key for the model provider.
42
+ For example, to use the default `o1` model, you need to set the `OPENAI_API_KEY` environment variable.
43
+ [Sign up here to get a key](https://platform.openai.com/signup).
44
+
45
+ > [!WARNING]
46
+ > The use of the default `o1` model is restricted to tier-3 access: https://help.openai.com/en/articles/10362446-api-access-to-o1-and-o3-mini
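Once the keys are set, you can quickly verify that your `.env` file exposes them before launching a run. Below is a minimal sketch using `python-dotenv` (already listed in `requirements.txt`); adjust the key names to the search provider and model you actually use:

```python
import os

from dotenv import load_dotenv

# Loads SERPAPI_API_KEY / SERPER_API_KEY, OPENAI_API_KEY, HF_TOKEN, ... from a local .env file
load_dotenv(override=True)

for key in ("SERPER_API_KEY", "OPENAI_API_KEY", "HF_TOKEN"):  # adapt to your setup
    if not os.getenv(key):
        print(f"Missing environment variable: {key}")
```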
47
+
48
+
49
+ ## Usage
50
+
51
+ You're now ready to go! Run the `run.py` script, for example:
52
+ ```bash
53
+ python run.py --model-id "o1" "Your question here!"
54
+ ```
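You can also drive the agent from Python rather than the CLI. Here is a small sketch based on the `create_agent` helper defined in `run.py` (this is how `app.py` builds its Gradio demo):

```python
from run import create_agent

# Build the manager agent with the default o1 model and ask it a question
agent = create_agent(model_id="o1")
answer = agent.run("How many studio albums did Mercedes Sosa release before 2007?")
print(answer)
```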
55
+
56
+ ## Full reproducibility of results
57
+
58
+ The data used in our GAIA submissions was augmented as follows:
59
+ - Each single-page .pdf or .xls file was opened in a file reader (macOS Sonoma Numbers or Preview), and a ".png" screenshot was taken and added to the folder.
60
+ - Then, for any file used in a question, the file-loading system checks whether a ".png" version of the file exists and, if so, loads it instead of the original (see the sketch below).
61
+
62
+ This process was done manually but could be automated.
63
+
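For illustration, here is a minimal sketch of the ".png" substitution check described above (a hypothetical helper, not the actual loading code used for the runs):

```python
from pathlib import Path


def resolve_attachment(file_path: str) -> str:
    """Prefer a ".png" screenshot of the file if one exists next to it."""
    png_candidate = Path(file_path).with_suffix(".png")
    return str(png_candidate) if png_candidate.exists() else file_path
```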
64
+ After processing, the annotated data was uploaded to a [new dataset](https://huggingface.co/datasets/smolagents/GAIA-annotated). You need to request access (it is granted instantly).
examples/open_deep_research/analysis.ipynb ADDED
@@ -0,0 +1,457 @@
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "code",
5
+ "execution_count": null,
6
+ "metadata": {},
7
+ "outputs": [],
8
+ "source": [
9
+ "!pip install plotly kaleido datasets nbformat -U -q"
10
+ ]
11
+ },
12
+ {
13
+ "cell_type": "code",
14
+ "execution_count": null,
15
+ "metadata": {},
16
+ "outputs": [],
17
+ "source": [
18
+ "import os\n",
19
+ "\n",
20
+ "import datasets\n",
21
+ "import pandas as pd\n",
22
+ "from dotenv import load_dotenv\n",
23
+ "from huggingface_hub import login\n",
24
+ "\n",
25
+ "\n",
26
+ "load_dotenv(override=True)\n",
27
+ "login(os.getenv(\"HF_TOKEN\"))\n",
28
+ "\n",
29
+ "pd.set_option(\"max_colwidth\", None)\n",
30
+ "\n",
31
+ "OUTPUT_DIR = \"output\""
32
+ ]
33
+ },
34
+ {
35
+ "cell_type": "code",
36
+ "execution_count": null,
37
+ "metadata": {},
38
+ "outputs": [],
39
+ "source": [
40
+ "eval_ds = datasets.load_dataset(\"gaia-benchmark/GAIA\", \"2023_all\")[\"validation\"]\n",
41
+ "eval_ds = eval_ds.rename_columns({\"Question\": \"question\", \"Final answer\": \"true_answer\", \"Level\": \"task\"})\n",
42
+ "eval_df = pd.DataFrame(eval_ds)"
43
+ ]
44
+ },
45
+ {
46
+ "cell_type": "markdown",
47
+ "metadata": {},
48
+ "source": [
49
+ "# 1. Load all results"
50
+ ]
51
+ },
52
+ {
53
+ "cell_type": "code",
54
+ "execution_count": 88,
55
+ "metadata": {},
56
+ "outputs": [],
57
+ "source": [
58
+ "import glob\n",
59
+ "\n",
60
+ "\n",
61
+ "results = []\n",
62
+ "for f in glob.glob(f\"{OUTPUT_DIR}/validation/*.jsonl\"):\n",
63
+ " df = pd.read_json(f, lines=True)\n",
64
+ " df[\"agent_name\"] = f.split(\"/\")[-1].split(\".\")[0]\n",
65
+ " results.append(df)\n",
66
+ "\n",
67
+ "result_df = pd.concat(results)\n",
68
+ "result_df[\"prediction\"] = result_df[\"prediction\"].fillna(\"No prediction\")"
69
+ ]
70
+ },
71
+ {
72
+ "cell_type": "code",
73
+ "execution_count": null,
74
+ "metadata": {},
75
+ "outputs": [],
76
+ "source": [
77
+ "import re\n",
78
+ "from collections import Counter\n",
79
+ "\n",
80
+ "from scripts.gaia_scorer import check_close_call, question_scorer\n",
81
+ "\n",
82
+ "\n",
83
+ "result_df[\"is_correct\"] = result_df.apply(lambda x: question_scorer(x[\"prediction\"], x[\"true_answer\"]), axis=1)\n",
84
+ "result_df[\"is_near_correct\"] = result_df.apply(\n",
85
+ " lambda x: check_close_call(x[\"prediction\"], x[\"true_answer\"], x[\"is_correct\"]),\n",
86
+ " axis=1,\n",
87
+ ")\n",
88
+ "\n",
89
+ "result_df[\"count_steps\"] = result_df[\"intermediate_steps\"].apply(len)\n",
90
+ "\n",
91
+ "\n",
92
+ "def find_attachment(question):\n",
93
+ " matches = eval_df.loc[eval_df[\"question\"].apply(lambda x: x in question), \"file_name\"]\n",
94
+ "\n",
95
+ " if len(matches) == 0:\n",
96
+ " return \"Not found\"\n",
97
+ " file_path = matches.values[0]\n",
98
+ "\n",
99
+ " if isinstance(file_path, str) and len(file_path) > 0:\n",
100
+ " return file_path.split(\".\")[-1]\n",
101
+ " else:\n",
102
+ " return \"None\"\n",
103
+ "\n",
104
+ "\n",
105
+ "result_df[\"attachment_type\"] = result_df[\"question\"].apply(find_attachment)\n",
106
+ "\n",
107
+ "\n",
108
+ "def extract_tool_calls(code):\n",
109
+ " regex = r\"\\b(\\w+)\\(\"\n",
110
+ " function_calls = [el for el in re.findall(regex, code) if el.islower()]\n",
111
+ "\n",
112
+ " function_call_counter = Counter(function_calls)\n",
113
+ " return function_call_counter\n",
114
+ "\n",
115
+ "\n",
116
+ "def sum_tool_calls(steps):\n",
117
+ " total_count = Counter()\n",
118
+ " for step in steps:\n",
119
+ " if \"llm_output\" in step:\n",
120
+ " total_count += extract_tool_calls(step[\"llm_output\"])\n",
121
+ "\n",
122
+ " return total_count\n",
123
+ "\n",
124
+ "\n",
125
+ "def get_durations(row):\n",
126
+ " # start_datetime = datetime.strptime(row['start_time'], \"%Y-%m-%d %H:%M:%S\")\n",
127
+ " # end_datetime = datetime.strptime(row['end_time'], \"%Y-%m-%d %H:%M:%S\")\n",
128
+ "\n",
129
+ " duration_timedelta = row[\"end_time\"] - row[\"start_time\"]\n",
130
+ " return int(duration_timedelta.total_seconds())\n",
131
+ "\n",
132
+ "\n",
133
+ "result_df[\"duration\"] = result_df.apply(get_durations, axis=1)\n",
134
+ "# result_df[\"tool_calls\"] = result_df[\"intermediate_steps\"].apply(sum_tool_calls)"
135
+ ]
136
+ },
137
+ {
138
+ "cell_type": "code",
139
+ "execution_count": null,
140
+ "metadata": {},
141
+ "outputs": [],
142
+ "source": [
143
+ "result_df[\"agent_name\"].value_counts()"
144
+ ]
145
+ },
146
+ {
147
+ "cell_type": "markdown",
148
+ "metadata": {},
149
+ "source": [
150
+ "# 2. Inspect specific runs"
151
+ ]
152
+ },
153
+ {
154
+ "cell_type": "code",
155
+ "execution_count": null,
156
+ "metadata": {},
157
+ "outputs": [],
158
+ "source": [
159
+ "sel_df = result_df\n",
160
+ "# sel_df = sel_df.loc[\n",
161
+ "# (result_df[\"agent_name\"].isin(list_versions))\n",
162
+ "# ]\n",
163
+ "sel_df = sel_df.reset_index(drop=True)\n",
164
+ "display(sel_df[\"agent_name\"].value_counts())\n",
165
+ "sel_df = sel_df.drop_duplicates(subset=[\"agent_name\", \"question\"])\n",
166
+ "display(sel_df.groupby(\"agent_name\")[[\"task\"]].value_counts())\n",
167
+ "print(\"Total length:\", len(sel_df), \"- is complete:\", len(sel_df) == 165)"
168
+ ]
169
+ },
170
+ {
171
+ "cell_type": "code",
172
+ "execution_count": null,
173
+ "metadata": {},
174
+ "outputs": [],
175
+ "source": [
176
+ "display(\"Average score:\", sel_df.groupby(\"agent_name\")[[\"is_correct\"]].mean().round(3))\n",
177
+ "display(\n",
178
+ " sel_df.groupby([\"agent_name\", \"task\"])[[\"is_correct\", \"is_near_correct\", \"count_steps\", \"question\", \"duration\"]]\n",
179
+ " .agg(\n",
180
+ " {\n",
181
+ " \"is_correct\": \"mean\",\n",
182
+ " \"is_near_correct\": \"mean\",\n",
183
+ " \"count_steps\": \"mean\",\n",
184
+ " \"question\": \"count\",\n",
185
+ " \"duration\": \"mean\",\n",
186
+ " }\n",
187
+ " )\n",
188
+ " .rename(columns={\"question\": \"count\"})\n",
189
+ ")"
190
+ ]
191
+ },
192
+ {
193
+ "cell_type": "code",
194
+ "execution_count": null,
195
+ "metadata": {},
196
+ "outputs": [],
197
+ "source": [
198
+ "import plotly.express as px\n",
199
+ "\n",
200
+ "\n",
201
+ "cumulative_df = (\n",
202
+ " (\n",
203
+ " sel_df.groupby(\"agent_name\")[[\"is_correct\", \"is_near_correct\"]]\n",
204
+ " .expanding(min_periods=1, axis=0, method=\"single\")\n",
205
+ " .agg({\"is_correct\": \"mean\", \"is_near_correct\": \"count\"})\n",
206
+ " .reset_index()\n",
207
+ " )\n",
208
+ " .copy()\n",
209
+ " .rename(columns={\"is_near_correct\": \"index\"})\n",
210
+ ")\n",
211
+ "cumulative_df[\"index\"] = cumulative_df[\"index\"].astype(int) - 1\n",
212
+ "\n",
213
+ "\n",
214
+ "def find_question(row):\n",
215
+ " try:\n",
216
+ " res = sel_df.loc[sel_df[\"agent_name\"] == row[\"agent_name\"], \"question\"].iloc[row[\"index\"]][:50]\n",
217
+ " return res\n",
218
+ " except Exception:\n",
219
+ " return \"\"\n",
220
+ "\n",
221
+ "\n",
222
+ "cumulative_df[\"question\"] = cumulative_df.apply(find_question, axis=1)\n",
223
+ "\n",
224
+ "px.line(\n",
225
+ " cumulative_df,\n",
226
+ " color=\"agent_name\",\n",
227
+ " x=\"index\",\n",
228
+ " y=\"is_correct\",\n",
229
+ " hover_data=\"question\",\n",
230
+ ")"
231
+ ]
232
+ },
233
+ {
234
+ "cell_type": "markdown",
235
+ "metadata": {},
236
+ "source": [
237
+ "# 3. Dive deeper into one run"
238
+ ]
239
+ },
240
+ {
241
+ "cell_type": "code",
242
+ "execution_count": null,
243
+ "metadata": {},
244
+ "outputs": [],
245
+ "source": [
246
+ "sel_df = result_df.loc[result_df[\"agent_name\"] == \"o1\"]\n",
247
+ "print(len(sel_df))"
248
+ ]
249
+ },
250
+ {
251
+ "cell_type": "markdown",
252
+ "metadata": {},
253
+ "source": [
254
+ "### Count errors"
255
+ ]
256
+ },
257
+ {
258
+ "cell_type": "code",
259
+ "execution_count": null,
260
+ "metadata": {},
261
+ "outputs": [],
262
+ "source": [
263
+ "import numpy as np\n",
264
+ "\n",
265
+ "\n",
266
+ "error_types = [\n",
267
+ " \"AgentParsingError\",\n",
268
+ " \"AgentExecutionError\",\n",
269
+ " \"AgentMaxIterationsError\",\n",
270
+ " \"AgentGenerationError\",\n",
271
+ "]\n",
272
+ "sel_df[error_types] = 0\n",
273
+ "sel_df[\"Count steps\"] = np.nan\n",
274
+ "\n",
275
+ "\n",
276
+ "def count_errors(row):\n",
277
+ " if isinstance(row[\"intermediate_steps\"], list):\n",
278
+ " row[\"Count steps\"] = len(row[\"intermediate_steps\"])\n",
279
+ " for step in row[\"intermediate_steps\"]:\n",
280
+ " if isinstance(step, dict) and \"error\" in step:\n",
281
+ " try:\n",
282
+ " row[str(step[\"error\"][\"error_type\"])] += 1\n",
283
+ " except Exception:\n",
284
+ " pass\n",
285
+ " return row\n",
286
+ "\n",
287
+ "\n",
288
+ "sel_df = sel_df.apply(count_errors, axis=1)"
289
+ ]
290
+ },
291
+ {
292
+ "cell_type": "code",
293
+ "execution_count": null,
294
+ "metadata": {},
295
+ "outputs": [],
296
+ "source": [
297
+ "import plotly.express as px\n",
298
+ "\n",
299
+ "\n",
300
+ "aggregate_errors = (\n",
301
+ " sel_df.groupby([\"is_correct\"])[error_types + [\"Count steps\"]].mean().reset_index().melt(id_vars=[\"is_correct\"])\n",
302
+ ")\n",
303
+ "\n",
304
+ "fig = px.bar(\n",
305
+ " aggregate_errors,\n",
306
+ " y=\"value\",\n",
307
+ " x=\"variable\",\n",
308
+ " color=\"is_correct\",\n",
309
+ " labels={\n",
310
+ " \"agent_name\": \"<b>Model</b>\",\n",
311
+ " \"task\": \"<b>Level</b>\",\n",
312
+ " \"aggregate_score\": \"<b>Performance</b>\",\n",
313
+ " \"value\": \"<b>Average count</b>\",\n",
314
+ " \"eval_score_GPT4\": \"<b>Score</b>\",\n",
315
+ " },\n",
316
+ ")\n",
317
+ "fig.update_layout(\n",
318
+ " height=500,\n",
319
+ " width=800,\n",
320
+ " barmode=\"group\",\n",
321
+ " bargroupgap=0.0,\n",
322
+ ")\n",
323
+ "fig.update_traces(textposition=\"outside\")\n",
324
+ "fig.write_image(\"aggregate_errors.png\", scale=3)\n",
325
+ "fig.show()"
326
+ ]
327
+ },
328
+ {
329
+ "cell_type": "markdown",
330
+ "metadata": {},
331
+ "source": [
332
+ "### Inspect result by file extension type"
333
+ ]
334
+ },
335
+ {
336
+ "cell_type": "code",
337
+ "execution_count": null,
338
+ "metadata": {},
339
+ "outputs": [],
340
+ "source": [
341
+ "display(\n",
342
+ " result_df.groupby([\"attachment_type\"])[[\"is_correct\", \"count_steps\", \"question\"]].agg(\n",
343
+ " {\"is_correct\": \"mean\", \"count_steps\": \"mean\", \"question\": \"count\"}\n",
344
+ " )\n",
345
+ ")"
346
+ ]
347
+ },
348
+ {
349
+ "cell_type": "markdown",
350
+ "metadata": {},
351
+ "source": [
352
+ "# 4. Ensembling methods"
353
+ ]
354
+ },
355
+ {
356
+ "cell_type": "code",
357
+ "execution_count": null,
358
+ "metadata": {},
359
+ "outputs": [],
360
+ "source": [
361
+ "counts = result_df[\"agent_name\"].value_counts()\n",
362
+ "long_series = result_df.loc[result_df[\"agent_name\"].isin(counts[counts > 140].index)]"
363
+ ]
364
+ },
365
+ {
366
+ "cell_type": "code",
367
+ "execution_count": null,
368
+ "metadata": {},
369
+ "outputs": [],
370
+ "source": [
371
+ "def majority_vote(df):\n",
372
+ " df = df[(df[\"prediction\"] != \"Unable to determine\") & (~df[\"prediction\"].isna()) & (df[\"prediction\"] != \"None\")]\n",
373
+ "\n",
374
+ " answer_modes = df.groupby(\"question\")[\"prediction\"].agg(lambda x: x.mode()[0]).reset_index()\n",
375
+ " first_occurrences = (\n",
376
+ " df.groupby([\"question\", \"prediction\"]).agg({\"task\": \"first\", \"is_correct\": \"first\"}).reset_index()\n",
377
+ " )\n",
378
+ " result = answer_modes.merge(first_occurrences, on=[\"question\", \"prediction\"], how=\"left\")\n",
379
+ "\n",
380
+ " return result\n",
381
+ "\n",
382
+ "\n",
383
+ "def oracle(df):\n",
384
+ " def get_first_correct_or_first_wrong(group):\n",
385
+ " correct_answers = group[group[\"is_correct\"]]\n",
386
+ " if len(correct_answers) > 0:\n",
387
+ " return correct_answers.iloc[0]\n",
388
+ " return group.iloc[0]\n",
389
+ "\n",
390
+ " result = df.groupby(\"question\").apply(get_first_correct_or_first_wrong)\n",
391
+ "\n",
392
+ " return result.reset_index(drop=True)\n",
393
+ "\n",
394
+ "\n",
395
+ "display((long_series.groupby(\"agent_name\")[\"is_correct\"].mean() * 100).round(2))\n",
396
+ "print(f\"Majority score: {majority_vote(long_series)['is_correct'].mean() * 100:.2f}\")\n",
397
+ "print(f\"Oracle score: {oracle(long_series)['is_correct'].mean() * 100:.2f}\")"
398
+ ]
399
+ },
400
+ {
401
+ "cell_type": "markdown",
402
+ "metadata": {},
403
+ "source": [
404
+ "### Submit"
405
+ ]
406
+ },
407
+ {
408
+ "cell_type": "code",
409
+ "execution_count": null,
410
+ "metadata": {},
411
+ "outputs": [],
412
+ "source": [
413
+ "agent_run = \"code_o1_04_february_submission5.jsonl\"\n",
414
+ "df = pd.read_json(f\"output/validation/{agent_run}\", lines=True)\n",
415
+ "df = df[[\"task_id\", \"prediction\", \"intermediate_steps\"]]\n",
416
+ "df = df.rename(columns={\"prediction\": \"model_answer\", \"intermediate_steps\": \"reasoning_trace\"})"
417
+ ]
418
+ },
419
+ {
420
+ "cell_type": "code",
421
+ "execution_count": null,
422
+ "metadata": {},
423
+ "outputs": [],
424
+ "source": [
425
+ "df.to_json(\"submission.jsonl\", orient=\"records\", lines=True)"
426
+ ]
427
+ },
428
+ {
429
+ "cell_type": "code",
430
+ "execution_count": null,
431
+ "metadata": {},
432
+ "outputs": [],
433
+ "source": []
434
+ }
435
+ ],
436
+ "metadata": {
437
+ "kernelspec": {
438
+ "display_name": "agents",
439
+ "language": "python",
440
+ "name": "python3"
441
+ },
442
+ "language_info": {
443
+ "codemirror_mode": {
444
+ "name": "ipython",
445
+ "version": 3
446
+ },
447
+ "file_extension": ".py",
448
+ "mimetype": "text/x-python",
449
+ "name": "python",
450
+ "nbconvert_exporter": "python",
451
+ "pygments_lexer": "ipython3",
452
+ "version": "3.12.0"
453
+ }
454
+ },
455
+ "nbformat": 4,
456
+ "nbformat_minor": 2
457
+ }
examples/open_deep_research/app.py ADDED
@@ -0,0 +1,11 @@
1
+ from run import create_agent
2
+
3
+ from smolagents.gradio_ui import GradioUI
4
+
5
+
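+ # Instantiate the Open Deep Research agent (create_agent defaults to the "o1" model) and serve it through smolagents' Gradio chat UI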
6
+ agent = create_agent()
7
+
8
+ demo = GradioUI(agent)
9
+
10
+ if __name__ == "__main__":
11
+ demo.launch()
examples/open_deep_research/requirements.txt ADDED
@@ -0,0 +1,39 @@
1
+ anthropic>=0.37.1
2
+ audioop-lts<1.0; python_version >= "3.13" # required to use pydub in Python >=3.13; LTS port of the removed Python builtin module audioop
3
+ beautifulsoup4>=4.12.3
4
+ datasets>=2.21.0
5
+ google_search_results>=2.4.2
6
+ huggingface_hub>=0.23.4
7
+ mammoth>=1.8.0
8
+ markdownify>=0.13.1
9
+ numexpr>=2.10.1
10
+ numpy>=2.1.2
11
+ openai>=1.52.2
12
+ openpyxl
13
+ pandas>=2.2.3
14
+ pathvalidate>=3.2.1
15
+ pdfminer>=20191125
16
+ pdfminer.six>=20240706
17
+ Pillow>=11.0.0
18
+ puremagic>=1.28
19
+ pypdf>=5.1.0
20
+ python-dotenv>=1.0.1
21
+ python_pptx>=1.0.2
22
+ Requests>=2.32.3
23
+ tqdm>=4.66.4
24
+ torch>=2.2.2
25
+ torchvision>=0.17.2
26
+ transformers>=4.46.0
27
+ youtube_transcript_api>=0.6.2
28
+ chess
29
+ sympy
30
+ pubchempy
31
+ Bio
32
+ scikit-learn
33
+ scipy
34
+ pydub
35
+ PyPDF2
36
+ python-pptx
37
+ torch
38
+ xlrd
39
+ SpeechRecognition
examples/open_deep_research/run.py ADDED
@@ -0,0 +1,125 @@
1
+ import argparse
2
+ import os
3
+ import threading
4
+
5
+ from dotenv import load_dotenv
6
+ from huggingface_hub import login
7
+ from scripts.text_inspector_tool import TextInspectorTool
8
+ from scripts.text_web_browser import (
9
+ ArchiveSearchTool,
10
+ FinderTool,
11
+ FindNextTool,
12
+ PageDownTool,
13
+ PageUpTool,
14
+ SimpleTextBrowser,
15
+ VisitTool,
16
+ )
17
+ from scripts.visual_qa import visualizer
18
+
19
+ from smolagents import (
20
+ CodeAgent,
21
+ GoogleSearchTool,
22
+ # InferenceClientModel,
23
+ LiteLLMModel,
24
+ ToolCallingAgent,
25
+ )
26
+
27
+
28
+ load_dotenv(override=True)
29
+ login(os.getenv("HF_TOKEN"))
30
+
31
+ append_answer_lock = threading.Lock()
32
+
33
+
34
+ def parse_args():
35
+ parser = argparse.ArgumentParser()
36
+ parser.add_argument(
37
+ "question", type=str, help="for example: 'How many studio albums did Mercedes Sosa release before 2007?'"
38
+ )
39
+ parser.add_argument("--model-id", type=str, default="o1")
40
+ return parser.parse_args()
41
+
42
+
43
+ custom_role_conversions = {"tool-call": "assistant", "tool-response": "user"}
44
+
45
+ user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36 Edg/119.0.0.0"
46
+
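+ # Configuration shared by the text-based browser tools: viewport_size is roughly how much text is shown per page, and downloaded files land in ./downloads_folder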
47
+ BROWSER_CONFIG = {
48
+ "viewport_size": 1024 * 5,
49
+ "downloads_folder": "downloads_folder",
50
+ "request_kwargs": {
51
+ "headers": {"User-Agent": user_agent},
52
+ "timeout": 300,
53
+ },
54
+ "serpapi_key": os.getenv("SERPAPI_API_KEY"),
55
+ }
56
+
57
+ os.makedirs(f"./{BROWSER_CONFIG['downloads_folder']}", exist_ok=True)
58
+
59
+
60
+ def create_agent(model_id="o1"):
61
+ model_params = {
62
+ "model_id": model_id,
63
+ "custom_role_conversions": custom_role_conversions,
64
+ "max_completion_tokens": 8192,
65
+ }
66
+ if model_id == "o1":
67
+ model_params["reasoning_effort"] = "high"
68
+ model = LiteLLMModel(**model_params)
69
+
70
+ text_limit = 100000
71
+ browser = SimpleTextBrowser(**BROWSER_CONFIG)
72
+ WEB_TOOLS = [
73
+ GoogleSearchTool(provider="serper"),
74
+ VisitTool(browser),
75
+ PageUpTool(browser),
76
+ PageDownTool(browser),
77
+ FinderTool(browser),
78
+ FindNextTool(browser),
79
+ ArchiveSearchTool(browser),
80
+ TextInspectorTool(model, text_limit),
81
+ ]
82
+ text_webbrowser_agent = ToolCallingAgent(
83
+ model=model,
84
+ tools=WEB_TOOLS,
85
+ max_steps=20,
86
+ verbosity_level=2,
87
+ planning_interval=4,
88
+ name="search_agent",
89
+ description="""A team member that will search the internet to answer your question.
90
+ Ask him for all your questions that require browsing the web.
91
+ Provide him as much context as possible, in particular if you need to search on a specific timeframe!
92
+ And don't hesitate to provide him with a complex search task, like finding a difference between two webpages.
93
+ Your request must be a real sentence, not a google search! Like "Find me this information (...)" rather than a few keywords.
94
+ """,
95
+ provide_run_summary=True,
96
+ )
97
+ text_webbrowser_agent.prompt_templates["managed_agent"]["task"] += """You can navigate to .txt online files.
98
+ If a non-html page is in another format, especially .pdf or a Youtube video, use tool 'inspect_file_as_text' to inspect it.
99
+ Additionally, if after some searching you find out that you need more information to answer the question, you can use `final_answer` with your request for clarification as argument to request for more information."""
100
+
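+ # The manager is a CodeAgent that plans and acts in Python and delegates web browsing to search_agent; additional_authorized_imports=["*"] lets its generated code import anything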
101
+ manager_agent = CodeAgent(
102
+ model=model,
103
+ tools=[visualizer, TextInspectorTool(model, text_limit)],
104
+ max_steps=12,
105
+ verbosity_level=2,
106
+ additional_authorized_imports=["*"],
107
+ planning_interval=4,
108
+ managed_agents=[text_webbrowser_agent],
109
+ )
110
+
111
+ return manager_agent
112
+
113
+
114
+ def main():
115
+ args = parse_args()
116
+
117
+ agent = create_agent(model_id=args.model_id)
118
+
119
+ answer = agent.run(args.question)
120
+
121
+ print(f"Got this answer: {answer}")
122
+
123
+
124
+ if __name__ == "__main__":
125
+ main()
examples/open_deep_research/run_gaia.py ADDED
@@ -0,0 +1,303 @@
1
+ # EXAMPLE COMMAND: from folder examples/open_deep_research, run: python run_gaia.py --concurrency 32 --run-name generate-traces-03-apr-noplanning --model-id gpt-4o
2
+ import argparse
3
+ import json
4
+ import os
5
+ import threading
6
+ from concurrent.futures import ThreadPoolExecutor, as_completed
7
+ from datetime import datetime
8
+ from pathlib import Path
9
+ from typing import Any
10
+
11
+ import datasets
12
+ import pandas as pd
13
+ from dotenv import load_dotenv
14
+ from huggingface_hub import login, snapshot_download
15
+ from scripts.reformulator import prepare_response
16
+ from scripts.run_agents import (
17
+ get_single_file_description,
18
+ get_zip_description,
19
+ )
20
+ from scripts.text_inspector_tool import TextInspectorTool
21
+ from scripts.text_web_browser import (
22
+ ArchiveSearchTool,
23
+ FinderTool,
24
+ FindNextTool,
25
+ PageDownTool,
26
+ PageUpTool,
27
+ SimpleTextBrowser,
28
+ VisitTool,
29
+ )
30
+ from scripts.visual_qa import visualizer
31
+ from tqdm import tqdm
32
+
33
+ from smolagents import (
34
+ CodeAgent,
35
+ GoogleSearchTool,
36
+ LiteLLMModel,
37
+ Model,
38
+ ToolCallingAgent,
39
+ )
40
+
41
+
42
+ load_dotenv(override=True)
43
+ login(os.getenv("HF_TOKEN"))
44
+
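+ # Serializes appends to the shared answers .jsonl file across the ThreadPoolExecutor workers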
45
+ append_answer_lock = threading.Lock()
46
+
47
+
48
+ def parse_args():
49
+ parser = argparse.ArgumentParser()
50
+ parser.add_argument("--concurrency", type=int, default=8)
51
+ parser.add_argument("--model-id", type=str, default="o1")
52
+ parser.add_argument("--run-name", type=str, required=True)
53
+ parser.add_argument("--set-to-run", type=str, default="validation")
54
+ parser.add_argument("--use-open-models", action="store_true")  # type=bool would have treated any non-empty string as True
55
+ parser.add_argument("--use-raw-dataset", action="store_true")
56
+ return parser.parse_args()
57
+
58
+
59
+ ### IMPORTANT: EVALUATION SWITCHES
60
+
61
+ print("Make sure you deactivated any VPN like Tailscale, else some URLs will be blocked!")
62
+
63
+ custom_role_conversions = {"tool-call": "assistant", "tool-response": "user"}
64
+
65
+
66
+ user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36 Edg/119.0.0.0"
67
+
68
+ BROWSER_CONFIG = {
69
+ "viewport_size": 1024 * 5,
70
+ "downloads_folder": "downloads_folder",
71
+ "request_kwargs": {
72
+ "headers": {"User-Agent": user_agent},
73
+ "timeout": 300,
74
+ },
75
+ "serpapi_key": os.getenv("SERPAPI_API_KEY"),
76
+ }
77
+
78
+ os.makedirs(f"./{BROWSER_CONFIG['downloads_folder']}", exist_ok=True)
79
+
80
+
81
+ def create_agent_team(model: Model):
82
+ text_limit = 100000
83
+ ti_tool = TextInspectorTool(model, text_limit)
84
+
85
+ browser = SimpleTextBrowser(**BROWSER_CONFIG)
86
+
87
+ WEB_TOOLS = [
88
+ GoogleSearchTool(provider="serper"),
89
+ VisitTool(browser),
90
+ PageUpTool(browser),
91
+ PageDownTool(browser),
92
+ FinderTool(browser),
93
+ FindNextTool(browser),
94
+ ArchiveSearchTool(browser),
95
+ TextInspectorTool(model, text_limit),
96
+ ]
97
+
98
+ text_webbrowser_agent = ToolCallingAgent(
99
+ model=model,
100
+ tools=WEB_TOOLS,
101
+ max_steps=20,
102
+ verbosity_level=2,
103
+ planning_interval=4,
104
+ name="search_agent",
105
+ description="""A team member that will search the internet to answer your question.
106
+ Ask him for all your questions that require browsing the web.
107
+ Provide him as much context as possible, in particular if you need to search on a specific timeframe!
108
+ And don't hesitate to provide him with a complex search task, like finding a difference between two webpages.
109
+ Your request must be a real sentence, not a google search! Like "Find me this information (...)" rather than a few keywords.
110
+ """,
111
+ provide_run_summary=True,
112
+ )
113
+ text_webbrowser_agent.prompt_templates["managed_agent"]["task"] += """You can navigate to .txt online files.
114
+ If a non-html page is in another format, especially .pdf or a Youtube video, use tool 'inspect_file_as_text' to inspect it.
115
+ Additionally, if after some searching you find out that you need more information to answer the question, you can use `final_answer` with your request for clarification as argument to request for more information."""
116
+
117
+ manager_agent = CodeAgent(
118
+ model=model,
119
+ tools=[visualizer, ti_tool],
120
+ max_steps=12,
121
+ verbosity_level=2,
122
+ additional_authorized_imports=["*"],
123
+ planning_interval=4,
124
+ managed_agents=[text_webbrowser_agent],
125
+ )
126
+ return manager_agent
127
+
128
+
129
+ def load_gaia_dataset(use_raw_dataset: bool, set_to_run: str) -> datasets.Dataset:
130
+ if not os.path.exists("data/gaia"):
131
+ if use_raw_dataset:
132
+ snapshot_download(
133
+ repo_id="gaia-benchmark/GAIA",
134
+ repo_type="dataset",
135
+ local_dir="data/gaia",
136
+ ignore_patterns=[".gitattributes", "README.md"],
137
+ )
138
+ else:
139
+ # WARNING: this dataset is gated: make sure you visit the repo to request access.
140
+ snapshot_download(
141
+ repo_id="smolagents/GAIA-annotated",
142
+ repo_type="dataset",
143
+ local_dir="data/gaia",
144
+ ignore_patterns=[".gitattributes", "README.md"],
145
+ )
146
+
147
+ def preprocess_file_paths(row):
148
+ if len(row["file_name"]) > 0:
149
+ row["file_name"] = f"data/gaia/{set_to_run}/" + row["file_name"]
150
+ return row
151
+
152
+ eval_ds = datasets.load_dataset(
153
+ "data/gaia/GAIA.py",
154
+ name="2023_all",
155
+ split=set_to_run,
156
+ # data_files={"validation": "validation/metadata.jsonl", "test": "test/metadata.jsonl"},
157
+ )
158
+
159
+ eval_ds = eval_ds.rename_columns({"Question": "question", "Final answer": "true_answer", "Level": "task"})
160
+ eval_ds = eval_ds.map(preprocess_file_paths)
161
+ return eval_ds
162
+
163
+
164
+ def append_answer(entry: dict, jsonl_file: str) -> None:
165
+ jsonl_path = Path(jsonl_file)
166
+ jsonl_path.parent.mkdir(parents=True, exist_ok=True)
167
+ with append_answer_lock, open(jsonl_file, "a", encoding="utf-8") as fp:
168
+ fp.write(json.dumps(entry) + "\n")
169
+ assert jsonl_path.exists(), "File not found!"
170
+ print("Answer exported to file:", jsonl_path.resolve())
171
+
172
+
173
+ def answer_single_question(
174
+ example: dict, model_id: str, answers_file: str, visual_inspection_tool: TextInspectorTool
175
+ ) -> None:
176
+ model_params: dict[str, Any] = {
177
+ "model_id": model_id,
178
+ "custom_role_conversions": custom_role_conversions,
179
+ }
180
+ if model_id == "o1":
181
+ model_params["reasoning_effort"] = "high"
182
+ model_params["max_completion_tokens"] = 8192
183
+ else:
184
+ model_params["max_tokens"] = 4096
185
+ model = LiteLLMModel(**model_params)
186
+ # model = InferenceClientModel(model_id="Qwen/Qwen3-32B", provider="novita", max_tokens=4096)
187
+ document_inspection_tool = TextInspectorTool(model, 100000)
188
+
189
+ agent = create_agent_team(model)
190
+
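+ # Prepend a strongly-worded instruction to every GAIA question, then describe any attached file so the agent knows what it is working with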
191
+ augmented_question = """You have one question to answer. It is paramount that you provide a correct answer.
192
+ Give it all you can: I know for a fact that you have access to all the relevant tools to solve it and find the correct answer (the answer does exist).
193
+ Failure or 'I cannot answer' or 'None found' will not be tolerated, success will be rewarded.
194
+ Run verification steps if that's needed, you must make sure you find the correct answer! Here is the task:
195
+
196
+ """ + example["question"]
197
+
198
+ if example["file_name"]:
199
+ if ".zip" in example["file_name"]:
200
+ prompt_use_files = "\n\nTo solve the task above, you will have to use these attached files:\n"
201
+ prompt_use_files += get_zip_description(
202
+ example["file_name"], example["question"], visual_inspection_tool, document_inspection_tool
203
+ )
204
+ else:
205
+ prompt_use_files = "\n\nTo solve the task above, you will have to use this attached file:\n"
206
+ prompt_use_files += get_single_file_description(
207
+ example["file_name"], example["question"], visual_inspection_tool, document_inspection_tool
208
+ )
209
+ augmented_question += prompt_use_files
210
+
211
+ start_time = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
212
+ try:
213
+ # Run agent 🚀
214
+ final_result = agent.run(augmented_question)
215
+
216
+ agent_memory = agent.write_memory_to_messages()
217
+
218
+ final_result = prepare_response(augmented_question, agent_memory, reformulation_model=model)
219
+
220
+ output = str(final_result)
221
+ for memory_step in agent.memory.steps:
222
+ memory_step.model_input_messages = None
223
+ intermediate_steps = agent_memory
224
+
225
+ # Check for parsing errors which indicate the LLM failed to follow the required format
226
+ parsing_error = any("AgentParsingError" in str(step) for step in intermediate_steps)
227
+
228
+ # check if iteration limit exceeded
229
+ iteration_limit_exceeded = "Agent stopped due to iteration limit or time limit." in output
230
+ raised_exception = False
231
+
232
+ except Exception as e:
233
+ print("Error on ", augmented_question, e)
234
+ output = None
235
+ intermediate_steps = []
236
+ parsing_error = False
237
+ iteration_limit_exceeded = False
238
+ exception = e
239
+ raised_exception = True
240
+ end_time = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
241
+ token_counts_manager = agent.monitor.get_total_token_counts()
242
+ token_counts_web = list(agent.managed_agents.values())[0].monitor.get_total_token_counts()
243
+ total_token_counts = {
244
+ "input": token_counts_manager["input"] + token_counts_web["input"],
245
+ "output": token_counts_manager["output"] + token_counts_web["output"],
246
+ }
247
+ annotated_example = {
248
+ "agent_name": model.model_id,
249
+ "question": example["question"],
250
+ "augmented_question": augmented_question,
251
+ "prediction": output,
252
+ "intermediate_steps": intermediate_steps,
253
+ "parsing_error": parsing_error,
254
+ "iteration_limit_exceeded": iteration_limit_exceeded,
255
+ "agent_error": str(exception) if raised_exception else None,
256
+ "task": example["task"],
257
+ "task_id": example["task_id"],
258
+ "true_answer": example["true_answer"],
259
+ "start_time": start_time,
260
+ "end_time": end_time,
261
+ "token_counts": total_token_counts,
262
+ }
263
+ append_answer(annotated_example, answers_file)
264
+
265
+
266
+ def get_examples_to_answer(answers_file: str, eval_ds: datasets.Dataset) -> list[dict]:
267
+ print(f"Loading answers from {answers_file}...")
268
+ try:
269
+ done_questions = pd.read_json(answers_file, lines=True)["question"].tolist()
270
+ print(f"Found {len(done_questions)} previous results!")
271
+ except Exception as e:
272
+ print("Error when loading records: ", e)
273
+ print("No usable records! ▶️ Starting new.")
274
+ done_questions = []
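+ # Note: the filter below also skips examples without an attached file ("file_name" must be non-empty)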
275
+ return [line for line in eval_ds.to_list() if line["question"] not in done_questions and line["file_name"]]
276
+
277
+
278
+ def main():
279
+ args = parse_args()
280
+ print(f"Starting run with arguments: {args}")
281
+
282
+ eval_ds = load_gaia_dataset(args.use_raw_dataset, args.set_to_run)
283
+ print("Loaded evaluation dataset:")
284
+ print(pd.DataFrame(eval_ds)["task"].value_counts())
285
+
286
+ answers_file = f"output/{args.set_to_run}/{args.run_name}.jsonl"
287
+ tasks_to_run = get_examples_to_answer(answers_file, eval_ds)
288
+
289
+ with ThreadPoolExecutor(max_workers=args.concurrency) as exe:
290
+ futures = [
291
+ exe.submit(answer_single_question, example, args.model_id, answers_file, visualizer)
292
+ for example in tasks_to_run
293
+ ]
294
+ for f in tqdm(as_completed(futures), total=len(tasks_to_run), desc="Processing tasks"):
295
+ f.result()
296
+
297
+ # for example in tasks_to_run:
298
+ # answer_single_question(example, args.model_id, answers_file, visualizer)
299
+ print("All tasks processed.")
300
+
301
+
302
+ if __name__ == "__main__":
303
+ main()
examples/open_deep_research/visual_vs_text_browser.ipynb ADDED
@@ -0,0 +1,359 @@
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "markdown",
5
+ "metadata": {},
6
+ "source": [
7
+ "# Compare a text-based vs a vision-based browser\n",
8
+ "\n",
9
+ "Warning: this notebook is experimental, it probably won't work out of the box!"
10
+ ]
11
+ },
12
+ {
13
+ "cell_type": "code",
14
+ "execution_count": null,
15
+ "metadata": {},
16
+ "outputs": [],
17
+ "source": [
18
+ "!pip install \"smolagents[litellm,toolkit]\" -q"
19
+ ]
20
+ },
21
+ {
22
+ "cell_type": "code",
23
+ "execution_count": null,
24
+ "metadata": {},
25
+ "outputs": [],
26
+ "source": [
27
+ "import datasets\n",
28
+ "\n",
29
+ "\n",
30
+ "eval_ds = datasets.load_dataset(\"gaia-benchmark/GAIA\", \"2023_all\")[\"validation\"]"
31
+ ]
32
+ },
33
+ {
34
+ "cell_type": "code",
35
+ "execution_count": 3,
36
+ "metadata": {},
37
+ "outputs": [],
38
+ "source": [
39
+ "to_keep = [\n",
40
+ " \"What's the last line of the rhyme under the flavor\",\n",
41
+ " 'Of the authors (First M. Last) that worked on the paper \"Pie Menus or Linear Menus',\n",
42
+ " \"In Series 9, Episode 11 of Doctor Who, the Doctor is trapped inside an ever-shifting maze. What is this location called in the official script for the episode? Give the setting exactly as it appears in the first scene heading.\",\n",
43
+ " \"Which contributor to the version of OpenCV where support was added for the Mask-RCNN model has the same name as a former Chinese head of government when the names are transliterated to the Latin alphabet?\",\n",
44
+ " \"The photograph in the Whitney Museum of American Art's collection with accession number 2022.128 shows a person holding a book. Which military unit did the author of this book join in 1813? Answer without using articles.\",\n",
45
+ " \"I went to Virtue restaurant & bar in Chicago for my birthday on March 22, 2021 and the main course I had was delicious! Unfortunately, when I went back about a month later on April 21, it was no longer on the dinner menu.\",\n",
46
+ " \"In Emily Midkiff's June 2014 article in a journal named for the one of Hreidmar's \",\n",
47
+ " \"Under DDC 633 on Bielefeld University Library's BASE, as of 2020\",\n",
48
+ " \"In the 2018 VSCode blog post on replit.com, what was the command they clicked on in the last video to remove extra lines?\",\n",
49
+ " \"The Metropolitan Museum of Art has a portrait in its collection with an accession number of 29.100.5. Of the consecrators and co-consecrators\",\n",
50
+ " \"In Nature journal's Scientific Reports conference proceedings from 2012, in the article that did not mention plasmons or plasmonics, what nano-compound is studied?\",\n",
51
+ " 'In the year 2022, and before December, what does \"R\" stand for in the three core policies of the type of content',\n",
52
+ " \"Who nominated the only Featured Article on English Wikipedia about a dinosaur that was promoted in November 2016?\",\n",
53
+ "]\n",
54
+ "eval_ds = eval_ds.filter(lambda row: any([el in row[\"Question\"] for el in to_keep]))\n",
55
+ "eval_ds = eval_ds.rename_columns({\"Question\": \"question\", \"Final answer\": \"true_answer\", \"Level\": \"task\"})"
56
+ ]
57
+ },
58
+ {
59
+ "cell_type": "code",
60
+ "execution_count": null,
61
+ "metadata": {},
62
+ "outputs": [],
63
+ "source": [
64
+ "import os\n",
65
+ "\n",
66
+ "from dotenv import load_dotenv\n",
67
+ "from huggingface_hub import login\n",
68
+ "\n",
69
+ "\n",
70
+ "load_dotenv(override=True)\n",
71
+ "\n",
72
+ "login(os.getenv(\"HF_TOKEN\"))"
73
+ ]
74
+ },
75
+ {
76
+ "cell_type": "markdown",
77
+ "metadata": {},
78
+ "source": [
79
+ "### Text browser"
80
+ ]
81
+ },
82
+ {
83
+ "cell_type": "code",
84
+ "execution_count": null,
85
+ "metadata": {},
86
+ "outputs": [],
87
+ "source": [
88
+ "from scripts.run_agents import answer_questions\n",
89
+ "from scripts.text_inspector_tool import TextInspectorTool\n",
90
+ "from scripts.text_web_browser import (\n",
91
+ " ArchiveSearchTool,\n",
92
+ " FinderTool,\n",
93
+ " FindNextTool,\n",
94
+ " NavigationalSearchTool,\n",
95
+ " PageDownTool,\n",
96
+ " PageUpTool,\n",
97
+ " SearchInformationTool,\n",
98
+ " VisitTool,\n",
99
+ ")\n",
100
+ "from scripts.visual_qa import VisualQAGPT4Tool\n",
101
+ "\n",
102
+ "from smolagents import CodeAgent, LiteLLMModel\n",
103
+ "\n",
104
+ "\n",
105
+ "proprietary_model = LiteLLMModel(model_id=\"gpt-4o\")"
106
+ ]
107
+ },
108
+ {
109
+ "cell_type": "code",
110
+ "execution_count": null,
111
+ "metadata": {},
112
+ "outputs": [],
113
+ "source": [
114
+ "### BUILD AGENTS & TOOLS\n",
115
+ "\n",
116
+ "WEB_TOOLS = [\n",
117
+ " SearchInformationTool(),\n",
118
+ " NavigationalSearchTool(),\n",
119
+ " VisitTool(),\n",
120
+ " PageUpTool(),\n",
121
+ " PageDownTool(),\n",
122
+ " FinderTool(),\n",
123
+ " FindNextTool(),\n",
124
+ " ArchiveSearchTool(),\n",
125
+ "]\n",
126
+ "\n",
127
+ "\n",
128
+ "surfer_agent = CodeAgent(\n",
129
+ " model=proprietary_model,\n",
130
+ " tools=WEB_TOOLS,\n",
131
+ " max_steps=20,\n",
132
+ " verbosity_level=2,\n",
133
+ ")\n",
134
+ "\n",
135
+ "results_text = answer_questions(\n",
136
+ " eval_ds,\n",
137
+ " surfer_agent,\n",
138
+ " \"code_gpt4o_27-01_text\",\n",
139
+ " reformulation_model=proprietary_model,\n",
140
+ " output_folder=\"output_browsers\",\n",
141
+ " visual_inspection_tool=VisualQAGPT4Tool(),\n",
142
+ " text_inspector_tool=TextInspectorTool(proprietary_model, 40000),\n",
143
+ ")"
144
+ ]
145
+ },
146
+ {
147
+ "cell_type": "markdown",
148
+ "metadata": {},
149
+ "source": [
150
+ "### Vision browser"
151
+ ]
152
+ },
153
+ {
154
+ "cell_type": "code",
155
+ "execution_count": null,
156
+ "metadata": {},
157
+ "outputs": [],
158
+ "source": [
159
+ "!pip install helium -q"
160
+ ]
161
+ },
162
+ {
163
+ "cell_type": "code",
164
+ "execution_count": null,
165
+ "metadata": {},
166
+ "outputs": [],
167
+ "source": [
168
+ "from scripts.visual_qa import VisualQAGPT4Tool\n",
169
+ "\n",
170
+ "from smolagents import CodeAgent, LiteLLMModel, WebSearchTool\n",
171
+ "from smolagents.vision_web_browser import (\n",
172
+ " close_popups,\n",
173
+ " go_back,\n",
174
+ " helium_instructions,\n",
175
+ " initialize_agent,\n",
176
+ " save_screenshot,\n",
177
+ " search_item_ctrl_f,\n",
178
+ ")\n",
179
+ "\n",
180
+ "\n",
181
+ "proprietary_model = LiteLLMModel(model_id=\"gpt-4o\")\n",
182
+ "vision_browser_agent = initialize_agent(proprietary_model)\n",
183
+ "### BUILD AGENTS & TOOLS\n",
184
+ "\n",
185
+ "CodeAgent(\n",
186
+ " tools=[WebSearchTool(), go_back, close_popups, search_item_ctrl_f],\n",
187
+ " model=proprietary_model,\n",
188
+ " additional_authorized_imports=[\"helium\"],\n",
189
+ " step_callbacks=[save_screenshot],\n",
190
+ " max_steps=20,\n",
191
+ " verbosity_level=2,\n",
192
+ ")\n",
193
+ "\n",
194
+ "results_vision = answer_questions(\n",
195
+ " eval_ds,\n",
196
+ " vision_browser_agent,\n",
197
+ " \"code_gpt4o_27-01_vision\",\n",
198
+ " reformulation_model=proprietary_model,\n",
199
+ " output_folder=\"output_browsers\",\n",
200
+ " visual_inspection_tool=VisualQAGPT4Tool(),\n",
201
+ " text_inspector_tool=TextInspectorTool(proprietary_model, 40000),\n",
202
+ " postprompt=helium_instructions\n",
203
+ " + \"Any web browser controls won't work on .pdf urls, rather use the tool 'inspect_file_as_text' to read them\",\n",
204
+ ")"
205
+ ]
206
+ },
207
+ {
208
+ "cell_type": "markdown",
209
+ "metadata": {},
210
+ "source": [
211
+ "### Browser-use browser"
212
+ ]
213
+ },
214
+ {
215
+ "cell_type": "code",
216
+ "execution_count": null,
217
+ "metadata": {},
218
+ "outputs": [],
219
+ "source": [
220
+ "!pip install browser-use lxml_html_clean -q\n",
221
+ "!playwright install"
222
+ ]
223
+ },
224
+ {
225
+ "cell_type": "code",
226
+ "execution_count": null,
227
+ "metadata": {},
228
+ "outputs": [],
229
+ "source": [
230
+ "import asyncio\n",
231
+ "\n",
232
+ "import nest_asyncio\n",
233
+ "\n",
234
+ "\n",
235
+ "nest_asyncio.apply()\n",
236
+ "\n",
237
+ "from browser_use import Agent\n",
238
+ "from dotenv import load_dotenv\n",
239
+ "from langchain_openai import ChatOpenAI\n",
240
+ "\n",
241
+ "\n",
242
+ "load_dotenv()\n",
243
+ "\n",
244
+ "\n",
245
+ "class BrowserUseAgent:\n",
246
+ " logs = []\n",
247
+ "\n",
248
+ " def write_inner_memory_from_logs(self, summary_mode):\n",
249
+ " return self.results\n",
250
+ "\n",
251
+ " def run(self, task, **kwargs):\n",
252
+ " agent = Agent(\n",
253
+ " task=task,\n",
254
+ " llm=ChatOpenAI(model=\"gpt-4o\"),\n",
255
+ " )\n",
256
+ " self.results = asyncio.get_event_loop().run_until_complete(agent.run())\n",
257
+ " return self.results.history[-1].result[0].extracted_content\n",
258
+ "\n",
259
+ "\n",
260
+ "browser_use_agent = BrowserUseAgent()\n",
261
+ "\n",
262
+ "results_browseruse = answer_questions(\n",
263
+ " eval_ds,\n",
264
+ " browser_use_agent,\n",
265
+ " \"gpt-4o_27-01_browseruse\",\n",
266
+ " reformulation_model=proprietary_model,\n",
267
+ " output_folder=\"output_browsers\",\n",
268
+ " visual_inspection_tool=VisualQAGPT4Tool(),\n",
269
+ " text_inspector_tool=TextInspectorTool(proprietary_model, 40000),\n",
270
+ " postprompt=\"\",\n",
271
+ " run_simple=True,\n",
272
+ ")"
273
+ ]
274
+ },
275
+ {
276
+ "cell_type": "markdown",
277
+ "metadata": {},
278
+ "source": [
279
+ "### Get results"
280
+ ]
281
+ },
282
+ {
283
+ "cell_type": "code",
284
+ "execution_count": null,
285
+ "metadata": {},
286
+ "outputs": [],
287
+ "source": [
288
+ "import pandas as pd\n",
289
+ "from scripts.gaia_scorer import question_scorer\n",
290
+ "\n",
291
+ "\n",
292
+ "results_vision, results_text, results_browseruse = (\n",
293
+ " pd.DataFrame(results_vision),\n",
294
+ " pd.DataFrame(results_text),\n",
295
+ " pd.DataFrame(results_browseruse),\n",
296
+ ")\n",
297
+ "\n",
298
+ "results_vision[\"is_correct\"] = results_vision.apply(\n",
299
+ " lambda x: question_scorer(x[\"prediction\"], x[\"true_answer\"]), axis=1\n",
300
+ ")\n",
301
+ "results_text[\"is_correct\"] = results_text.apply(lambda x: question_scorer(x[\"prediction\"], x[\"true_answer\"]), axis=1)\n",
302
+ "results_browseruse[\"is_correct\"] = results_browseruse.apply(\n",
303
+ " lambda x: question_scorer(x[\"prediction\"], x[\"true_answer\"]), axis=1\n",
304
+ ")"
305
+ ]
306
+ },
307
+ {
308
+ "cell_type": "code",
309
+ "execution_count": null,
310
+ "metadata": {},
311
+ "outputs": [],
312
+ "source": [
313
+ "results = pd.concat([results_vision, results_text, results_browseruse])\n",
314
+ "results.groupby(\"agent_name\")[\"is_correct\"].mean()"
315
+ ]
316
+ },
317
+ {
318
+ "cell_type": "code",
319
+ "execution_count": null,
320
+ "metadata": {},
321
+ "outputs": [],
322
+ "source": [
323
+ "correct_vision_results = results_vision.loc[results_vision[\"is_correct\"]]\n",
324
+ "correct_vision_results"
325
+ ]
326
+ },
327
+ {
328
+ "cell_type": "code",
329
+ "execution_count": null,
330
+ "metadata": {},
331
+ "outputs": [],
332
+ "source": [
333
+ "false_text_results = results_text.loc[~results_text[\"is_correct\"]]\n",
334
+ "false_text_results"
335
+ ]
336
+ }
337
+ ],
338
+ "metadata": {
339
+ "kernelspec": {
340
+ "display_name": "gaia",
341
+ "language": "python",
342
+ "name": "python3"
343
+ },
344
+ "language_info": {
345
+ "codemirror_mode": {
346
+ "name": "ipython",
347
+ "version": 3
348
+ },
349
+ "file_extension": ".py",
350
+ "mimetype": "text/x-python",
351
+ "name": "python",
352
+ "nbconvert_exporter": "python",
353
+ "pygments_lexer": "ipython3",
354
+ "version": "3.12.0"
355
+ }
356
+ },
357
+ "nbformat": 4,
358
+ "nbformat_minor": 2
359
+ }