Duibonduil committed on
Commit
279da32
·
verified ·
1 Parent(s): 0e79d34

Upload 4 files

docs/source/en/tutorials/inspect_runs.md ADDED
@@ -0,0 +1,178 @@
1
+ # Inspecting runs with OpenTelemetry
2
+
3
+ [[open-in-colab]]
4
+
5
+ > [!TIP]
6
+ > If you're new to building agents, make sure to first read the [intro to agents](../conceptual_guides/intro_agents) and the [guided tour of smolagents](../guided_tour).
7
+
8
+ ## Why log your agent runs?
9
+
10
+ Agent runs are complicated to debug.
11
+
12
+ Validating that a run went properly is hard, since agent workflows are [unpredictable by design](../conceptual_guides/intro_agents) (if they were predictable, you'd just be using good old code).
13
+
14
+ And inspecting a run is hard as well: multi-step agents tend to quickly fill a console with logs, and most of the errors are simply "LLM dumb" errors, which the LLM corrects on its own in the next step by writing better code or tool calls.
15
+
16
+ So using instrumentation to record agent runs is necessary in production for later inspection and monitoring!
17
+
18
+ We've adopted the [OpenTelemetry](https://opentelemetry.io/) standard for instrumenting agent runs.
19
+
20
+ This means that you can just run some instrumentation code, then run your agents normally, and everything gets logged into your platform. Below are some examples of how to do this with different OpenTelemetry backends.
21
+
22
+ Here's what it then looks like on the platform:
23
+
24
+ <div class="flex justify-center">
25
+ <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/smolagents/inspect_run_phoenix.gif"/>
26
+ </div>
27
+
28
+
29
+ ## Setting up telemetry with Arize AI Phoenix
30
+ First install the required packages. Here we install [Phoenix by Arize AI](https://github.com/Arize-ai/phoenix) because that's a good solution to collect and inspect the logs, but there are other OpenTelemetry-compatible platforms that you could use for this collection & inspection part.
31
+
32
+ ```shell
33
+ pip install 'smolagents[telemetry,toolkit]'
34
+ ```
35
+
36
+ Then run the collector in the background.
37
+
38
+ ```shell
39
+ python -m phoenix.server.main serve
40
+ ```
41
+
42
+ Finally, set up `SmolagentsInstrumentor` to trace your agents and send the traces to Phoenix's default endpoint.
43
+
44
+ ```python
45
+ from phoenix.otel import register
46
+ from openinference.instrumentation.smolagents import SmolagentsInstrumentor
47
+
48
+ register()
49
+ SmolagentsInstrumentor().instrument()
50
+ ```
51
+ Then you can run your agents!
52
+
53
+ ```py
54
+ from smolagents import (
+     CodeAgent,
+     ToolCallingAgent,
+     WebSearchTool,
+     VisitWebpageTool,
+     InferenceClientModel,
+ )
+
+ model = InferenceClientModel()
+
+ search_agent = ToolCallingAgent(
+     tools=[WebSearchTool(), VisitWebpageTool()],
+     model=model,
+     name="search_agent",
+     description="This is an agent that can do web search.",
+ )
+
+ manager_agent = CodeAgent(
+     tools=[],
+     model=model,
+     managed_agents=[search_agent],
+ )
+ manager_agent.run(
+     "If the US keeps its 2024 growth rate, how many years will it take for the GDP to double?"
+ )
79
+ ```
80
+ Voilà!
81
+ You can then navigate to `http://0.0.0.0:6006/projects/` to inspect your run!
82
+
83
+ <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/smolagents/inspect_run_phoenix.png">
84
+
85
+ You can see that the CodeAgent called its managed ToolCallingAgent (by the way, the managed agent could have been a CodeAgent as well) to ask it to run the web search for the U.S. 2024 growth rate. Then the managed agent returned its report and the manager agent acted upon it to calculate the economy doubling time! Sweet, isn't it?
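
The last step of this run, turning a growth rate into a doubling time, is plain compound-growth arithmetic. Here is a minimal sketch of that calculation. The 2.5% rate is a hypothetical placeholder, not the actual 2024 figure, and `doubling_time` is a name made up for this illustration.

```python
import math


def doubling_time(annual_growth_rate: float) -> float:
    """Years needed for a quantity to double at a constant annual growth rate."""
    if annual_growth_rate <= 0:
        raise ValueError("growth rate must be positive")
    # Solve (1 + g) ** t == 2 for t.
    return math.log(2) / math.log(1 + annual_growth_rate)


# Hypothetical growth rate, used only for illustration.
assumed_rate = 0.025
years = doubling_time(assumed_rate)
print(f"At {assumed_rate:.1%} per year, GDP doubles in about {years:.1f} years")
```

Comparing the agent's final answer against a hand calculation like this one is a quick way to check a trace for arithmetic mistakes.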
86
+
87
+ ## Setting up telemetry with Langfuse
88
+
89
+ This part shows how to monitor and debug your Hugging Face **smolagents** with **Langfuse** using the `SmolagentsInstrumentor`.
90
+
91
+ > **What is Langfuse?** [Langfuse](https://langfuse.com) is an open-source platform for LLM engineering. It provides tracing and monitoring capabilities for AI agents, helping developers debug, analyze, and optimize their products. Langfuse integrates with various tools and frameworks via native integrations, OpenTelemetry, and SDKs.
92
+
93
+ ### Step 1: Install Dependencies
94
+
95
+ ```python
96
+ %pip install smolagents
97
+ %pip install opentelemetry-sdk opentelemetry-exporter-otlp openinference-instrumentation-smolagents
98
+ ```
99
+
100
+ ### Step 2: Set Up Environment Variables
101
+
102
+ Set your Langfuse API keys and configure the OpenTelemetry endpoint to send traces to Langfuse. Get your Langfuse API keys by signing up for [Langfuse Cloud](https://cloud.langfuse.com) or [self-hosting Langfuse](https://langfuse.com/self-hosting).
103
+
104
+ Also, add your [Hugging Face token](https://huggingface.co/settings/tokens) (`HF_TOKEN`) as an environment variable.
105
+
106
+ ```python
107
+ import os
108
+ import base64
109
+
110
+ LANGFUSE_PUBLIC_KEY="pk-lf-..."
111
+ LANGFUSE_SECRET_KEY="sk-lf-..."
112
+ LANGFUSE_AUTH=base64.b64encode(f"{LANGFUSE_PUBLIC_KEY}:{LANGFUSE_SECRET_KEY}".encode()).decode()
113
+
114
+ os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = "https://cloud.langfuse.com/api/public/otel" # EU data region
115
+ # os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = "https://us.cloud.langfuse.com/api/public/otel" # US data region
116
+ os.environ["OTEL_EXPORTER_OTLP_HEADERS"] = f"Authorization=Basic {LANGFUSE_AUTH}"
117
+
118
+ # your Hugging Face token
119
+ os.environ["HF_TOKEN"] = "hf_..."
120
+ ```
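
To make the `OTEL_EXPORTER_OTLP_HEADERS` value above less opaque, here is a small standard-library sketch of how the Basic auth token is built and how it can be decoded again as a local sanity check. The key values are dummy placeholders, and the helper names are assumptions made for this illustration.

```python
import base64


def build_basic_auth(public_key: str, secret_key: str) -> str:
    """Return the base64 token used in an HTTP Basic Authorization header."""
    return base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()


def decode_basic_auth(token: str) -> tuple[str, str]:
    """Inverse of build_basic_auth, handy for checking the header locally."""
    public_key, _, secret_key = base64.b64decode(token).decode().partition(":")
    return public_key, secret_key


# Dummy placeholder keys, not real credentials.
token = build_basic_auth("pk-lf-example", "sk-lf-example")
header = f"Authorization=Basic {token}"
print(header)
print(decode_basic_auth(token))
```

If traces do not show up, decoding the token this way helps rule out a swapped or truncated key before looking at the endpoint.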
121
+
122
+ ### Step 3: Initialize the `SmolagentsInstrumentor`
123
+
124
+ Initialize the `SmolagentsInstrumentor` before your application code. Configure `tracer_provider` and add a span processor to export traces to Langfuse. `OTLPSpanExporter()` uses the endpoint and headers from the environment variables.
125
+
126
+
127
+ ```python
128
+ from opentelemetry.sdk.trace import TracerProvider
129
+
130
+ from openinference.instrumentation.smolagents import SmolagentsInstrumentor
131
+ from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
132
+ from opentelemetry.sdk.trace.export import SimpleSpanProcessor
133
+
134
+ trace_provider = TracerProvider()
135
+ trace_provider.add_span_processor(SimpleSpanProcessor(OTLPSpanExporter()))
136
+
137
+ SmolagentsInstrumentor().instrument(tracer_provider=trace_provider)
138
+ ```
139
+
140
+ ### Step 4: Run your smolagent
141
+
142
+ ```python
143
+ from smolagents import (
+     CodeAgent,
+     ToolCallingAgent,
+     WebSearchTool,
+     VisitWebpageTool,
+     InferenceClientModel,
+ )
+
+ model = InferenceClientModel(
+     model_id="deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"
+ )
+
+ search_agent = ToolCallingAgent(
+     tools=[WebSearchTool(), VisitWebpageTool()],
+     model=model,
+     name="search_agent",
+     description="This is an agent that can do web search.",
+ )
+
+ manager_agent = CodeAgent(
+     tools=[],
+     model=model,
+     managed_agents=[search_agent],
+ )
+ manager_agent.run(
+     "How can Langfuse be used to monitor and improve the reasoning and decision-making of smolagents when they execute multi-step tasks, like dynamically adjusting a recipe based on user feedback or available ingredients?"
+ )
170
+ ```
171
+
172
+ ### Step 5: View Traces in Langfuse
173
+
174
+ After running the agent, you can view the traces generated by your smolagents application in [Langfuse](https://cloud.langfuse.com). You should see detailed steps of the LLM interactions, which can help you debug and optimize your AI agent.
175
+
176
+ ![smolagents example trace](https://langfuse.com/images/cookbook/integration-smolagents/smolagent_example_trace.png)
177
+
178
+ _[Public example trace in Langfuse](https://cloud.langfuse.com/project/cloramnkj0002jz088vzn1ja4/traces/ce5160f9bfd5a6cd63b07d2bfcec6f54?timestamp=2025-02-11T09%3A25%3A45.163Z&display=details)_
docs/source/en/tutorials/memory.md ADDED
@@ -0,0 +1,134 @@
1
+ # 📚 Manage your agent's memory
2
+
3
+ [[open-in-colab]]
4
+
5
+ In the end, an agent can be defined by simple components: it has tools and prompts.
6
+ Most importantly, it has a memory of past steps, which records a history of planning, execution, and errors.
7
+
8
+ ### Replay your agent's memory
9
+
10
+ We offer several ways to inspect a past agent run.
11
+
12
+ You can instrument the agent's run to display it in a great UI that lets you zoom in/out on specific steps, as highlighted in the [instrumentation guide](./inspect_runs).
13
+
14
+ You can also use `agent.replay()`, as follows:
15
+
16
+ First, run the agent:
17
+ ```py
18
+ from smolagents import InferenceClientModel, CodeAgent
19
+
20
+ agent = CodeAgent(tools=[], model=InferenceClientModel(), verbosity_level=0)
21
+
22
+ result = agent.run("What's the 20th Fibonacci number?")
23
+ ```
24
+
25
+ If you want to replay this last run, just use:
26
+ ```py
27
+ agent.replay()
28
+ ```
29
+
30
+ ### Dynamically change the agent's memory
31
+
32
+ Many advanced use cases require dynamic modification of the agent's memory.
33
+
34
+ You can access the agent's memory using:
35
+
36
+ ```py
37
+ from smolagents import ActionStep
38
+
39
+ system_prompt_step = agent.memory.system_prompt
40
+ print("The system prompt given to the agent was:")
41
+ print(system_prompt_step.system_prompt)
42
+
43
+ task_step = agent.memory.steps[0]
44
+ print("\n\nThe first task step was:")
45
+ print(task_step.task)
46
+
47
+ for step in agent.memory.steps:
+     if isinstance(step, ActionStep):
+         if step.error is not None:
+             print(f"\nStep {step.step_number} got this error:\n{step.error}\n")
+         else:
+             print(f"\nStep {step.step_number} got these observations:\n{step.observations}\n")
53
+ ```
54
+
55
+ Use `agent.memory.get_full_steps()` to get full steps as dictionaries.
56
+
57
+ You can also use step callbacks to dynamically change the agent's memory.
58
+
59
+ Step callbacks can access the `agent` itself in their arguments, so they can access any memory step as highlighted above, and change it if needed. For instance, let's say you are observing screenshots of each step performed by a web browser agent. You want to log the newest screenshot, and remove the images from older steps to save on token costs.
60
+
61
+ You could run something like the following.
62
+ _Note: this code is incomplete. Some imports and object definitions have been removed for brevity; visit [the original script](https://github.com/huggingface/smolagents/blob/main/src/smolagents/vision_web_browser.py) to get the full working code._
63
+
64
+ ```py
65
+ import helium
+ from PIL import Image
+ from io import BytesIO
+ from time import sleep
+
+ from smolagents import ActionStep, CodeAgent
+
+ def update_screenshot(memory_step: ActionStep, agent: CodeAgent) -> None:
+     sleep(1.0)  # Let JavaScript animations happen before taking the screenshot
+     driver = helium.get_driver()
+     latest_step = memory_step.step_number
+     for previous_memory_step in agent.memory.steps:  # Remove previous screenshots from logs for lean processing
+         if isinstance(previous_memory_step, ActionStep) and previous_memory_step.step_number <= latest_step - 2:
+             previous_memory_step.observations_images = None
+     png_bytes = driver.get_screenshot_as_png()
+     image = Image.open(BytesIO(png_bytes))
+     memory_step.observations_images = [image.copy()]
80
+ ```
81
+
82
+ Then you should pass this function in the `step_callbacks` argument upon initialization of your agent:
83
+
84
+ ```py
85
+ CodeAgent(
+     tools=[WebSearchTool(), go_back, close_popups, search_item_ctrl_f],
+     model=model,
+     additional_authorized_imports=["helium"],
+     step_callbacks=[update_screenshot],
+     max_steps=20,
+     verbosity_level=2,
+ )
93
+ ```
94
+
95
+ Head to our [vision web browser code](https://github.com/huggingface/smolagents/blob/main/src/smolagents/vision_web_browser.py) to see the full working example.
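
The pruning rule inside `update_screenshot` can be tried without a browser. The sketch below uses a hypothetical `FakeStep` dataclass as a stand-in for `ActionStep` (it is not part of smolagents) and a made-up helper, `prune_old_images`, to show that only the two most recent steps keep their images.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakeStep:
    """Hypothetical stand-in for ActionStep, with only the fields used here."""
    step_number: int
    observations_images: Optional[list] = field(default_factory=list)


def prune_old_images(steps: list, latest_step: int) -> None:
    """Drop images from every step that is at least two steps older than the latest."""
    for step in steps:
        if step.step_number <= latest_step - 2:
            step.observations_images = None


steps = [FakeStep(n, [f"screenshot_{n}.png"]) for n in range(1, 6)]
prune_old_images(steps, latest_step=5)
kept = [s.step_number for s in steps if s.observations_images]
print(kept)
```

Keeping one previous screenshot alongside the newest one gives the model a before/after view while bounding the number of images sent on each call.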
96
+
97
+ ### Run agents one step at a time
98
+
99
+ This can be useful in case you have tool calls that take days: you can just run your agents step by step.
100
+ This will also let you update the memory on each step.
101
+
102
+ ```py
103
+ from smolagents import InferenceClientModel, CodeAgent, ActionStep, TaskStep
104
+
105
+ agent = CodeAgent(tools=[], model=InferenceClientModel(), verbosity_level=1)
106
+ agent.python_executor.send_tools({**agent.tools})
107
+ print(agent.memory.system_prompt)
108
+
109
+ task = "What is the 20th Fibonacci number?"
110
+
111
+ # You could modify the memory as needed here by inputting the memory of another agent.
112
+ # agent.memory.steps = previous_agent.memory.steps
113
+
114
+ # Let's start a new task!
115
+ agent.memory.steps.append(TaskStep(task=task, task_images=[]))
116
+
117
+ final_answer = None
118
+ step_number = 1
119
+ while final_answer is None and step_number <= 10:
+     memory_step = ActionStep(
+         step_number=step_number,
+         observations_images=[],
+     )
+     # Run one step.
+     final_answer = agent.step(memory_step)
+     agent.memory.steps.append(memory_step)
+     step_number += 1
+
+     # Change the memory as you please!
+     # For instance to update the latest step:
+     # agent.memory.steps[-1] = ...
132
+
133
+ print("The final answer is:", final_answer)
134
+ ```
docs/source/en/tutorials/secure_code_execution.md ADDED
@@ -0,0 +1,424 @@
1
+ # Secure code execution
2
+
3
+ [[open-in-colab]]
4
+
5
+ > [!TIP]
6
+ > If you're new to building agents, make sure to first read the [intro to agents](../conceptual_guides/intro_agents) and the [guided tour of smolagents](../guided_tour).
7
+
8
+ ### Code agents
9
+
10
+ [Multiple](https://huggingface.co/papers/2402.01030) [research](https://huggingface.co/papers/2411.01747) [papers](https://huggingface.co/papers/2401.00812) have shown that having the LLM write its actions (the tool calls) in code works much better than the current industry-standard format for tool calling, which comes down to different shades of "writing actions as a JSON of tool names and arguments to use".
11
+
12
+ Why is code better? Well, because we crafted our programming languages specifically to be great at expressing actions performed by a computer. If JSON snippets were a better way, this package would have been written in JSON snippets and the devil would be laughing at us.
13
+
14
+ Code is just a better way to express actions on a computer. It has better:
15
+ - **Composability:** could you nest JSON actions within each other, or define a set of JSON actions to re-use later, the same way you could just define a python function?
16
+ - **Object management:** how do you store the output of an action like `generate_image` in JSON?
17
+ - **Generality:** code is built to express simply anything you can have a computer do.
18
+ - **Representation in LLM training corpus:** why not take advantage of the fact that plenty of quality actions are already included in LLM training corpora?
19
+
20
+ This is illustrated in the figure below, taken from [Executable Code Actions Elicit Better LLM Agents](https://huggingface.co/papers/2402.01030).
21
+
22
+ <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/code_vs_json_actions.png">
23
+
24
+ This is why we put emphasis on proposing code agents, in this case Python agents, which meant putting more effort into building secure Python interpreters.
25
+
26
+ ### Local code execution??
27
+
28
+ By default, the `CodeAgent` runs LLM-generated code in your environment.
29
+
30
+ This is inherently risky: LLM-generated code could be harmful to your environment.
31
+
32
+ Malicious code execution can occur in several ways:
33
+ - **Plain LLM error:** LLMs are still far from perfect and may unintentionally generate harmful commands while attempting to be helpful. While this risk is low, instances have been observed where an LLM attempted to execute potentially dangerous code.
34
+ - **Supply chain attack:** Running an untrusted or compromised LLM could expose a system to harmful code generation. While this risk is extremely low when using well-known models on secure inference infrastructure, it remains a theoretical possibility.
35
+ - **Prompt injection:** An agent browsing the web could land on a malicious website that contains harmful instructions, thus injecting an attack into the agent's memory.
36
+ - **Exploitation of publicly accessible agents:** Agents exposed to the public can be misused by malicious actors to execute harmful code. Attackers may craft adversarial inputs to exploit the agent's execution capabilities, leading to unintended consequences.
37
+ Once malicious code is executed, whether accidentally or intentionally, it can damage the file system, exploit local or cloud-based resources, abuse API services, and even compromise network security.
38
+
39
+ One could argue that on the [spectrum of agency](../conceptual_guides/intro_agents), code agents give much higher agency to the LLM on your system than other less agentic setups: this goes hand-in-hand with higher risk.
40
+
41
+ So you need to be very mindful of security.
42
+
43
+ To improve safety, we propose a range of measures that offer increasing levels of security, at a higher setup cost.
44
+
45
+ We advise you to keep in mind that no solution will be 100% safe.
46
+
47
+ <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/smolagents/code_execution_safety_diagram.png">
48
+
49
+ ### Our local Python executor
50
+
51
+ To add a first layer of security, code execution in `smolagents` is not performed by the vanilla Python interpreter.
52
+ We have re-built a more secure `LocalPythonExecutor` from the ground up.
53
+
54
+ To be precise, this interpreter works by loading the Abstract Syntax Tree (AST) from your code and executing it operation by operation, making sure to always follow certain rules:
55
+ - By default, imports are disallowed unless they have been explicitly added to an authorization list by the user.
56
+ - Furthermore, access to submodules is disabled by default, and each must be explicitly authorized in the import list as well. Alternatively, you can pass for instance `numpy.*` to allow both `numpy` and all its subpackages, like `numpy.random` or `numpy.a.b`.
57
+ - Note that some seemingly innocuous packages like `random` can give access to potentially harmful submodules, as in `random._os`.
58
+ - The total count of elementary operations processed is capped to prevent infinite loops and resource bloating.
59
+ - Any operation that has not been explicitly defined in our custom interpreter will raise an error.
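
To give a feel for the import rule, here is a toy allowlist check built on Python's standard `ast` module. It is only a sketch of the idea under simplified assumptions, not how `LocalPythonExecutor` is implemented; `is_authorized` and `find_forbidden_imports` are hypothetical names.

```python
import ast


def is_authorized(module: str, authorized: list[str]) -> bool:
    """Check a dotted module name against an allowlist that may contain 'pkg.*'."""
    for entry in authorized:
        if entry.endswith(".*"):
            base = entry[:-2]
            if module == base or module.startswith(base + "."):
                return True
        elif module == entry:
            return True
    return False


def find_forbidden_imports(code: str, authorized: list[str]) -> list[str]:
    """Parse code into an AST and list every imported module not on the allowlist."""
    forbidden = []
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        forbidden += [name for name in names if not is_authorized(name, authorized)]
    return forbidden


print(find_forbidden_imports("import os\nimport numpy.random", ["numpy.*"]))
print(find_forbidden_imports("import numpy.random", ["numpy"]))
```

In this sketch, authorizing `numpy` alone does not cover `numpy.random`, while `numpy.*` covers both, which mirrors the submodule rule described in the list. A static check like this only sees import statements, so it would miss attribute access such as `random._os`, which the real interpreter handles at execution time.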
60
+
61
+ You could try these safeguards as follows:
62
+
63
+ ```py
64
+ from smolagents.local_python_executor import LocalPythonExecutor
65
+
66
+ # Set up custom executor, authorize package "numpy"
67
+ custom_executor = LocalPythonExecutor(["numpy"])
68
+
69
+ # Utility for pretty-printing errors
+ def run_capture_exception(command: str):
+     try:
+         custom_executor(command)
+     except Exception as e:
+         print("ERROR:\n", e)
+
+ # Undefined commands simply do not work
+ harmful_command="!echo Bad command"
+ run_capture_exception(harmful_command)
+ # >>> ERROR: invalid syntax (<unknown>, line 1)
+
+
+ # Imports like os will not be performed unless explicitly added to `additional_authorized_imports`
+ harmful_command="import os; exit_code = os.system('echo Bad command')"
84
+ run_capture_exception(harmful_command)
85
+ # >>> ERROR: Code execution failed at line 'import os' due to: InterpreterError: Import of os is not allowed. Authorized imports are: ['statistics', 'numpy', 'itertools', 'time', 'queue', 'collections', 'math', 'random', 're', 'datetime', 'stat', 'unicodedata']
86
+
87
+ # Even in authorized imports, potentially harmful packages will not be imported
88
+ harmful_command="import random; random._os.system('echo Bad command')"
89
+ run_capture_exception(harmful_command)
90
+ # >>> ERROR: Code execution failed at line 'random._os.system('echo Bad command')' due to: InterpreterError: Forbidden access to module: os
91
+
92
+ # Infinite loops are interrupted after N operations
+ harmful_command="""
+ while True:
+     pass
+ """
97
+ run_capture_exception(harmful_command)
98
+ # >>> ERROR: Code execution failed at line 'while True: pass' due to: InterpreterError: Maximum number of 1000000 iterations in While loop exceeded
99
+ ```
100
+
101
+ These safeguards make our interpreter safer.
102
+ We have used it across a variety of use cases, without ever observing any damage to the environment.
103
+
104
+ > [!WARNING]
105
+ > It's important to understand that no local python sandbox can ever be completely secure. While our interpreter provides significant safety improvements over the standard Python interpreter, it is still possible for a determined attacker or a fine-tuned malicious LLM to find vulnerabilities and potentially harm your environment.
106
+ >
107
+ > For example, if you've allowed packages like `Pillow` to process images, the LLM could generate code that creates thousands of large image files to fill your hard drive. Other advanced escape techniques might exploit deeper vulnerabilities in authorized packages.
108
+ >
109
+ > Running LLM-generated code in your local environment always carries some inherent risk. The only way to run LLM-generated code with truly robust security isolation is to use remote execution options like E2B or Docker, as detailed below.
110
+
111
+ The risk of a malicious attack is low when using well-known LLMs from trusted inference providers, but it is not zero.
112
+ For high-security applications or when using less trusted models, you should consider using a remote execution sandbox.
113
+
114
+ ## Sandbox approaches for secure code execution
115
+
116
+ When working with AI agents that execute code, security is paramount. There are two main approaches to sandboxing code execution in smolagents, each with different security properties and capabilities:
117
+
118
+
119
+ ![Sandbox approaches comparison](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/smolagents/remote_execution.png)
120
+
121
+ 1. **Running individual code snippets in a sandbox**: This approach (left side of diagram) only executes the agent-generated Python code snippets in a sandbox while keeping the rest of the agentic system in your local environment. It's simpler to set up using `executor_type="e2b"` or `executor_type="docker"`, but it doesn't support multi-agents and still requires passing state data between your environment and the sandbox.
122
+
123
+ 2. **Running the entire agentic system in a sandbox**: This approach (right side of diagram) runs the entire agentic system, including the agent, model, and tools, within a sandbox environment. This provides better isolation but requires more manual setup and may require passing sensitive credentials (like API keys) to the sandbox environment.
124
+
125
+ This guide describes how to set up and use both types of sandbox approaches for your agent applications.
126
+
127
+ ### E2B setup
128
+
129
+ #### Installation
130
+
131
+ 1. Create an E2B account at [e2b.dev](https://e2b.dev)
132
+ 2. Install the required packages:
133
+ ```bash
134
+ pip install 'smolagents[e2b]'
135
+ ```
136
+
137
+ #### Running your agent in E2B: quick start
138
+
139
+ We provide a simple way to use an E2B Sandbox: simply add `executor_type="e2b"` to the agent initialization, as follows:
140
+
141
+ ```py
142
+ from smolagents import InferenceClientModel, CodeAgent
143
+
144
+ agent = CodeAgent(model=InferenceClientModel(), tools=[], executor_type="e2b")
145
+
146
+ with agent:
+     agent.run("Can you give me the 100th Fibonacci number?")
148
+ ```
149
+
150
+ > [!TIP]
151
+ > Using the agent as a context manager (with the `with` statement) ensures that the E2B sandbox is cleaned up immediately after the agent completes its task.
152
+ > Alternatively, you can manually call the agent's `cleanup()` method.
153
+
154
+ This solution sends the agent state to the server at the start of each `agent.run()`.
155
+ Then the models are called from the local environment, but the generated code will be sent to the sandbox for execution, and only the output will be returned.
156
+
157
+ This is illustrated in the figure below.
158
+
159
+ <p align="center">
160
+ <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/smolagents/sandboxed_execution.png" alt="sandboxed code execution" width=60% max-width=500px>
161
+ </p>
162
+
163
+ However, any call to a [managed agent](../examples/multiagents) would require model calls, and since we do not transfer secrets to the remote sandbox, those model calls would lack credentials.
164
+ Hence this solution does not work (yet) with more complicated multi-agent setups.
165
+
166
+ #### Running your agent in E2B: multi-agents
167
+
168
+ To use multi-agents in an E2B sandbox, you need to run your agents completely from within E2B.
169
+
170
+ Here is how to do it:
171
+
172
+ ```python
173
+ from e2b_code_interpreter import Sandbox
174
+ import os
175
+
176
+ # Create the sandbox
177
+ sandbox = Sandbox()
178
+
179
+ # Install required packages
180
+ sandbox.commands.run("pip install smolagents")
181
+
182
+ def run_code_raise_errors(sandbox, code: str, verbose: bool = False) -> str:
+     execution = sandbox.run_code(
+         code,
+         envs={'HF_TOKEN': os.getenv('HF_TOKEN')}
+     )
+     if execution.error:
+         execution_logs = "\n".join([str(log) for log in execution.logs.stdout])
+         logs = execution_logs
+         logs += execution.error.traceback
+         raise ValueError(logs)
+     return "\n".join([str(log) for log in execution.logs.stdout])
193
+
194
+ # Define your agent application
195
+ agent_code = """
196
+ import os
197
+ from smolagents import CodeAgent, InferenceClientModel
198
+
199
+ # Initialize the agents
200
+ agent = CodeAgent(
+     model=InferenceClientModel(token=os.getenv("HF_TOKEN"), provider="together"),
+     tools=[],
+     name="coder_agent",
+     description="This agent takes care of your difficult algorithmic problems using code."
+ )
+
+ manager_agent = CodeAgent(
+     model=InferenceClientModel(token=os.getenv("HF_TOKEN"), provider="together"),
+     tools=[],
+     managed_agents=[agent],
+ )
212
+
213
+ # Run the agent
214
+ response = manager_agent.run("What's the 20th Fibonacci number?")
215
+ print(response)
216
+ """
217
+
218
+ # Run the agent code in the sandbox
219
+ execution_logs = run_code_raise_errors(sandbox, agent_code)
220
+ print(execution_logs)
221
+ ```
222
+
223
+ ### Docker setup
224
+
225
+ #### Installation
226
+
227
+ 1. [Install Docker on your system](https://docs.docker.com/get-started/get-docker/)
228
+ 2. Install the required packages:
229
+ ```bash
230
+ pip install 'smolagents[docker]'
231
+ ```
232
+
233
+ #### Running your agent in Docker: quick start
234
+
235
+ Similar to the E2B Sandbox above, to quickly get started with Docker, simply add `executor_type="docker"` to the agent initialization, like:
236
+
237
+ ```py
238
+ from smolagents import InferenceClientModel, CodeAgent
239
+
240
+ agent = CodeAgent(model=InferenceClientModel(), tools=[], executor_type="docker")
241
+
242
+ with agent:
+     agent.run("Can you give me the 100th Fibonacci number?")
244
+ ```
245
+
246
+ > [!TIP]
247
+ > Using the agent as a context manager (with the `with` statement) ensures that the Docker container is cleaned up immediately after the agent completes its task.
248
+ > Alternatively, you can manually call the agent's `cleanup()` method.
249
+
250
+ #### Advanced docker usage
251
+
252
+ If you want to run multi-agent systems in Docker, you'll need to set up a custom interpreter in a sandbox.
253
+
254
+ Here is how to set up the Dockerfile:
255
+
256
+ ```dockerfile
257
+ FROM python:3.10-bullseye
258
+
259
+ # Install build dependencies
260
+ RUN apt-get update && \
+     apt-get install -y --no-install-recommends \
+         build-essential \
+         python3-dev && \
+     pip install --no-cache-dir --upgrade pip && \
+     pip install --no-cache-dir smolagents && \
+     apt-get clean && \
+     rm -rf /var/lib/apt/lists/*
268
+
269
+ # Set working directory
270
+ WORKDIR /app
271
+
272
+ # Run with limited privileges
273
+ USER nobody
274
+
275
+ # Default command
276
+ CMD ["python", "-c", "print('Container ready')"]
277
+ ```
278
+
279
+ Create a sandbox manager to run code:
280
+
281
+ ```python
282
+ import docker
283
+ import os
284
+ from typing import Optional
285
+
286
+ class DockerSandbox:
+     def __init__(self):
+         self.client = docker.from_env()
+         self.container = None
+
+     def create_container(self):
+         try:
+             image, build_logs = self.client.images.build(
+                 path=".",
+                 tag="agent-sandbox",
+                 rm=True,
+                 forcerm=True,
+                 buildargs={},
+                 # decode=True
+             )
+         except docker.errors.BuildError as e:
+             print("Build error logs:")
+             for log in e.build_log:
+                 if 'stream' in log:
+                     print(log['stream'].strip())
+             raise
+
+         # Create container with security constraints and proper logging
+         self.container = self.client.containers.run(
+             "agent-sandbox",
+             command="tail -f /dev/null",  # Keep container running
+             detach=True,
+             tty=True,
+             mem_limit="512m",
+             cpu_quota=50000,
+             pids_limit=100,
+             security_opt=["no-new-privileges"],
+             cap_drop=["ALL"],
+             environment={
+                 "HF_TOKEN": os.getenv("HF_TOKEN")
+             },
+         )
+
+     def run_code(self, code: str) -> Optional[str]:
+         if not self.container:
+             self.create_container()
+
+         # Execute code in container
+         exec_result = self.container.exec_run(
+             cmd=["python", "-c", code],
+             user="nobody"
+         )
+
+         # Collect all output
+         return exec_result.output.decode() if exec_result.output else None
+
+
+     def cleanup(self):
+         if self.container:
+             try:
+                 self.container.stop()
+             except docker.errors.NotFound:
+                 # Container already removed, this is expected
+                 pass
+             except Exception as e:
+                 print(f"Error during cleanup: {e}")
+             finally:
+                 self.container = None  # Clear the reference
349
+
350
+ # Example usage:
351
+ sandbox = DockerSandbox()
352
+
353
+ try:
354
+ # Define your agent code
355
+ agent_code = """
356
+ import os
357
+ from smolagents import CodeAgent, InferenceClientModel
358
+
359
+ # Initialize the agent
360
+ agent = CodeAgent(
361
+ model=InferenceClientModel(token=os.getenv("HF_TOKEN"), provider="together"),
362
+ tools=[]
363
+ )
364
+
365
+ # Run the agent
366
+ response = agent.run("What's the 20th Fibonacci number?")
367
+ print(response)
368
+ """
369
+
370
+ # Run the code in the sandbox
371
+ output = sandbox.run_code(agent_code)
372
+ print(output)
373
+
374
+ finally:
375
+ sandbox.cleanup()
376
+ ```
377
+
378
+ ### Best practices for sandboxes
379
+
380
+ These key practices apply to both E2B and Docker sandboxes:
381
+
382
+ - Resource management
+   - Set memory and CPU limits
+   - Implement execution timeouts
+   - Monitor resource usage
+ - Security
+   - Run with minimal privileges
+   - Disable unnecessary network access
+   - Use environment variables for secrets
+ - Environment
+   - Keep dependencies minimal
+   - Use fixed package versions
+   - If you use base images, update them regularly
+ - Cleanup
+   - Always ensure proper cleanup of resources, especially for Docker containers, to avoid having dangling containers eating up resources.
397
+
398
+ ✨ By following these practices and implementing proper cleanup procedures, you can ensure your agent runs safely and efficiently in a sandboxed environment.
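The "Implement execution timeouts" practice can be sketched as follows. This is a minimal illustration, not part of smolagents or the Docker SDK: `build_timed_command` and `timed_out` are hypothetical helpers, and the sketch assumes that the GNU coreutils `timeout` binary is available inside the sandbox image.

```python
from typing import List

# Exit status that GNU `timeout` returns when the time limit is reached
TIMEOUT_EXIT_CODE = 124


def build_timed_command(code: str, timeout_s: int = 30) -> List[str]:
    """Wrap `python -c <code>` in GNU `timeout` (hypothetical helper)."""
    if timeout_s <= 0:
        raise ValueError("timeout_s must be positive")
    # `timeout DURATION COMMAND...` stops COMMAND once DURATION seconds have elapsed
    return ["timeout", str(timeout_s), "python", "-c", code]


def timed_out(exit_code: int) -> bool:
    """Return True if the exit status signals that `timeout` stopped the command."""
    return exit_code == TIMEOUT_EXIT_CODE


cmd = build_timed_command("print('hello')", timeout_s=10)
print(cmd)
```

The resulting list could be passed as `cmd` to `container.exec_run` in place of `["python", "-c", code]`, and the `exit_code` field of the returned result checked with `timed_out`.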
399
+
400
+ ## Comparing security approaches
401
+
402
+ As illustrated in the diagram earlier, the two sandboxing approaches have different security implications:
403
+
404
+ ### Approach 1: Running just the code snippets in a sandbox
405
+ - **Pros**:
+   - Easier to set up with a simple parameter (`executor_type="e2b"` or `executor_type="docker"`)
+   - No need to transfer API keys to the sandbox
+   - Better protection for your local environment
+ - **Cons**:
+   - Doesn't support multi-agents (managed agents)
+   - Still requires transferring state between your environment and the sandbox
+   - Limited to specific code execution
413
+
414
+ ### Approach 2: Running the entire agentic system in a sandbox
415
+ - **Pros**:
+   - Supports multi-agents
+   - Complete isolation of the entire agent system
+   - More flexible for complex agent architectures
+ - **Cons**:
+   - Requires more manual setup
+   - May require transferring sensitive API keys to the sandbox
+   - Potentially higher latency due to more complex operations
423
+
424
+ Choose the approach that best balances your security needs with your application's requirements. For most applications with simpler agent architectures, Approach 1 provides a good balance of security and ease of use. For more complex multi-agent systems where you need full isolation, Approach 2, while more involved to set up, offers better security guarantees.
docs/source/en/tutorials/tools.md ADDED
@@ -0,0 +1,326 @@
1
+ # Tools
2
+
3
+ [[open-in-colab]]
4
+
5
+ Here, we're going to see advanced tool usage.
6
+
7
+ > [!TIP]
8
+ > If you're new to building agents, make sure to first read the [intro to agents](../conceptual_guides/intro_agents) and the [guided tour of smolagents](../guided_tour).
9
+
10
+
11
+ ### What is a tool, and how to build one?
12
+
13
+ A tool is mostly a function that an LLM can use in an agentic system.
14
+
15
+ But to use it, the LLM will need to be given an API: name, tool description, input types and descriptions, output type.
16
+
17
+ So it cannot be only a function. It should be a class.
18
+
19
+ So at core, the tool is a class that wraps a function with metadata that helps the LLM understand how to use it.
20
+
21
+ Here's how it looks:
22
+
23
+ ```python
24
+ from smolagents import Tool
+
+ class HFModelDownloadsTool(Tool):
+     name = "model_download_counter"
+     description = """
+     This is a tool that returns the most downloaded model of a given task on the Hugging Face Hub.
+     It returns the name of the checkpoint."""
+     inputs = {
+         "task": {
+             "type": "string",
+             "description": "the task category (such as text-classification, depth-estimation, etc)",
+         }
+     }
+     output_type = "string"
+
+     def forward(self, task: str):
+         from huggingface_hub import list_models
+
+         model = next(iter(list_models(filter=task, sort="downloads", direction=-1)))
+         return model.id
+
+ model_downloads_tool = HFModelDownloadsTool()
46
+ ```
47
+
48
+ The custom tool subclasses [`Tool`] to inherit useful methods. The child class also defines:
49
+ - An attribute `name`, which corresponds to the name of the tool itself. The name usually describes what the tool does. Since the code returns the model with the most downloads for a task, let's name it `model_download_counter`.
50
+ - An attribute `description`, which is used to populate the agent's system prompt.
51
+ - An `inputs` attribute, which is a dictionary that maps each input name to a dictionary with keys `"type"` and `"description"`. It contains information that helps the agent make educated choices about the input.
52
+ - An `output_type` attribute, which specifies the output type. The types for both `inputs` and `output_type` should be [Pydantic formats](https://docs.pydantic.dev/latest/concepts/json_schema/#generating-json-schema); they must be one of [`~AUTHORIZED_TYPES`].
53
+ - A `forward` method which contains the inference code to be executed.
54
+
55
+ And that's all it needs to be used in an agent!
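To make the role of this metadata more concrete, here is a minimal sketch of how a tool's attributes could be rendered into text for a system prompt. The `render_tool_description` helper and its output layout are assumptions made for illustration only; smolagents relies on its own prompt templates.

```python
from typing import Dict


def render_tool_description(
    name: str,
    description: str,
    inputs: Dict[str, Dict[str, str]],
    output_type: str,
) -> str:
    """Format tool metadata as a short text block (hypothetical layout)."""
    # Collapse newlines and repeated spaces in the description
    lines = [f"- {name}: {' '.join(description.split())}"]
    lines.append("    Takes inputs:")
    for arg_name, spec in inputs.items():
        lines.append(f"      {arg_name} ({spec['type']}): {spec['description']}")
    lines.append(f"    Returns an output of type: {output_type}")
    return "\n".join(lines)


prompt_snippet = render_tool_description(
    name="model_download_counter",
    description="Returns the most downloaded model of a given task on the Hugging Face Hub.",
    inputs={"task": {"type": "string", "description": "the task category"}},
    output_type="string",
)
print(prompt_snippet)
```

The point of the sketch is that the LLM never sees the body of `forward`: it only sees text derived from `name`, `description`, `inputs` and `output_type`, so these should be written with care.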
56
+
57
+ There's another way to build a tool. In the [guided_tour](../guided_tour), we implemented a tool using the `@tool` decorator. The [`tool`] decorator is the recommended way to define simple tools, but sometimes you need more than this: using several methods in a class for more clarity, or using additional class attributes.
58
+
59
+ In this case, you can build your tool by subclassing [`Tool`] as described above.
60
+
61
+ ### Share your tool to the Hub
62
+
63
+ You can share your custom tool to the Hub as a Space repository by calling [`~Tool.push_to_hub`] on the tool. Make sure you've created a repository for it on the Hub and are using a token with write access.
64
+
65
+ ```python
66
+ model_downloads_tool.push_to_hub("{your_username}/hf-model-downloads", token="<YOUR_HUGGINGFACEHUB_API_TOKEN>")
67
+ ```
68
+
69
+ For the push to Hub to work, your tool will need to respect some rules:
70
+ - All methods are self-contained, i.e. they only use variables that come from their arguments.
71
+ - As per the above point, **all imports should be defined directly within the tool's functions**, else you will get an error when trying to call [`~Tool.save`] or [`~Tool.push_to_hub`] with your custom tool.
72
+ - If you override the `__init__` method, you can give it no other argument than `self`. This is because arguments set during a specific tool instance's initialization are hard to track, which prevents sharing them properly to the Hub. And anyway, the idea of making a specific class is that you can already set class attributes for anything you need to hard-code (just set `your_variable=(...)` directly under the `class YourTool(Tool):` line). And of course you can still create an instance attribute anywhere in your code by assigning to `self.your_variable`.
73
+
74
+
75
+ Once your tool is pushed to Hub, you can visualize it. [Here](https://huggingface.co/spaces/m-ric/hf-model-downloads) is the `model_downloads_tool` that I've pushed. It has a nice gradio interface.
76
+
77
+ When diving into the tool files, you can find that all the tool's logic is under [tool.py](https://huggingface.co/spaces/m-ric/hf-model-downloads/blob/main/tool.py). That is where you can inspect a tool shared by someone else.
78
+
79
+ Then you can load the tool with [`load_tool`] or create it with [`~Tool.from_hub`] and pass it to the `tools` parameter in your agent.
80
+ Since running tools means running custom code, you need to make sure you trust the repository, so you are required to pass `trust_remote_code=True` to load a tool from the Hub.
81
+
82
+ ```python
83
+ from smolagents import load_tool, CodeAgent
84
+
85
+ model_download_tool = load_tool(
86
+ "{your_username}/hf-model-downloads",
87
+ trust_remote_code=True
88
+ )
89
+ ```
90
+
91
+ ### Use tools from an MCP server
92
+
93
+ Our `MCPClient` allows you to load tools from an MCP server, and gives you full control over the connection and tool management:
94
+
95
+ For stdio-based MCP servers:
96
+ ```python
97
+ from smolagents import MCPClient, CodeAgent
+ from mcp import StdioServerParameters
+ import os
+
+ server_parameters = StdioServerParameters(
+     command="uvx",  # Using uvx ensures dependencies are available
+     args=["--quiet", "pubmedmcp@0.1.3"],
+     env={"UV_PYTHON": "3.12", **os.environ},
+ )
+
+ with MCPClient(server_parameters) as tools:
+     agent = CodeAgent(tools=tools, model=model, add_base_tools=True)
+     agent.run("Please find the latest research on COVID-19 treatment.")
110
+ ```
111
+
112
+ For Streamable HTTP-based MCP servers:
113
+ ```python
114
+ from smolagents import MCPClient, CodeAgent
+
+ with MCPClient({"url": "http://127.0.0.1:8000/mcp", "transport": "streamable-http"}) as tools:
+     agent = CodeAgent(tools=tools, model=model, add_base_tools=True)
+     agent.run("Please find a remedy for hangover.")
119
+ ```
120
+
121
+ You can also manually manage the connection lifecycle with the try...finally pattern:
122
+
123
+ ```python
124
+ from smolagents import MCPClient, CodeAgent
+ from mcp import StdioServerParameters
+ import os
+
+ # Initialize server parameters
+ server_parameters = StdioServerParameters(
+     command="uvx",
+     args=["--quiet", "pubmedmcp@0.1.3"],
+     env={"UV_PYTHON": "3.12", **os.environ},
+ )
+
+ # Manually manage the connection.
+ # Create the client before `try` so that `mcp_client` is always defined in `finally`.
+ mcp_client = MCPClient(server_parameters)
+ try:
+     tools = mcp_client.get_tools()
+
+     # Use the tools with your agent
+     agent = CodeAgent(tools=tools, model=model, add_base_tools=True)
+     result = agent.run("What are the recent therapeutic approaches for Alzheimer's disease?")
+
+     # Process the result as needed
+     print(f"Agent response: {result}")
+ finally:
+     # Always ensure the connection is properly closed
+     mcp_client.disconnect()
149
+ ```
150
+
151
+ You can also connect to multiple MCP servers at once by passing a list of server parameters:
152
+ ```python
153
+ from smolagents import MCPClient, CodeAgent
+ from mcp import StdioServerParameters
+ import os
+
+ server_params1 = StdioServerParameters(
+     command="uvx",
+     args=["--quiet", "pubmedmcp@0.1.3"],
+     env={"UV_PYTHON": "3.12", **os.environ},
+ )
+
+ server_params2 = {"url": "http://127.0.0.1:8000/sse"}
+
+ with MCPClient([server_params1, server_params2]) as tools:
+     agent = CodeAgent(tools=tools, model=model, add_base_tools=True)
+     agent.run("Please analyze the latest research and suggest remedies for headaches.")
168
+ ```
169
+
170
+ > [!WARNING]
171
+ > **Security Warning:** Using MCP servers comes with security risks:
172
+ > - **Trust is essential:** Always verify the source and integrity of any MCP server before connecting to it, especially for production environments. Malicious servers can execute harmful code on your machine.
173
+ > - **Stdio-based MCP servers** will always execute code on your machine (that's their intended functionality).
174
+ > - **Streamable HTTP-based MCP servers:** although remote MCP servers cannot execute code on your machine, you should still proceed with caution.
175
+
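One way to act on the first point of the warning is to check a server URL against an explicit allowlist before opening a connection. The sketch below is only an illustration: `TRUSTED_MCP_HOSTS` and `is_trusted_mcp_url` are hypothetical names that belong neither to smolagents nor to the MCP SDK, and a host check does not replace reviewing the server itself.

```python
from urllib.parse import urlparse

# Hypothetical allowlist: hosts that you have reviewed and decided to trust
TRUSTED_MCP_HOSTS = {"127.0.0.1", "localhost"}


def is_trusted_mcp_url(url: str) -> bool:
    """Return True only for http(s) URLs whose host is on the allowlist."""
    parsed = urlparse(url)
    return parsed.scheme in {"http", "https"} and parsed.hostname in TRUSTED_MCP_HOSTS


local_ok = is_trusted_mcp_url("http://127.0.0.1:8000/mcp")
remote_ok = is_trusted_mcp_url("https://example.com/mcp")
print(local_ok, remote_ok)
```

A check like this could run before the server parameters are handed to `MCPClient`, so that an unexpected URL fails early instead of being connected to.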
176
+ ### Import a Space as a tool
177
+
178
+ You can directly import a Gradio Space from the Hub as a tool using the [`Tool.from_space`] method!
179
+
180
+ You only need to provide the id of the Space on the Hub, its name, and a description that will help your agent understand what the tool does. Under the hood, this will use the [`gradio-client`](https://pypi.org/project/gradio-client/) library to call the Space.
181
+
182
+ For instance, let's import the [FLUX.1-schnell](https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell) Space from the Hub and use it to generate an image.
183
+
184
+ ```python
185
+ image_generation_tool = Tool.from_space(
186
+ "black-forest-labs/FLUX.1-schnell",
187
+ name="image_generator",
188
+ description="Generate an image from a prompt"
189
+ )
190
+
191
+ image_generation_tool("A sunny beach")
192
+ ```
193
+ And voilà, here's your image! 🏖️
194
+
195
+ <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/sunny_beach.webp">
196
+
197
+ Then you can use this tool just like any other tool. For example, let's improve the prompt `a rabbit wearing a space suit` and generate an image of it. This example also shows how you can pass additional arguments to the agent.
198
+
199
+ ```python
200
+ from smolagents import CodeAgent, InferenceClientModel
201
+
202
+ model = InferenceClientModel(model_id="Qwen/Qwen2.5-Coder-32B-Instruct")
203
+ agent = CodeAgent(tools=[image_generation_tool], model=model)
204
+
205
+ agent.run(
206
+ "Improve this prompt, then generate an image of it.", additional_args={'user_prompt': 'A rabbit wearing a space suit'}
207
+ )
208
+ ```
209
+
210
+ ```text
211
+ === Agent thoughts:
212
+ improved_prompt could be "A bright blue space suit wearing rabbit, on the surface of the moon, under a bright orange sunset, with the Earth visible in the background"
213
+
214
+ Now that I have improved the prompt, I can use the image generator tool to generate an image based on this prompt.
215
+ >>> Agent is executing the code below:
216
+ image = image_generator(prompt="A bright blue space suit wearing rabbit, on the surface of the moon, under a bright orange sunset, with the Earth visible in the background")
217
+ final_answer(image)
218
+ ```
219
+
220
+ <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rabbit_spacesuit_flux.webp">
221
+
222
+ How cool is this? 🤩
223
+
224
+ ### Use LangChain tools
225
+
226
+ We love LangChain and think it has a very compelling suite of tools.
227
+ To import a tool from LangChain, use the `from_langchain()` method.
228
+
229
+ Here is how you can use it to recreate the intro's search result using a LangChain web search tool.
230
+ This tool will need `pip install langchain google-search-results -q` to work properly.
231
+ ```python
232
+ from langchain.agents import load_tools
233
+
234
+ search_tool = Tool.from_langchain(load_tools(["serpapi"])[0])
235
+
236
+ agent = CodeAgent(tools=[search_tool], model=model)
237
+
238
+ agent.run("How many more blocks (also denoted as layers) are in BERT base encoder compared to the encoder from the architecture proposed in Attention is All You Need?")
239
+ ```
240
+
241
+ ### Manage your agent's toolbox
242
+
243
+ You can manage an agent's toolbox by adding or replacing a tool in the `agent.tools` attribute, since it is a standard dictionary.
244
+
245
+ Let's add the `model_download_tool` to an existing agent initialized with only the default toolbox.
246
+
247
+ ```python
248
+ from smolagents import CodeAgent, InferenceClientModel
249
+
250
+ model = InferenceClientModel(model_id="Qwen/Qwen2.5-Coder-32B-Instruct")
251
+
252
+ agent = CodeAgent(tools=[], model=model, add_base_tools=True)
253
+ agent.tools[model_download_tool.name] = model_download_tool
254
+ ```
255
+ Now we can leverage the new tool:
256
+
257
+ ```python
258
+ agent.run(
259
+ "Can you give me the name of the model that has the most downloads in the 'text-to-video' task on the Hugging Face Hub but reverse the letters?"
260
+ )
261
+ ```
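Because the toolbox is a plain dictionary keyed by tool name, the usual dictionary operations apply. The sketch below uses a hypothetical `EchoTool` stand-in and a plain `dict` instead of a real agent, so that it runs without any model; with a real agent, the same operations would be applied to `agent.tools`.

```python
class EchoTool:
    """Hypothetical stand-in for a tool: only the `name` attribute matters here."""

    def __init__(self, name: str, prefix: str = ""):
        self.name = name
        self.prefix = prefix

    def __call__(self, text: str) -> str:
        return f"{self.prefix}{text}"


# Plain dict standing in for `agent.tools`
toolbox = {}

# Add a tool under its own name
echo = EchoTool("echo")
toolbox[echo.name] = echo

# Replace it with another implementation registered under the same name
toolbox["echo"] = EchoTool("echo", prefix=">> ")
replaced_output = toolbox["echo"]("hi")

# Remove a tool that is no longer needed (no error if it is absent)
removed = toolbox.pop("echo", None)

print(replaced_output, list(toolbox))
```

Keying the toolbox by `tool.name` means that assigning a second tool with the same name silently replaces the first one, which is convenient for swapping implementations but worth keeping in mind when names collide.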
262
+
263
+
264
+ > [!TIP]
265
+ > Be careful not to add too many tools to an agent: this can overwhelm weaker LLM engines.
266
+
267
+
268
+ ### Use a collection of tools
269
+
270
+ You can leverage tool collections by using [`ToolCollection`]. It supports loading either a collection from the Hub or the tools of an MCP server.
271
+
272
+
273
+ #### Tool Collection from any MCP server
274
+
275
+ Leverage tools from the hundreds of MCP servers available on [glama.ai](https://glama.ai/mcp/servers) or [smithery.ai](https://smithery.ai/).
276
+
277
+ The tools of an MCP server can be loaded with [`ToolCollection.from_mcp`].
278
+
279
+ > [!WARNING]
280
+ > **Security Warning:** The same security warnings mentioned for `MCPClient` also apply when loading tools with `ToolCollection.from_mcp`.
281
+
282
+
283
+ For stdio-based MCP servers, pass the server parameters as an instance of `mcp.StdioServerParameters`:
284
+ ```py
285
+ from smolagents import ToolCollection, CodeAgent
+ from mcp import StdioServerParameters
+ import os
+
+ server_parameters = StdioServerParameters(
+     command="uvx",
+     args=["--quiet", "pubmedmcp@0.1.3"],
+     env={"UV_PYTHON": "3.12", **os.environ},
+ )
+
+ with ToolCollection.from_mcp(server_parameters, trust_remote_code=True) as tool_collection:
+     agent = CodeAgent(tools=[*tool_collection.tools], model=model, add_base_tools=True)
+     agent.run("Please find a remedy for hangover.")
297
+ ```
298
+
299
+ For Streamable HTTP-based MCP servers, simply pass a dict with the parameters for `mcp.client.streamable_http.streamablehttp_client` and add the key `transport` with the value `"streamable-http"`:
300
+ ```py
301
+ from smolagents import ToolCollection, CodeAgent
+
+ with ToolCollection.from_mcp({"url": "http://127.0.0.1:8000/mcp", "transport": "streamable-http"}, trust_remote_code=True) as tool_collection:
+     agent = CodeAgent(tools=[*tool_collection.tools], model=model, add_base_tools=True)
+     agent.run("Please find a remedy for hangover.")
306
+ ```
307
+
308
+ #### Tool Collection from a collection in the Hub
309
+
310
+ To load a collection from the Hub, pass the slug of the collection you want to use to [`ToolCollection.from_hub`].
+ Then pass the collection's tools as a list to initialize your agent, and start using them!
312
+
313
+ ```py
314
+ from smolagents import ToolCollection, CodeAgent
315
+
316
+ image_tool_collection = ToolCollection.from_hub(
317
+ collection_slug="huggingface-tools/diffusion-tools-6630bb19a942c2306a2cdb6f",
318
+ token="<YOUR_HUGGINGFACEHUB_API_TOKEN>"
319
+ )
320
+ agent = CodeAgent(tools=[*image_tool_collection.tools], model=model, add_base_tools=True)
321
+
322
+ agent.run("Please draw me a picture of rivers and lakes.")
323
+ ```
324
+
325
+ To speed up the start, tools are loaded only if called by the agent.
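The lazy-loading idea can be illustrated with a small wrapper that defers an expensive setup step until the first call. This is a generic sketch of the pattern, not smolagents' actual implementation; `LazyTool`, `load_fn` and `load_upper` are hypothetical names.

```python
from typing import Any, Callable, Optional


class LazyTool:
    """Defer building the underlying callable until the tool is first used."""

    def __init__(self, name: str, load_fn: Callable[[], Callable[..., Any]]):
        self.name = name
        self._load_fn = load_fn
        self._impl: Optional[Callable[..., Any]] = None

    @property
    def is_loaded(self) -> bool:
        return self._impl is not None

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        if self._impl is None:
            # First call: pay the loading cost now, then reuse the result
            self._impl = self._load_fn()
        return self._impl(*args, **kwargs)


def load_upper() -> Callable[[str], str]:
    # Stands in for an expensive step such as downloading tool code
    return str.upper


tool = LazyTool("shout", load_upper)
loaded_before = tool.is_loaded
result = tool("hello")
print(loaded_before, result, tool.is_loaded)
```

With this pattern, creating many tools stays cheap, and only the tools that the agent actually calls incur their loading cost.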
326
+