Duibonduil committed · Commit a3d25b1 · verified · Parent(s): bc17268

Upload 2 files
docs/source/en/reference/models.md ADDED

# Models

<Tip warning={true}>

Smolagents is an experimental API which is subject to change at any time. Results returned by the agents
can vary as the APIs or underlying models are prone to change.

</Tip>

To learn more about agents and tools, make sure to read the [introductory guide](../index). This page
contains the API docs for the underlying classes.

## Models

### Your custom Model

You're free to create and use your own models to power your agent.

You can subclass the base `Model` class to create a model for your agent.
The main requirement is to override the `generate` method, which must satisfy two criteria:
1. It follows the [messages format](./chat_templating) (`List[Dict[str, str]]`) for its input `messages`, and it returns an object with a `.content` attribute.
2. It stops generating outputs at the sequences passed in the argument `stop_sequences`.

For example, you can make a `CustomModel` class that inherits from the base `Model` class. Its `generate` method takes a list of [messages](./chat_templating) and returns an object with a `.content` attribute containing the text; it also accepts a `stop_sequences` argument indicating when to stop generating.

```python
from huggingface_hub import login, InferenceClient

from smolagents import Model

login("<YOUR_HUGGINGFACEHUB_API_TOKEN>")

model_id = "meta-llama/Llama-3.3-70B-Instruct"

client = InferenceClient(model=model_id)

class CustomModel(Model):
    def generate(self, messages, stop_sequences=None):
        response = client.chat_completion(messages, stop=stop_sequences, max_tokens=1024)
        # The returned message object exposes the generated text via its `.content` attribute
        return response.choices[0].message

custom_model = CustomModel()
```

Additionally, `generate` can take a `grammar` argument. If you specify a `grammar` upon agent initialization, it is passed along on each call to the model, allowing [constrained generation](https://huggingface.co/docs/text-generation-inference/conceptual/guidance) to force properly-formatted agent outputs.
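
As a minimal sketch of how that might look (the forwarding pattern and the `response_format` field below are assumptions about a TGI-style backend, not a fixed API), a grammar-aware `generate` only needs to accept the extra argument and pass it through to whatever constrained-decoding mechanism your client exposes:

```python
from typing import Optional

from smolagents import Model

class GrammarAwareModel(Model):
    def generate(self, messages, stop_sequences=None, grammar: Optional[dict] = None):
        # Hypothetical sketch: reuse the `client` from the example above and
        # forward the grammar through the field your backend expects for
        # constrained decoding (`response_format` here is an assumption).
        response = client.chat_completion(
            messages,
            stop=stop_sequences,
            response_format=grammar,
            max_tokens=1024,
        )
        return response.choices[0].message
```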

### TransformersModel

For convenience, we have added a `TransformersModel` that implements the points above by building a local `transformers` pipeline for the `model_id` given at initialization.

```python
from smolagents import TransformersModel

model = TransformersModel(model_id="HuggingFaceTB/SmolLM-135M-Instruct")

print(model([{"role": "user", "content": [{"type": "text", "text": "Ok!"}]}], stop_sequences=["great"]))
```
```text
>>> What a
```

> [!TIP]
> You must have `transformers` and `torch` installed on your machine. Please run `pip install smolagents[transformers]` if it's not the case.

[[autodoc]] TransformersModel

### InferenceClientModel

The `InferenceClientModel` wraps huggingface_hub's [InferenceClient](https://huggingface.co/docs/huggingface_hub/main/en/guides/inference) to execute the LLM. It supports all [Inference Providers](https://huggingface.co/docs/inference-providers/index) available on the Hub: Cerebras, Cohere, Fal, Fireworks, HF-Inference, Hyperbolic, Nebius, Novita, Replicate, SambaNova, Together, and more.

```python
from smolagents import InferenceClientModel

messages = [
    {"role": "user", "content": [{"type": "text", "text": "Hello, how are you?"}]}
]

model = InferenceClientModel(provider="novita")
print(model(messages))
```
```text
>>> Of course! If you change your mind, feel free to reach out. Take care!
```

[[autodoc]] InferenceClientModel

### LiteLLMModel

The `LiteLLMModel` leverages [LiteLLM](https://www.litellm.ai/) to support 100+ LLMs from various providers.
You can pass kwargs upon model initialization that will then be used on every call to the model; for instance, below we pass `temperature` and `max_tokens`.

```python
from smolagents import LiteLLMModel

messages = [
    {"role": "user", "content": [{"type": "text", "text": "Hello, how are you?"}]}
]

model = LiteLLMModel(model_id="anthropic/claude-3-5-sonnet-latest", temperature=0.2, max_tokens=10)
print(model(messages))
```

[[autodoc]] LiteLLMModel

### LiteLLMRouterModel

The `LiteLLMRouterModel` is a wrapper around the [LiteLLM Router](https://docs.litellm.ai/docs/routing) that leverages
advanced routing strategies: load-balancing across multiple deployments, prioritizing critical requests via queueing,
and implementing basic reliability measures such as cooldowns, fallbacks, and exponential backoff retries.

```python
import os

from smolagents import LiteLLMRouterModel

messages = [
    {"role": "user", "content": [{"type": "text", "text": "Hello, how are you?"}]}
]

model = LiteLLMRouterModel(
    model_id="llama-3.3-70b",
    model_list=[
        {
            "model_name": "llama-3.3-70b",
            "litellm_params": {"model": "groq/llama-3.3-70b", "api_key": os.getenv("GROQ_API_KEY")},
        },
        {
            "model_name": "llama-3.3-70b",
            "litellm_params": {"model": "cerebras/llama-3.3-70b", "api_key": os.getenv("CEREBRAS_API_KEY")},
        },
    ],
    client_kwargs={
        "routing_strategy": "simple-shuffle",
    },
)
print(model(messages))
```

[[autodoc]] LiteLLMRouterModel

### OpenAIServerModel

This class lets you call any model served behind an OpenAI-compatible API.
Here's how you can set it up (you can customize the `api_base` url to point to another server):
```py
import os
from smolagents import OpenAIServerModel

model = OpenAIServerModel(
    model_id="gpt-4o",
    api_base="https://api.openai.com/v1",
    api_key=os.environ["OPENAI_API_KEY"],
)
```

[[autodoc]] OpenAIServerModel

### AzureOpenAIServerModel

`AzureOpenAIServerModel` allows you to connect to any Azure OpenAI deployment.

Below you can find an example of how to set it up; note that you can omit the `azure_endpoint`, `api_key`, and `api_version` arguments, provided you've set the corresponding environment variables -- `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_API_KEY`, and `OPENAI_API_VERSION`.

Pay attention to the lack of an `AZURE_` prefix for `OPENAI_API_VERSION`; this is due to the way the underlying [openai](https://github.com/openai/openai-python) package is designed.

```py
import os

from smolagents import AzureOpenAIServerModel

model = AzureOpenAIServerModel(
    model_id=os.environ.get("AZURE_OPENAI_MODEL"),
    azure_endpoint=os.environ.get("AZURE_OPENAI_ENDPOINT"),
    api_key=os.environ.get("AZURE_OPENAI_API_KEY"),
    api_version=os.environ.get("OPENAI_API_VERSION"),
)
```

[[autodoc]] AzureOpenAIServerModel

### AmazonBedrockServerModel

`AmazonBedrockServerModel` helps you connect to Amazon Bedrock and run your agent with any of its available models.

Below is an example setup. This class also offers additional options for customization.

```py
import os

from smolagents import AmazonBedrockServerModel

model = AmazonBedrockServerModel(
    model_id=os.environ.get("AMAZON_BEDROCK_MODEL_ID"),
)
```

[[autodoc]] AmazonBedrockServerModel

### MLXModel

Model to use [MLX](https://github.com/ml-explore/mlx) for local LLM inference on Apple silicon, via the `mlx-lm` package.

```python
from smolagents import MLXModel

model = MLXModel(model_id="HuggingFaceTB/SmolLM-135M-Instruct")

print(model([{"role": "user", "content": "Ok!"}], stop_sequences=["great"]))
```
```text
>>> What a
```

> [!TIP]
> You must have `mlx-lm` installed on your machine. Please run `pip install smolagents[mlx-lm]` if it's not the case.

[[autodoc]] MLXModel

### VLLMModel

Model to use [vLLM](https://docs.vllm.ai/) for fast LLM inference and serving.

```python
from smolagents import VLLMModel

model = VLLMModel(model_id="HuggingFaceTB/SmolLM-135M-Instruct")

print(model([{"role": "user", "content": "Ok!"}], stop_sequences=["great"]))
```

> [!TIP]
> You must have `vllm` installed on your machine. Please run `pip install smolagents[vllm]` if it's not the case.

[[autodoc]] VLLMModel
docs/source/en/reference/tools.md ADDED

# Tools

<Tip warning={true}>

Smolagents is an experimental API which is subject to change at any time. Results returned by the agents
can vary as the APIs or underlying models are prone to change.

</Tip>

To learn more about agents and tools, make sure to read the [introductory guide](../index). This page
contains the API docs for the underlying classes.

## Tools

### load_tool

[[autodoc]] load_tool

### tool

[[autodoc]] tool

### Tool

[[autodoc]] Tool

### launch_gradio_demo

[[autodoc]] launch_gradio_demo

## Default tools

### PythonInterpreterTool

[[autodoc]] PythonInterpreterTool

### FinalAnswerTool

[[autodoc]] FinalAnswerTool

### UserInputTool

[[autodoc]] UserInputTool

### WebSearchTool

[[autodoc]] WebSearchTool

### DuckDuckGoSearchTool

[[autodoc]] DuckDuckGoSearchTool

### GoogleSearchTool

[[autodoc]] GoogleSearchTool

### VisitWebpageTool

[[autodoc]] VisitWebpageTool

### SpeechToTextTool

[[autodoc]] SpeechToTextTool

## ToolCollection

[[autodoc]] ToolCollection

## MCP Client

[[autodoc]] smolagents.mcp_client.MCPClient

## Agent Types

Agents can handle any type of object passed between tools; tools, being fully multimodal, can accept and return
text, image, audio, and video, among other types. In order to increase compatibility between tools, as well as to
correctly render these returns in ipython (jupyter, colab, ipython notebooks, ...), we implement wrapper classes
around these types.

The wrapped objects should continue behaving as they did initially; a text object should still behave as a string, and an image
object should still behave as a `PIL.Image`.

These types have three specific purposes:

- Calling `to_raw` on the type should return the underlying object
- Calling `to_string` on the type should return the object as a string: that can be the string itself in the case of an `AgentText`,
but will be the path of the serialized version of the object in other instances
- Displaying it in an ipython kernel should display the object correctly
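
For instance, here is a minimal sketch of that contract using `AgentText` (assuming only the behavior described above; it is the simplest wrapper, since its raw and string forms coincide):

```python
from smolagents.agent_types import AgentText

text = AgentText("Hello, agent!")

# Behaves like the underlying string...
assert text == "Hello, agent!"

# ...while still exposing the common wrapper interface.
print(text.to_raw())     # the underlying object (here, the string itself)
print(text.to_string())  # the string form; for AgentImage/AgentAudio this would be a file path
```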

### AgentText

[[autodoc]] smolagents.agent_types.AgentText

### AgentImage

[[autodoc]] smolagents.agent_types.AgentImage

### AgentAudio

[[autodoc]] smolagents.agent_types.AgentAudio