Duibonduil committed on
Commit cc54e11 · verified · 1 Parent(s): 4b1c600

Upload 6 files

aworld/agents/README-multi-agent.md ADDED
@@ -0,0 +1,255 @@
# Multi-agent

```python
from aworld.agents.llm_agent import Agent
from aworld.config.conf import AgentConfig
from aworld.core.agent.swarm import Swarm, GraphBuildType

agent_conf = AgentConfig(...)
```

## Builder
The builder is the way a topology is constructed, and it is tied to runtime execution.
Topology is the definition of structure: for the same topology structure, different builders
will produce different execution processes and different results.

```python
"""
Topology:
┌─────A─────┐
B     |     C
      D
"""
A = Agent(name="A", conf=agent_conf)
B = Agent(name="B", conf=agent_conf)
C = Agent(name="C", conf=agent_conf)
D = Agent(name="D", conf=agent_conf)
```

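For example, the same topology over these agents behaves differently under different builders (a sketch; `GraphBuildType.HANDOFF` is introduced below):
```python
# Same structure, two builders (sketch):
Swarm((A, B), (A, C), (A, D))                                     # workflow: every node executes
Swarm((A, B), (A, C), (A, D), build_type=GraphBuildType.HANDOFF)  # handoff: the LLM decides the flow
```
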
### Workflow
Workflow is a special topological structure that can be executed deterministically: all nodes in the swarm
will be executed, and the start and end nodes are **unique** and **indispensable**.

Define:
```python
# default is workflow
Swarm((A, B), (A, C), (A, D))
# or
Swarm(A, [B, C, D])
```
The example means A is the start node, and the merge of B, C, and D is the end node.

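A minimal run sketch (assuming the `Runners` API shown in the agents README of this same commit):
```python
from aworld.runner import Runners

swarm = Swarm(A, [B, C, D])
res = Runners.sync_run(input="your task here", swarm=swarm)
```
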
### Handoff
Handoff uses the LLM alone to drive the flow through the entire topology: one agent's decision hands off
control to another. Agents act as tools, based on the defined pairs of agents.

Define:
```python
Swarm((A, B), (A, C), (A, D), build_type=GraphBuildType.HANDOFF)
# or
HandoffSwarm((A, B), (A, C), (A, D))
```
**NOTE**: Handoff supports only the paired-tuple form of agents.

### Team
Team requires a leader agent; the other agents follow its commands.
Team is a special case of handoff: the leader-follower mode.

Define:
```python
Swarm((A, B), (A, C), (A, D), build_type=GraphBuildType.TEAM)
# or
TeamSwarm(A, B, C, D)
# or
Swarm(B, C, D, root_agent=A, build_type=GraphBuildType.TEAM)
```
The `root_agent` or first agent A is the leader; the other agents interact with the leader A.

## Topology
The multi-agent topology structure is represented by Swarm. A Swarm's topology is built from
individual agents; the topology type and build type together express the different structural types.

### Star
Each agent communicates with a single supervisor agent. This star topology is
a special case of the tree topology, and is also referred to as a team topology in **Aworld**.

A plan agent with other executing agents is a typical example.
```python
"""
Star topology:
┌───── plan ───┐
exec1        exec2
"""
plan = Agent(name="plan", conf=agent_conf)
exec1 = Agent(name="exec1", conf=agent_conf)
exec2 = Agent(name="exec2", conf=agent_conf)
```

There are several ways to construct this topology.
```python
swarm = Swarm((plan, exec1), (plan, exec2))
```
or use the handoffs mechanism:
```python
plan = Agent(name="plan", conf=agent_conf, agent_names=['exec1', 'exec2'])
swarm = Swarm(plan, register_agents=[exec1, exec2])
```
or use the team mechanism:
```python
# The plan agent comes first.
swarm = TeamSwarm(plan, exec1, exec2,
                  build_type=GraphBuildType.TEAM)
```

Note:
- Whether to execute exec1 or exec2 is decided by the LLM.
- If you want to execute all defined nodes with certainty, you need to use the `workflow` pattern.
  The following will execute all of the defined nodes:
```python
swarm = Swarm(plan, [exec1, exec2])
```
- If exec1 must be executed while executing exec2 is left to the LLM, you can define it as:
```python
plan = Agent(name="plan", conf=agent_conf, agent_names=['exec1', 'exec2'])
swarm = Swarm((plan, exec1), register_agents=[exec2])
```
In other words, when **GraphBuildType.WORKFLOW** is set (the default), all nodes within the swarm will be executed.

### Tree
This is a generalization of the star topology and allows for more complex control flows.

#### Hierarchical
```python
"""
Hierarchical topology:
          ┌─────────── root ───────────┐
┌───── parent1 ───┐          ┌─────── parent2 ───────┐
leaf1_1       leaf1_2        leaf2_1            leaf2_2
"""

root = Agent(name="root", conf=agent_conf)
parent1 = Agent(name="parent1", conf=agent_conf)
parent2 = Agent(name="parent2", conf=agent_conf)
leaf1_1 = Agent(name="leaf1_1", conf=agent_conf)
leaf1_2 = Agent(name="leaf1_2", conf=agent_conf)
leaf2_1 = Agent(name="leaf2_1", conf=agent_conf)
leaf2_2 = Agent(name="leaf2_2", conf=agent_conf)
```

```python
swarm = Swarm((root, parent1), (root, parent2),
              (parent1, leaf1_1), (parent1, leaf1_2),
              (parent2, leaf2_1), (parent2, leaf2_2),
              build_type=GraphBuildType.HANDOFF)
```
or use agent handoff:
```python
root = Agent(name="root", conf=agent_conf, agent_names=['parent1', 'parent2'])
parent1 = Agent(name="parent1", conf=agent_conf, agent_names=['leaf1_1', 'leaf1_2'])
parent2 = Agent(name="parent2", conf=agent_conf, agent_names=['leaf2_1', 'leaf2_2'])

swarm = HandoffSwarm((root, parent1), (root, parent2),
                     register_agents=[leaf1_1, leaf1_2, leaf2_1, leaf2_2])
```

#### Map-reduce
If the topology becomes more complex:
```
          ┌─────────── root ───────────┐
┌───── parent1 ───┐          ┌────── parent2 ──────┐
leaf1_1       leaf1_2        leaf2_1         leaf2_2
└─────result1─────┘          └───────result2───────┘
          └───────────final───────────┘
```
we define it as a **Map-reduce** topology, which is equivalent to a workflow in terms of execution mode.

Build it in this way:

```python
result1 = Agent(name="result1", conf=agent_conf)
result2 = Agent(name="result2", conf=agent_conf)
final = Agent(name="final", conf=agent_conf)

swarm = Swarm(
    (root, [parent1, parent2]),
    (parent1, [leaf1_1, leaf1_2]),
    (parent2, [leaf2_1, leaf2_2]),
    ([leaf1_1, leaf1_2], result1),
    ([leaf2_1, leaf2_2], result2),
    ([result1, result2], final)
)
```
Assuming there is a cycle final -> root in the topology, define it as:
```python
final = LoopableAgent(name="final",
                      conf=agent_conf,
                      max_run_times=5,
                      loop_point=root.name(),
                      stop_func=...)
```
`stop_func` is a function that decides whether to terminate the loop early.

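A minimal `stop_func` sketch; `cur_run_times` is a real `LoopableAgent` field, while the stop condition itself is only illustrative:
```python
def stop(agent: LoopableAgent) -> bool:
    # Stop early after the third pass; replace with your own condition.
    return agent.cur_run_times >= 3

final = LoopableAgent(name="final",
                      conf=agent_conf,
                      max_run_times=5,
                      loop_point=root.name(),
                      stop_func=stop)
```
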
### Mesh
Mesh topologies divide into fully meshed and partially meshed.
A fully meshed topology means that each agent can communicate with every other agent,
and any agent can decide which agent to call next.

```python
"""
Fully meshed topology:
┌─────────── A ──────────┐
B ───────────|────────── C
└─────────── D ─────────┘
"""
A = Agent(name="A", conf=agent_conf)
B = Agent(name="B", conf=agent_conf)
C = Agent(name="C", conf=agent_conf)
D = Agent(name="D", conf=agent_conf)
```

A mesh topology needs to use the `handoffs` mechanism:
```python
swarm = HandoffSwarm((A, B), (B, A),
                     (A, C), (C, A),
                     (A, D), (D, A),
                     (B, C), (C, B),
                     (B, D), (D, B),
                     (C, D), (D, C))
```
If a few pairs are removed, it becomes a partially meshed topology, as sketched below.

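For instance, dropping the B-D and C-D pairs leaves a partially meshed topology (same mechanism, fewer edges):
```python
swarm = HandoffSwarm((A, B), (B, A),
                     (A, C), (C, A),
                     (A, D), (D, A),
                     (B, C), (C, B))
```
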
### Ring
A ring topology is a closed loop formed by the nodes.

```python
"""
Ring topology:
┌───────────> A >──────────┐
B                          C
└───────────< D <─────────┘
"""
A = Agent(name="A", conf=agent_conf)
B = Agent(name="B", conf=agent_conf)
C = Agent(name="C", conf=agent_conf)
D = Agent(name="D", conf=agent_conf)
```

```python
swarm = Swarm((A, C), (C, D), (D, B), (B, A))
```
**Note:**
- A loop defined this way is executed only once.
- If you want to execute it multiple times, define it as:

```python
B = LoopableAgent(name="B", max_run_times=5, stop_func=...)
swarm = Swarm((A, C), (C, D), (D, B))
```
### Hybrid
A generalization that supports arbitrary combinations of topologies, internally capable of
loops, parallel and serial dependencies, and groups. A sketch follows.

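A small hybrid sketch (an assumption that the forms shown above can be combined in one Swarm): a workflow with a parallel middle layer whose merge node loops back to the start.
```python
# Hypothetical combination of the patterns shown above:
A = Agent(name="A", conf=agent_conf)
B = Agent(name="B", conf=agent_conf)
C = Agent(name="C", conf=agent_conf)
D = LoopableAgent(name="D", conf=agent_conf, max_run_times=3, loop_point="A")
swarm = Swarm((A, [B, C]), ([B, C], D))  # A fans out to B and C; D merges and may loop back to A
```
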
## Execution
aworld/agents/README.md ADDED
@@ -0,0 +1,107 @@
# AI Agents

Intelligent agents that control devices or tools in an environment using AI models or policies.

![Agent Architecture](../../readme_assets/framework_agent.png)

Most of the time, we directly use existing tools to build different types of LLM-based agents;
the framework makes it easy to write various agents.

Detailed steps for building an agent:
1. Define your `Agent`.
2. Write the prompt used by the agent (you can also choose not to set it).
3. Run it.

We provide a complete and simple example of writing an agent and a multi-agent:

```python
from aworld.config.conf import AgentConfig
from aworld.agents.llm_agent import Agent

prompt = """
Please act as a search agent, constructing appropriate keywords and search terms, using the search toolkit to collect relevant information, including urls, webpage snapshots, etc.
Here are some tips that help you perform web search:
- Never add too many keywords in your search query! Some detailed results need browser interaction to obtain, not the search toolkit.
- If the question is complex, search results typically do not provide precise answers. It is not likely to find the answer directly using the search toolkit only; the search query should be concise and focus on finding official sources rather than direct answers.
  For example, as for the question "What is the maximum length in meters of #9 in the first National Geographic short on YouTube that was ever released according to the Monterey Bay Aquarium website?", your first search term must be coarse-grained like "National Geographic YouTube" to find the youtube website first, and then try other fine-grained search terms step-by-step to find more urls.
- The results you return do not have to directly answer the original question; you only need to collect relevant information.

Here is the question: {task}

Please perform the web search and return the listed search results, including urls and necessary webpage snapshots, introductions, etc.
Your output should be like the following (at most 3 relevant pages from coa):
[
    {{
        "url": [URL],
        "information": [INFORMATION OR CONTENT]
    }},
    ...
]
"""

# Step 1
agent_config = AgentConfig(
    llm_provider="openai",
    llm_model_name="gpt-4o",
    llm_temperature=1,
    # llm_api_key must be set to use the LLM
    llm_api_key=""
)

search = Agent(
    conf=agent_config,
    name="search_agent",
    system_prompt="You are a helpful search agent.",
    # used to shape the result; you can also choose not to set it
    agent_prompt=prompt,
    tool_names=["search_api"]
)
```

You can also quickly develop a multi-agent system based on the framework.

On the basis of the above agent (the search agent), we provide a multi-agent example:

```python
from aworld.agents.llm_agent import Agent

summary_prompt = """
Summarize the following text in one clear and concise paragraph, capturing the key ideas without missing critical points.
Ensure the summary is easy to understand and avoids excessive detail.

Here is the content:
{task}
"""

summary = Agent(
    conf=agent_config,
    name="summary_agent",
    system_prompt="You are a helpful general summary agent.",
    # used to shape the result; you can also choose not to set it
    agent_prompt=summary_prompt
)
```

86
+ You can run single-agent or multi-agent through Swarm.
87
+ NOTE: Need to set some environment variables first! Effective GOOGLE_API_KEY, GOOGLE_ENGINE_ID, OPENAI_API_KEY and OPENAI_ENDPOINT.
88
+
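For example (a sketch; set these in your shell or via `os.environ` before building the agents, and note the endpoint value below is illustrative):
```python
import os

os.environ["GOOGLE_API_KEY"] = "your-google-api-key"
os.environ["GOOGLE_ENGINE_ID"] = "your-engine-id"
os.environ["OPENAI_API_KEY"] = "your-openai-api-key"
os.environ["OPENAI_ENDPOINT"] = "https://api.openai.com/v1"  # illustrative endpoint
```
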
```python
from aworld.core.agent.swarm import Swarm
from aworld.runner import Runners

if __name__ == '__main__':
    task = "search 1+1=?"  # example task (not used in the run below)
    # build the topology graph; the correct order is necessary
    swarm = Swarm(search, summary, max_steps=1)

    prefix = ""
    # you can specify a search engine: Google, Wiki, DuckDuckGo, or Baidu, e.g.:
    # prefix = "search wiki: "
    res = Runners.sync_run(
        input=prefix + """What is an agent.""",
        swarm=swarm
    )
```
You can view the search example [code](../../examples/search).
aworld/agents/llm_agent.py ADDED
@@ -0,0 +1,1011 @@
# coding: utf-8
# Copyright (c) 2025 inclusionAI.
import abc
import json
import time
import traceback
import uuid
from collections import OrderedDict
from typing import AsyncGenerator, Dict, Any, List, Union, Callable

import aworld.trace as trace
from aworld.config import ToolConfig
from aworld.config.conf import AgentConfig, ConfigDict, ContextRuleConfig, ModelConfig, OptimizationConfig, \
    LlmCompressionConfig
from aworld.core.agent.agent_desc import get_agent_desc
from aworld.core.agent.base import BaseAgent, AgentResult, is_agent_by_name, is_agent
from aworld.core.common import Observation, ActionModel
from aworld.core.context.base import AgentContext
from aworld.core.context.base import Context
from aworld.core.context.processor.prompt_processor import PromptProcessor
from aworld.core.event import eventbus
from aworld.core.event.base import Message, ToolMessage, Constants, AgentMessage
from aworld.core.tool.base import ToolFactory, AsyncTool, Tool
from aworld.core.memory import MemoryItem, MemoryConfig
from aworld.core.tool.tool_desc import get_tool_desc
from aworld.logs.util import logger, color_log, Color, trace_logger
from aworld.mcp_client.utils import sandbox_mcp_tool_desc_transform
from aworld.memory.main import MemoryFactory
from aworld.models.llm import get_llm_model, call_llm_model, acall_llm_model, acall_llm_model_stream
from aworld.models.model_response import ModelResponse, ToolCall
from aworld.models.utils import tool_desc_transform, agent_desc_transform
from aworld.output import Outputs
from aworld.output.base import StepOutput, MessageOutput
from aworld.runners.hook.hook_factory import HookFactory
from aworld.runners.hook.hooks import HookPoint
from aworld.utils.common import sync_exec, nest_dict_counter

class Agent(BaseAgent[Observation, List[ActionModel]]):
    """Basic agent for the unified protocol within the framework."""

    def __init__(self,
                 conf: Union[Dict[str, Any], ConfigDict, AgentConfig],
                 resp_parse_func: Callable[..., Any] = None,
                 **kwargs):
        """An API class implementation of an agent, using the `Observation` and `List[ActionModel]` protocols.

        Args:
            conf: Agent config; AgentConfig, ConfigDict, or dict are supported.
            resp_parse_func: Response parse function for the agent's standard output, transforming the llm response.
        """
        super(Agent, self).__init__(conf, **kwargs)
        conf = self.conf
        self.model_name = conf.llm_config.llm_model_name if conf.llm_config.llm_model_name else conf.llm_model_name
        self._llm = None
        self.memory = MemoryFactory.from_config(MemoryConfig(provider="inmemory"))
        self.system_prompt: str = kwargs.pop("system_prompt") if kwargs.get("system_prompt") else conf.system_prompt
        self.agent_prompt: str = kwargs.get("agent_prompt") if kwargs.get("agent_prompt") else conf.agent_prompt

        self.event_driven = kwargs.pop('event_driven', conf.get('event_driven', False))
        self.handler: Callable[..., Any] = kwargs.get('handler')

        self.need_reset = kwargs.get('need_reset') if kwargs.get('need_reset') else conf.need_reset
        # whether to keep contextual information; False means keep, True means reset in every step of the agent call
        self.step_reset = kwargs.get('step_reset') if kwargs.get('step_reset') else True
        # tool_name: [tool_action1, tool_action2, ...]
        self.black_tool_actions: Dict[str, List[str]] = kwargs.get("black_tool_actions") if kwargs.get(
            "black_tool_actions") else conf.get('black_tool_actions', {})
        self.resp_parse_func = resp_parse_func if resp_parse_func else self.response_parse
        self.history_messages = kwargs.get("history_messages") if kwargs.get("history_messages") else 100
        self.use_tools_in_prompt = kwargs.get('use_tools_in_prompt', conf.use_tools_in_prompt)
        self.context_rule = kwargs.get("context_rule") if kwargs.get("context_rule") else conf.context_rule
        self.tools_instances = {}
        self.tools_conf = {}
    def reset(self, options: Dict[str, Any]):
        super().reset(options)
        self.memory = MemoryFactory.from_config(
            MemoryConfig(provider=options.pop("memory_store") if options.get("memory_store") else "inmemory"))

    def set_tools_instances(self, tools, tools_conf):
        self.tools_instances = tools
        self.tools_conf = tools_conf

    @property
    def llm(self):
        # lazy initialization
        if self._llm is None:
            llm_config = self.conf.llm_config or None
            conf = llm_config if llm_config and (
                    llm_config.llm_provider or llm_config.llm_base_url or llm_config.llm_api_key or llm_config.llm_model_name) else self.conf
            self._llm = get_llm_model(conf)
        return self._llm

    def _env_tool(self):
        """Description of environment tools as tools."""
        return tool_desc_transform(get_tool_desc(),
                                   tools=self.tool_names if self.tool_names else [],
                                   black_tool_actions=self.black_tool_actions)

    def _handoffs_agent_as_tool(self):
        """Description of agents as tools."""
        return agent_desc_transform(get_agent_desc(),
                                    agents=self.handoffs if self.handoffs else [])

    def _mcp_is_tool(self):
        """Description of MCP servers as tools."""
        try:
            return sync_exec(sandbox_mcp_tool_desc_transform, self.mcp_servers, self.mcp_config)
        except Exception as e:
            logger.error(f"mcp_is_tool error: {traceback.format_exc()}")
            return []

    def desc_transform(self):
        """Transform descriptions of the supported tools, agents, and MCP servers in the framework to support LLM function calls."""

        # Stateless tools
        self.tools = self._env_tool()
        # Agents as tools
        self.tools.extend(self._handoffs_agent_as_tool())
        # MCP servers as tools
        self.tools.extend(self._mcp_is_tool())
        # load to context
        self.agent_context.set_tools(self.tools)
        return self.tools

    async def async_desc_transform(self):
        """Transform descriptions of the supported tools, agents, and MCP servers in the framework to support LLM function calls."""

        # Stateless tools
        self.tools = self._env_tool()
        # Agents as tools
        self.tools.extend(self._handoffs_agent_as_tool())
        # MCP servers as tools
        # todo sandbox
        if self.sandbox:
            sand_box = self.sandbox
            mcp_tools = await sand_box.mcpservers.list_tools()
            self.tools.extend(mcp_tools)
        else:
            self.tools.extend(await sandbox_mcp_tool_desc_transform(self.mcp_servers, self.mcp_config))
        # load to agent context
        self.agent_context.set_tools(self.tools)
    def _messages_transform(
            self,
            observation: Observation,
    ):
        agent_prompt = self.agent_context.agent_prompt
        sys_prompt = self.agent_context.sys_prompt
        messages = []
        if sys_prompt:
            messages.append(
                {'role': 'system', 'content': sys_prompt if not self.use_tools_in_prompt else sys_prompt.format(
                    tool_list=self.tools)})

        content = observation.content
        if agent_prompt and '{task}' in agent_prompt:
            content = agent_prompt.format(task=observation.content)

        cur_msg = {'role': 'user', 'content': content}
        # query from memory
        # histories = self.memory.get_last_n(self.history_messages, filter={"session_id": self.context.session_id})
        histories = self.memory.get_last_n(self.history_messages)
        messages.extend(histories)

        action_results = observation.action_result
        if action_results:
            for action_result in action_results:
                cur_msg['role'] = 'tool'
                cur_msg['tool_call_id'] = action_result.tool_id

        agent_info = self.context.context_info.get(self.id())
        if (self.use_tools_in_prompt and "is_use_tool_prompt" in agent_info and "tool_calls"
                in agent_info and agent_prompt):
            cur_msg['content'] = agent_prompt.format(action_list=agent_info["tool_calls"],
                                                     result=content)

        if observation.images:
            urls = [{'type': 'text', 'text': content}]
            for image_url in observation.images:
                urls.append({'type': 'image_url', 'image_url': {"url": image_url}})

            cur_msg['content'] = urls
        messages.append(cur_msg)

        # truncate and other processing
        try:
            messages = self._process_messages(messages=messages, agent_context=self.agent_context, context=self.context)
        except Exception as e:
            logger.warning(f"Failed to process messages in _messages_transform: {e}")
            logger.debug(f"Process messages error details: {traceback.format_exc()}")
        self.agent_context.update_messages(messages)
        return messages

    def messages_transform(self,
                           content: str,
                           image_urls: List[str] = None,
                           **kwargs):
        """Transform the original content to LLM messages in the native format.

        Args:
            content: User content.
            image_urls: List of images encoded using base64.
        Returns:
            Message list for the LLM.
        """
        sys_prompt = self.agent_context.system_prompt
        agent_prompt = self.agent_context.agent_prompt
        messages = []
        if sys_prompt:
            messages.append(
                {'role': 'system', 'content': sys_prompt if not self.use_tools_in_prompt else sys_prompt.format(
                    tool_list=self.tools)})

        histories = self.memory.get_last_n(self.history_messages)
        user_content = content
        if not histories and agent_prompt and '{task}' in agent_prompt:
            user_content = agent_prompt.format(task=content)

        cur_msg = {'role': 'user', 'content': user_content}
        # query from memory
        # histories = self.memory.get_last_n(self.history_messages, filter={"session_id": self.context.session_id})

        if histories:
            # by default use the first tool call
            for history in histories:
                if not self.use_tools_in_prompt and "tool_calls" in history.metadata and history.metadata['tool_calls']:
                    messages.append({'role': history.metadata['role'], 'content': history.content,
                                     'tool_calls': [history.metadata["tool_calls"][0]]})
                else:
                    messages.append({'role': history.metadata['role'], 'content': history.content,
                                     "tool_call_id": history.metadata.get("tool_call_id")})

            if not self.use_tools_in_prompt and "tool_calls" in histories[-1].metadata and histories[-1].metadata[
                    'tool_calls']:
                tool_id = histories[-1].metadata["tool_calls"][0].id
                if tool_id:
                    cur_msg['role'] = 'tool'
                    cur_msg['tool_call_id'] = tool_id
            if self.use_tools_in_prompt and "is_use_tool_prompt" in histories[-1].metadata and "tool_calls" in \
                    histories[-1].metadata and agent_prompt:
                cur_msg['content'] = agent_prompt.format(action_list=histories[-1].metadata["tool_calls"],
                                                         result=content)

        if image_urls:
            urls = [{'type': 'text', 'text': content}]
            for image_url in image_urls:
                urls.append({'type': 'image_url', 'image_url': {"url": image_url}})

            cur_msg['content'] = urls
        messages.append(cur_msg)

        # truncate and other processing
        try:
            messages = self._process_messages(messages=messages, agent_context=self.agent_context, context=self.context)
        except Exception as e:
            logger.warning(f"Failed to process messages in messages_transform: {e}")
            logger.debug(f"Process messages error details: {traceback.format_exc()}")
        self.agent_context.set_messages(messages)
        return messages

    def use_tool_list(self, resp: ModelResponse) -> List[Dict[str, Any]]:
        """Parse tool calls expressed in the response content rather than as native tool calls.

        The expected content shape is: {"use_tool_list": [{"tool": ..., "arguments": {...}}, ...]}.
        """
        tool_list = []
        try:
            if resp and hasattr(resp, 'content') and resp.content:
                content = resp.content.strip()
            else:
                return tool_list
            content = content.replace('\n', '').replace('\r', '')
            response_json = json.loads(content)
            if "use_tool_list" in response_json:
                use_tool_list = response_json["use_tool_list"]
                if use_tool_list:
                    for use_tool in use_tool_list:
                        tool_name = use_tool["tool"]
                        arguments = use_tool["arguments"]
                        if tool_name and arguments:
                            tool_list.append(use_tool)

            return tool_list
        except Exception as e:
            logger.debug(f"tool_parse error, content: {resp.content}, \nerror msg: {traceback.format_exc()}")
            return tool_list

    def response_parse(self, resp: ModelResponse) -> AgentResult:
        """Default parsing of the LLM response."""
        results = []
        if not resp:
            logger.warning("LLM no valid response!")
            return AgentResult(actions=[], current_state=None)

        use_tool_list = self.use_tool_list(resp)
        is_call_tool = False
        content = '' if resp.content is None else resp.content
        if resp.tool_calls:
            is_call_tool = True
            for tool_call in resp.tool_calls:
                full_name: str = tool_call.function.name
                if not full_name:
                    logger.warning("tool call response has no tool name.")
                    continue
                try:
                    params = json.loads(tool_call.function.arguments)
                except Exception:
                    logger.warning(f"{tool_call.function.arguments} parse to json fail.")
                    params = {}
                # framework naming format: `tool_name__action_name`
                names = full_name.split("__")
                tool_name = names[0]
                if is_agent_by_name(tool_name):
                    param_info = params.get('content', "") + ' ' + params.get('info', '')
                    results.append(ActionModel(tool_name=tool_name,
                                               tool_id=tool_call.id,
                                               agent_name=self.id(),
                                               params=params,
                                               policy_info=content + param_info))
                else:
                    action_name = '__'.join(names[1:]) if len(names) > 1 else ''
                    results.append(ActionModel(tool_name=tool_name,
                                               tool_id=tool_call.id,
                                               action_name=action_name,
                                               agent_name=self.id(),
                                               params=params,
                                               policy_info=content))
        elif use_tool_list and len(use_tool_list) > 0:
            is_call_tool = True
            for use_tool in use_tool_list:
                full_name = use_tool["tool"]
                if not full_name:
                    logger.warning("tool call response has no tool name.")
                    continue
                params = use_tool["arguments"]
                if not params:
                    logger.warning("tool call response has no tool params.")
                    continue
                names = full_name.split("__")
                tool_name = names[0]
                if is_agent_by_name(tool_name):
                    param_info = params.get('content', "") + ' ' + params.get('info', '')
                    results.append(ActionModel(tool_name=tool_name,
                                               tool_id=use_tool.get('id'),
                                               agent_name=self.id(),
                                               params=params,
                                               policy_info=content + param_info))
                else:
                    action_name = '__'.join(names[1:]) if len(names) > 1 else ''
                    results.append(ActionModel(tool_name=tool_name,
                                               tool_id=use_tool.get('id'),
                                               action_name=action_name,
                                               agent_name=self.id(),
                                               params=params,
                                               policy_info=content))
        else:
            if content:
                content = content.replace("```json", "").replace("```", "")
            # no tool call; the agent name is itself.
            results.append(ActionModel(agent_name=self.id(), policy_info=content))
        return AgentResult(actions=results, current_state=None, is_call_tool=is_call_tool)

    def _log_messages(self, messages: List[Dict[str, Any]]) -> None:
        """Log the sequence of messages for debugging purposes."""
        logger.info(f"[agent] Invoking LLM with {len(messages)} messages:")
        for i, msg in enumerate(messages):
            prefix = msg.get('role')
            logger.info(f"[agent] Message {i + 1}: {prefix} ===================================")
            if isinstance(msg['content'], list):
                for item in msg['content']:
                    if item.get('type') == 'text':
                        logger.info(f"[agent] Text content: {item.get('text')}")
                    elif item.get('type') == 'image_url':
                        image_url = item.get('image_url', {}).get('url', '')
                        if image_url.startswith('data:image'):
                            logger.info(f"[agent] Image: [Base64 image data]")
                        else:
                            logger.info(f"[agent] Image URL: {image_url[:30]}...")
            else:
                content = str(msg['content'])
                chunk_size = 500
                for j in range(0, len(content), chunk_size):
                    chunk = content[j:j + chunk_size]
                    if j == 0:
                        logger.info(f"[agent] Content: {chunk}")
                    else:
                        logger.info(f"[agent] Content (continued): {chunk}")

            if 'tool_calls' in msg and msg['tool_calls']:
                for tool_call in msg.get('tool_calls'):
                    if isinstance(tool_call, dict):
                        logger.info(f"[agent] Tool call: {tool_call.get('name')} - ID: {tool_call.get('id')}")
                        args = str(tool_call.get('args', {}))[:1000]
                        logger.info(f"[agent] Tool args: {args}...")
                    elif isinstance(tool_call, ToolCall):
                        logger.info(f"[agent] Tool call: {tool_call.function.name} - ID: {tool_call.id}")
                        args = str(tool_call.function.arguments)[:1000]
                        logger.info(f"[agent] Tool args: {args}...")

    def _agent_result(self, actions: List[ActionModel], caller: str):
        if not actions:
            raise Exception(f'{self.id()} no action decision has been made.')

        tools = OrderedDict()
        agents = []
        for action in actions:
            if is_agent(action):
                agents.append(action)
            else:
                if action.tool_name not in tools:
                    tools[action.tool_name] = []
                tools[action.tool_name].append(action)

        _group_name = None
        # agents and tools exist simultaneously, or there is more than one agent/tool name
        if (agents and tools) or len(agents) > 1 or len(tools) > 1:
            _group_name = f"{self.id()}_{uuid.uuid1().hex}"

        # complex processing
        if _group_name:
            logger.warning(f"more than one agent and tool causes confusion, will choose the first one. {agents}")
            agents = [agents[0]] if agents else []
            for _, v in tools.items():
                actions = v
                break

        if agents:
            return AgentMessage(payload=actions,
                                caller=caller,
                                sender=self.id(),
                                receiver=actions[0].tool_name,
                                session_id=self.context.session_id if self.context else "",
                                headers={"context": self.context})
        else:
            return ToolMessage(payload=actions,
                               caller=caller,
                               sender=self.id(),
                               receiver=actions[0].tool_name,
                               session_id=self.context.session_id if self.context else "",
                               headers={"context": self.context})

    def post_run(self, policy_result: List[ActionModel], policy_input: Observation) -> Message:
        return self._agent_result(
            policy_result,
            policy_input.from_agent_name if policy_input.from_agent_name else policy_input.observer
        )

    async def async_post_run(self, policy_result: List[ActionModel], policy_input: Observation) -> Message:
        return self._agent_result(
            policy_result,
            policy_input.from_agent_name if policy_input.from_agent_name else policy_input.observer
        )

    def policy(self, observation: Observation, info: Dict[str, Any] = {}, **kwargs) -> List[ActionModel]:
        """The strategy of an agent can be to decide which tools to use in the environment, or to delegate tasks to other agents.

        Args:
            observation: The state observed from tools in the environment.
            info: Extended information used to assist the agent in deciding a policy.

        Returns:
            ActionModel sequence from the agent policy.
        """
        output = None
        if kwargs.get("output") and isinstance(kwargs.get("output"), StepOutput):
            output = kwargs["output"]

        # Get current step information for trace recording
        step = kwargs.get("step", 0)
        exp_id = kwargs.get("exp_id", None)
        source_span = trace.get_current_span()

        if hasattr(observation, 'context') and observation.context:
            self.task_histories = observation.context

        try:
            self._run_hooks_sync(self.context, HookPoint.PRE_LLM_CALL)
        except Exception as e:
            logger.warn(traceback.format_exc())

        self._finished = False
        self.desc_transform()
        images = observation.images if self.conf.use_vision else None
        if self.conf.use_vision and not images and observation.image:
            images = [observation.image]
        observation.images = images
        messages = self.messages_transform(content=observation.content,
                                           image_urls=observation.images)

        self._log_messages(messages)
        self.memory.add(MemoryItem(
            content=messages[-1]['content'],
            metadata={
                "role": messages[-1]['role'],
                "agent_name": self.id(),
                "tool_call_id": messages[-1].get("tool_call_id")
            }
        ))

        llm_response = None
        span_name = f"llm_call_{exp_id}"
        serializable_messages = self._to_serializable(messages)
        with trace.span(span_name) as llm_span:
            llm_span.set_attributes({
                "exp_id": exp_id,
                "step": step,
                "messages": json.dumps(serializable_messages, ensure_ascii=False)
            })
            if source_span:
                source_span.set_attribute("messages", json.dumps(serializable_messages, ensure_ascii=False))

            try:
                llm_response = call_llm_model(
                    self.llm,
                    messages=messages,
                    model=self.model_name,
                    temperature=self.conf.llm_config.llm_temperature,
                    tools=self.tools if not self.use_tools_in_prompt and self.tools else None
                )

                logger.info(f"Execute response: {llm_response.message}")
            except Exception as e:
                logger.warn(traceback.format_exc())
                raise e
            finally:
                if llm_response:
                    # update usage
                    self.update_context_usage(used_context_length=llm_response.usage['total_tokens'])
                    # update current step output
                    self.update_llm_output(llm_response)

                    use_tools = self.use_tool_list(llm_response)
                    is_use_tool_prompt = len(use_tools) > 0
                    if llm_response.error:
                        logger.info(f"llm result error: {llm_response.error}")
                    else:
                        info = {
                            "role": "assistant",
                            "agent_name": self.id(),
                            "tool_calls": llm_response.tool_calls if not self.use_tools_in_prompt else use_tools,
                            "is_use_tool_prompt": is_use_tool_prompt if not self.use_tools_in_prompt else False
                        }
                        self.memory.add(MemoryItem(
                            content=llm_response.content,
                            metadata=info
                        ))
                        # rewrite the agent's entry in the shared context info
                        self.context.context_info[self.id()] = info
                else:
                    logger.error(f"{self.id()} failed to get LLM response")
                    raise RuntimeError(f"{self.id()} failed to get LLM response")

        try:
            self._run_hooks_sync(self.context, HookPoint.POST_LLM_CALL)
        except Exception as e:
            logger.warn(traceback.format_exc())

        agent_result = sync_exec(self.resp_parse_func, llm_response)
        if not agent_result.is_call_tool:
            self._finished = True

        if output:
            output.add_part(MessageOutput(source=llm_response, json_parse=False))
            output.mark_finished()
        return agent_result.actions

    async def async_policy(self, observation: Observation, info: Dict[str, Any] = {}, **kwargs) -> List[ActionModel]:
        """The strategy of an agent can be to decide which tools to use in the environment, or to delegate tasks to other agents.

        Args:
            observation: The state observed from tools in the environment.
            info: Extended information used to assist the agent in deciding a policy.

        Returns:
            ActionModel sequence from the agent policy.
        """
        outputs = None
        if kwargs.get("outputs") and isinstance(kwargs.get("outputs"), Outputs):
            outputs = kwargs.get("outputs")

        # Get current step information for trace recording
        source_span = trace.get_current_span()

        if hasattr(observation, 'context') and observation.context:
            self.task_histories = observation.context

        try:
            events = []
            async for event in self.run_hooks(self.context, HookPoint.PRE_LLM_CALL):
                events.append(event)
        except Exception as e:
            logger.warn(traceback.format_exc())

        self._finished = False
        messages = await self._prepare_llm_input(observation, info, **kwargs)

        serializable_messages = self._to_serializable(messages)
        llm_response = None
        if source_span:
            source_span.set_attribute("messages", json.dumps(serializable_messages, ensure_ascii=False))
        try:
            llm_response = await self._call_llm_model(observation, messages, info, **kwargs)
        except Exception as e:
            logger.warn(traceback.format_exc())
            raise e
        finally:
            if llm_response:
                # update usage
                self.update_context_usage(used_context_length=llm_response.usage['total_tokens'])
                # update current step output
                self.update_llm_output(llm_response)

                use_tools = self.use_tool_list(llm_response)
                is_use_tool_prompt = len(use_tools) > 0
                if llm_response.error:
                    logger.info(f"llm result error: {llm_response.error}")
                else:
                    self.memory.add(MemoryItem(
                        content=llm_response.content,
                        metadata={
                            "role": "assistant",
                            "agent_name": self.id(),
                            "tool_calls": llm_response.tool_calls if not self.use_tools_in_prompt else use_tools,
                            "is_use_tool_prompt": is_use_tool_prompt if not self.use_tools_in_prompt else False
                        }
                    ))
            else:
                logger.error(f"{self.id()} failed to get LLM response")
                raise RuntimeError(f"{self.id()} failed to get LLM response")

        try:
            events = []
            async for event in self.run_hooks(self.context, HookPoint.POST_LLM_CALL):
                events.append(event)
        except Exception as e:
            logger.warn(traceback.format_exc())

        agent_result = sync_exec(self.resp_parse_func, llm_response)
        if not agent_result.is_call_tool:
            self._finished = True
        return agent_result.actions

    def _to_serializable(self, obj):
        if isinstance(obj, dict):
            return {k: self._to_serializable(v) for k, v in obj.items()}
        elif isinstance(obj, list):
            return [self._to_serializable(i) for i in obj]
        elif hasattr(obj, "to_dict"):
            return obj.to_dict()
        elif hasattr(obj, "model_dump"):
            return obj.model_dump()
        elif hasattr(obj, "dict"):
            return obj.dict()
        else:
            return obj

    async def llm_and_tool_execution(self, observation: Observation, messages: List[Dict[str, str]] = [],
                                     info: Dict[str, Any] = {}, **kwargs) -> List[ActionModel]:
        """Perform combined LLM call and tool execution operations.

        Args:
            observation: The state observed from the environment.
            messages: Prepared LLM input messages; built from the observation when empty.
            info: Extended information to assist the agent in decision-making.
            **kwargs: Other parameters.

        Returns:
            ActionModel sequence. If a tool is executed, it includes the tool execution result.
        """
        llm_response = await self._call_llm_model(observation, messages, info, **kwargs)
        if llm_response:
            use_tools = self.use_tool_list(llm_response)
            is_use_tool_prompt = len(use_tools) > 0
            if llm_response.error:
                logger.info(f"llm result error: {llm_response.error}")
            else:
                self.memory.add(MemoryItem(
                    content=llm_response.content,
                    metadata={
                        "role": "assistant",
                        "agent_name": self.id(),
                        "tool_calls": llm_response.tool_calls if not self.use_tools_in_prompt else use_tools,
                        "is_use_tool_prompt": is_use_tool_prompt if not self.use_tools_in_prompt else False
                    }
                ))
        else:
            logger.error(f"{self.id()} failed to get LLM response")
            raise RuntimeError(f"{self.id()} failed to get LLM response")

        agent_result = sync_exec(self.resp_parse_func, llm_response)
        if not agent_result.is_call_tool:
            self._finished = True
            return agent_result.actions
        else:
            result = await self._execute_tool(agent_result.actions)
            return result

    async def _prepare_llm_input(self, observation: Observation, info: Dict[str, Any] = {}, **kwargs):
        """Prepare the LLM input.

        Args:
            observation: The state observed from the environment.
            info: Extended information to assist the agent in decision-making.
            **kwargs: Other parameters.
        """
        await self.async_desc_transform()
        images = observation.images if self.conf.use_vision else None
        if self.conf.use_vision and not images and observation.image:
            images = [observation.image]
        messages = self.messages_transform(content=observation.content,
                                           image_urls=images)

        self._log_messages(messages)
        self.memory.add(MemoryItem(
            content=messages[-1]['content'],
            metadata={
                "role": messages[-1]['role'],
                "agent_name": self.id(),
                "tool_call_id": messages[-1].get("tool_call_id")
            }
        ))

        return messages

    def _process_messages(self, messages: List[Dict[str, Any]], agent_context: AgentContext = None,
                          context: Context = None) -> List[Dict[str, Any]]:
        origin_messages = messages
        st = time.time()
        with trace.span(f"llm_context_process", attributes={
            "start_time": st
        }) as compress_span:
            if agent_context.context_rule is None:
                logger.debug('debug|skip process_messages context_rule is None')
                return messages
            origin_len = compressed_len = len(str(messages))
            origin_messages_count = truncated_messages_count = len(messages)
            try:
                prompt_processor = PromptProcessor(agent_context)
                result = prompt_processor.process_messages(messages, context)
                messages = result.processed_messages

                compressed_len = len(str(messages))
                truncated_messages_count = len(messages)
                logger.debug(
                    f'debug|llm_context_process|{origin_len}|{compressed_len}|{origin_messages_count}|{truncated_messages_count}|\n|{origin_messages}\n|{messages}')
                return messages
            finally:
                compress_span.set_attributes({
                    "end_time": time.time(),
                    "duration": time.time() - st,
                    # messages length
                    "origin_messages_count": origin_messages_count,
                    "truncated_messages_count": truncated_messages_count,
                    "truncated_ratio": round(truncated_messages_count / origin_messages_count, 2),
                    # token length
                    "origin_len": origin_len,
                    "compressed_len": compressed_len,
                    "compress_ratio": round(compressed_len / origin_len, 2)
                })

    async def _call_llm_model(self, observation: Observation, messages: List[Dict[str, str]] = [],
                              info: Dict[str, Any] = {}, **kwargs) -> ModelResponse:
        """Perform the LLM call.

        Args:
            observation: The state observed from the environment.
            messages: Prepared LLM input messages; built from the observation when empty.
            info: Extended information to assist the agent in decision-making.
            **kwargs: Other parameters.
        Returns:
            LLM response.
        """
        outputs = None
        if kwargs.get("outputs") and isinstance(kwargs.get("outputs"), Outputs):
            outputs = kwargs.get("outputs")
        if not messages:
            messages = await self._prepare_llm_input(observation, self.agent_context, **kwargs)

        llm_response = None
        source_span = trace.get_current_span()
        serializable_messages = self._to_serializable(messages)

        if source_span:
            source_span.set_attribute("messages", json.dumps(serializable_messages, ensure_ascii=False))

        try:
            stream_mode = kwargs.get("stream", False)
            if stream_mode:
                llm_response = ModelResponse(id="", model="", content="", tool_calls=[])
                resp_stream = acall_llm_model_stream(
                    self.llm,
                    messages=messages,
                    model=self.model_name,
                    temperature=self.conf.llm_config.llm_temperature,
                    tools=self.tools if not self.use_tools_in_prompt and self.tools else None,
                    stream=True
                )

                async def async_call_llm(resp_stream, json_parse=False):
                    llm_resp = ModelResponse(id="", model="", content="", tool_calls=[])

                    # Async streaming with acall_llm_model
                    async def async_generator():
                        async for chunk in resp_stream:
                            if chunk.content:
                                llm_resp.content += chunk.content
                                yield chunk.content
                            if chunk.tool_calls:
                                llm_resp.tool_calls.extend(chunk.tool_calls)
                            if chunk.error:
                                llm_resp.error = chunk.error
                            llm_resp.id = chunk.id
                            llm_resp.model = chunk.model
                            llm_resp.usage = nest_dict_counter(llm_resp.usage, chunk.usage)

                    return MessageOutput(source=async_generator(), json_parse=json_parse), llm_resp

                output, response = await async_call_llm(resp_stream)
                llm_response = response

                if eventbus is not None and resp_stream:
                    output_message = Message(
                        category=Constants.OUTPUT,
                        payload=output,
                        sender=self.id(),
                        session_id=self.context.session_id if self.context else "",
                        headers={"context": self.context}
                    )
                    await eventbus.publish(output_message)
                elif not self.event_driven and outputs:
                    outputs.add_output(output)

            else:
                llm_response = await acall_llm_model(
                    self.llm,
                    messages=messages,
                    model=self.model_name,
                    temperature=self.conf.llm_config.llm_temperature,
                    tools=self.tools if not self.use_tools_in_prompt and self.tools else None,
                    stream=kwargs.get("stream", False)
                )
                if eventbus is None:
                    logger.warn("=============== eventbus is none ============")
                if eventbus is not None and llm_response:
                    await eventbus.publish(Message(
                        category=Constants.OUTPUT,
                        payload=llm_response,
                        sender=self.id(),
                        session_id=self.context.session_id if self.context else "",
                        headers={"context": self.context}
                    ))
                elif not self.event_driven and outputs:
                    outputs.add_output(MessageOutput(source=llm_response, json_parse=False))

            logger.info(f"Execute response: {json.dumps(llm_response.to_dict(), ensure_ascii=False)}")

        except Exception as e:
            logger.warn(traceback.format_exc())
            raise e
        finally:
            # Note: returning from `finally` suppresses any exception raised above.
            return llm_response

    async def _execute_tool(self, actions: List[ActionModel]) -> Any:
        """Execute tool calls.

        Args:
            actions: The action(s) to execute.

        Returns:
            The result of the tool execution.
        """
        tool_actions = []
        for act in actions:
            if is_agent(act):
                continue
            else:
                tool_actions.append(act)

        msg = None
        terminated = False
        # group actions by tool name
        tool_mapping = dict()
        reward = 0.0
        # Use tools directly, or use them after creation.
        for act in tool_actions:
            if not self.tools_instances or (self.tools_instances and act.tool_name not in self.tools):
                # Dynamically use only the default config in the module.
                conf = self.tools_conf.get(act.tool_name)
                if not conf:
                    conf = ToolConfig(exit_on_failure=self.task.conf.get('exit_on_failure'))
                tool = ToolFactory(act.tool_name, conf=conf, asyn=conf.use_async if conf else False)
                if isinstance(tool, Tool):
                    tool.reset()
                elif isinstance(tool, AsyncTool):
                    await tool.reset()
                tool_mapping[act.tool_name] = []
                self.tools_instances[act.tool_name] = tool
            if act.tool_name not in tool_mapping:
                tool_mapping[act.tool_name] = []
            tool_mapping[act.tool_name].append(act)

        observation = None

        for tool_name, action in tool_mapping.items():
            # Execute the action using the tool and unpack all return values
            if isinstance(self.tools_instances[tool_name], Tool):
                message = self.tools_instances[tool_name].step(action)
            elif isinstance(self.tools_instances[tool_name], AsyncTool):
                # todo sandbox
                message = await self.tools_instances[tool_name].step(action, agent=self)
            else:
                logger.warning(f"Unsupported tool type: {self.tools_instances[tool_name]}")
                continue

            observation, reward, terminated, _, info = message.payload

            # Check if there's an exception in info
            if info.get("exception"):
                color_log(f"Agent {self.id()} _execute_tool failed with exception: {info['exception']}",
                          color=Color.red)
                msg = f"Agent {self.id()} _execute_tool failed with exception: {info['exception']}"
            logger.info(f"Agent {self.id()} _execute_tool finished by tool action: {action}.")
            log_ob = Observation(content='' if observation.content is None else observation.content,
                                 action_result=observation.action_result)
            trace_logger.info(f"{tool_name} observation: {log_ob}", color=Color.green)
        self.memory.add(MemoryItem(
            content=observation.content,
            metadata={
                "role": "tool",
                "agent_name": self.id(),
                "tool_call_id": action[0].tool_id
            }
        ))
        return [ActionModel(agent_name=self.id(), policy_info=observation.content)]

    def _init_context(self, context: Context):
        super()._init_context(context)
        # Generate the default configuration when context_rule is empty
        llm_config = self.conf.llm_config
        context_rule = self.context_rule
        if context_rule is None:
            context_rule = ContextRuleConfig(
                optimization_config=OptimizationConfig(
                    enabled=True,
                    max_token_budget_ratio=1.0
                ),
                llm_compression_config=LlmCompressionConfig(
                    enabled=False  # Compression disabled by default
                )
            )
        self.agent_context.set_model_config(llm_config)
        self.agent_context.context_rule = context_rule
        self.agent_context.system_prompt = self.system_prompt
        self.agent_context.agent_prompt = self.agent_prompt
        logger.debug(f'init_context llm_agent {self.name()} {self.agent_context} {self.conf} {self.context_rule}')

    def update_system_prompt(self, system_prompt: str):
        self.system_prompt = system_prompt
        self.agent_context.system_prompt = system_prompt
        logger.info(f"Agent {self.name()} system_prompt updated")

    def update_agent_prompt(self, agent_prompt: str):
        self.agent_prompt = agent_prompt
        self.agent_context.agent_prompt = agent_prompt
        logger.info(f"Agent {self.name()} agent_prompt updated")

    def update_context_rule(self, context_rule: ContextRuleConfig):
        self.agent_context.context_rule = context_rule
        logger.info(f"Agent {self.name()} context_rule updated")

    def update_context_usage(self, used_context_length: int = None, total_context_length: int = None):
        self.agent_context.update_context_usage(used_context_length, total_context_length)
        logger.debug(f"Agent {self.name()} context usage updated: {self.agent_context.context_usage}")

    def update_llm_output(self, llm_response: ModelResponse):
        self.agent_context.set_llm_output(llm_response)
        logger.debug(f"Agent {self.name()} llm output updated: {self.agent_context.llm_output}")

    async def run_hooks(self, context: Context, hook_point: str):
        """Execute hooks asynchronously."""
        # Get all hooks for the specified hook point
        all_hooks = HookFactory.hooks(hook_point)
        hooks = all_hooks.get(hook_point, [])

        for hook in hooks:
            try:
                # Create a temporary Message object to pass to the hook
                message = Message(
                    category="agent_hook",
                    payload=None,
                    sender=self.id(),
                    session_id=context.session_id if hasattr(context, 'session_id') else None,
                    headers={"context": self.context}
                )

                # Execute the hook
                msg = await hook.exec(message, context)
                if msg:
                    logger.debug(f"Hook {hook.point()} executed successfully")
                    yield msg
            except Exception as e:
                logger.warning(f"Hook {hook.point()} execution failed: {traceback.format_exc()}")

    def _run_hooks_sync(self, context: Context, hook_point: str):
        """Execute hooks synchronously."""
        # Use sync_exec to run the asynchronous hook logic
        try:
            sync_exec(self.run_hooks, context, hook_point)
        except Exception as e:
            logger.warn(f"Failed to execute hooks for {hook_point}: {traceback.format_exc()}")
aworld/agents/loop_llm_agent.py ADDED
@@ -0,0 +1,46 @@
# coding: utf-8
# Copyright (c) 2025 inclusionAI.
from typing import Any, Callable

from aworld.agents.llm_agent import Agent


class LoopableAgent(Agent):
    """Support for loop agents in the swarm.

    The parameter of each extension function is the agent itself, through which internal information of the agent can be obtained.
    `stop_func` function example:
    >>> def stop(agent: LoopableAgent):
    >>>     ...

    `loop_point_finder` function example:
    >>> def find(agent: LoopableAgent):
    >>>     ...
    """
    max_run_times: int = 1
    cur_run_times: int = 0
    # The loop point (an agent name) that the loop agent jumps back to
    loop_point: str = None
    # Used to determine the loop point across multiple loops
    loop_point_finder: Callable[..., Any] = None
    # def stop(agent: LoopableAgent): ...
    stop_func: Callable[..., Any] = None

    @property
    def goto(self):
        """The next loop point that the loop agent wants to reach."""
        if self.loop_point_finder:
            return self.loop_point_finder(self)
        if self.loop_point:
            return self.loop_point
        return self.id()

    @property
    def finished(self) -> bool:
        """Loop agent termination state detection: the loop count is reached or the termination condition holds."""
        if self.cur_run_times >= self.max_run_times or (self.stop_func and self.stop_func(self)):
            self._finished = True
            return True

        self._finished = False
        return False
aworld/agents/parallel_llm_agent.py ADDED
@@ -0,0 +1,54 @@
# coding: utf-8
# Copyright (c) 2025 inclusionAI.
from typing import List, Dict, Any, Callable

from aworld.agents.llm_agent import Agent
from aworld.config import RunConfig
from aworld.core.common import Observation, ActionModel, Config


class ParallelizableAgent(Agent):
    """Support for parallel agents in the swarm.

    The parameter of the extension function is the agent itself, through which internal information of the agent can be obtained.
    `aggregate_func` function example:
    >>> def agg(agent: ParallelizableAgent, res: Dict[str, List[ActionModel]]):
    >>>     ...
    """

    def __init__(self,
                 conf: Config,
                 resp_parse_func: Callable[..., Any] = None,
                 agents: List[Agent] = None,
                 aggregate_func: Callable[..., Any] = None,
                 **kwargs):
        super().__init__(conf=conf, resp_parse_func=resp_parse_func, **kwargs)
        # avoid a mutable default argument; fall back to an empty list
        self.agents = agents if agents else []
        # The function for aggregating the results of the agents' parallel execution.
        self.aggregate_func = aggregate_func

    async def async_policy(self, observation: Observation, info: Dict[str, Any] = {}, **kwargs) -> List[ActionModel]:
        from aworld.core.task import Task
        from aworld.runners.utils import choose_runners, execute_runner

        tasks = []
        if self.agents:
            for agent in self.agents:
                tasks.append(Task(input=observation, agent=agent, context=self.context))

        if not tasks:
            raise RuntimeError("no task needs to run in the parallelizable agent.")

        runners = await choose_runners(tasks)
        res = await execute_runner(runners, RunConfig(reuse_process=False))

        if self.aggregate_func:
            return self.aggregate_func(self, res)

        results = []
        for k, v in res.items():
            results.append(ActionModel(agent_name=self.id(), policy_info=v.answer))
        return results

    @property
    def finished(self) -> bool:
        return all([agent.finished for agent in self.agents])
aworld/agents/serial_llm_agent.py ADDED
@@ -0,0 +1,48 @@
# coding: utf-8
# Copyright (c) 2025 inclusionAI.
from typing import List, Dict, Any, Callable

from aworld.agents.llm_agent import Agent
from aworld.core.common import Observation, ActionModel, Config
from aworld.logs.util import logger


class SerialableAgent(Agent):
    """Support for serial execution of agents based on dependency relationships in the swarm."""

    def __init__(self,
                 conf: Config,
                 resp_parse_func: Callable[..., Any] = None,
                 agents: List[Agent] = None,
                 **kwargs):
        super().__init__(conf=conf, resp_parse_func=resp_parse_func, **kwargs)
        # avoid a mutable default argument; fall back to an empty list
        self.agents = agents if agents else []

    async def async_policy(self, observation: Observation, info: Dict[str, Any] = {}, **kwargs) -> List[ActionModel]:
        from aworld.config import RunConfig
        from aworld.core.task import Task
        from aworld.runners.utils import choose_runners, execute_runner

        action = ActionModel(agent_name=self.id(), policy_info=observation.content)
        for agent in self.agents:
            task = Task(input=observation, agent=agent, context=self.context)
            runners = await choose_runners([task])
            res = await execute_runner(runners, RunConfig(reuse_process=False))
            if res:
                v = res.get(task.id)
                action = ActionModel(agent_name=self.id(), policy_info=v.answer)
                observation = self._action_to_observation(action, agent.name())
            else:
                raise Exception(f"{agent.id()} execute fail.")
        return [action]

    def _action_to_observation(self, policy: ActionModel, agent_name: str):
        if not policy:
            logger.warning("no agent policy, will use default error info.")
            return Observation(content=f"{agent_name} no policy")

        logger.debug(f"{policy.policy_info}")
        return Observation(content=policy.policy_info)

    @property
    def finished(self) -> bool:
        return all([agent.finished for agent in self.agents])