docs/source/en/conceptual_guides/intro_agents.mdx
Lines changed: 2 additions & 2 deletions
@@ -76,8 +76,8 @@ See? With these two examples, we already found the need for a few items to help
 
 - Of course, an LLM that acts as the engine powering the system
 - A list of tools that the agent can access
-- A parser that extracts tool calls from the LLM output
-- A system prompt synced with the parser
+- A system prompt guiding the LLM on the agent logic: ReAct loop of Reflection -> Action -> Observation, available tools, tool calling format to use...
+- A parser that extracts tool calls from the LLM output, in the format indicated by system prompt above.
 - A memory
 
 But wait, since we give room to LLMs in decisions, surely they will make mistakes: so we need error logging and retry mechanisms.
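
To make the coupling between system prompt and parser concrete, here is a minimal sketch of a parser for the `<code>...</code>` format that the prompt templates in this change adopt. `parse_code_blocks` and `CODE_BLOCK` are hypothetical names chosen for illustration; this is not the smolagents implementation, only one way such a parser could look.

```python
import re

# Hypothetical helper, not the smolagents implementation: it only illustrates
# how a parser can stay in sync with the tag format stated in the system prompt.
CODE_BLOCK = re.compile(r"<code>(.*?)</code>", re.DOTALL)


def parse_code_blocks(llm_output: str) -> str:
    """Return the contents of every <code>...</code> block, joined by blank lines."""
    blocks = [block.strip() for block in CODE_BLOCK.findall(llm_output)]
    if not blocks:
        # A clear error gives the surrounding agent loop something to log and retry on.
        raise ValueError("No <code>...</code> block found in the model output.")
    return "\n\n".join(blocks)


sample_output = (
    "Thought: I will compute the sum.\n"
    "<code>\n"
    "result = 5 + 3 + 1294.678\n"
    "final_answer(result)\n"
    "</code>"
)
print(parse_code_blocks(sample_output))
```

Raising an exception when no block is found, rather than returning an empty string, is a deliberate choice in this sketch: it is the kind of failure that the error logging and retry mechanisms mentioned above would need to catch.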
docs/source/en/tutorials/building_good_agents.mdx
Lines changed: 21 additions & 21 deletions
@@ -158,25 +158,24 @@ Thus it cannot access the image again except by leveraging the path that was log
 
 The first step to debugging your agent is thus "Use a more powerful LLM". Alternatives like `Qwen2/5-72B-Instruct` wouldn't have made that mistake.
 
-### 2. Provide more guidance / more information
+### 2. Provide more information or specific instructions
 
 You can also use less powerful models, provided you guide them more effectively.
 
 Put yourself in the shoes of your model: if you were the model solving the task, would you struggle with the information available to you (from the system prompt + task formulation + tool description) ?
 
-Would you need some added clarifications?
+Would you need detailed instructions?
 
-To provide extra information, we do not recommend to change the system prompt right away: the default system prompt has many adjustments that you do not want to mess up except if you understand the prompt very well.
-Better ways to guide your LLM engine are:
-- If it's about the task to solve: add all these details to the task. The task could be 100s of pages long.
-- If it's about how to use tools: the description attribute of your tools.
+- If the instruction is to always be given to the agent (as we generally understand a system prompt to work): you can pass it as a string under argument `instructions` upon agent initialization.
+- If it's about a specific task to solve: add all these details to the task. The task could be very long, like dozens of pages.
+- If it's about how to use specific tools: include it in the `description` attribute of these tools.
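
As a rough illustration of the first option, the sketch below shows how an always-on instruction string could be folded into a system prompt only when one is supplied, mirroring the `{%- if custom_instructions %}` block that this change adds to the prompt templates. `build_system_prompt` and `BASE_RULES` are hypothetical stand-ins written in plain Python. The actual agent takes the string through its `instructions` argument, and how it renders the string internally is not shown here.

```python
from typing import Optional

# Hypothetical stand-in for the fixed part of a system prompt.
BASE_RULES = "Here are the rules you should always follow to solve your task:"


def build_system_prompt(custom_instructions: Optional[str] = None) -> str:
    """Insert the always-on instructions only when some were provided."""
    parts = [BASE_RULES]
    if custom_instructions:
        # Mirrors the conditional block in the template: nothing is added
        # when the caller passes no instructions (or an empty string).
        parts.append(custom_instructions)
    parts.append("Now Begin!")
    return "\n\n".join(parts)


print(build_system_prompt("Always answer in French."))
```

Keeping the instructions in a separate, optional slot means the default prompt stays untouched for callers who do not need them, which fits the advice that editing the prompt templates themselves is generally not advised.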
 
 
-### 3. Change the system prompt (generally not advised)
+### 3. Change the prompt templates (generally not advised)
 
-If above clarifications are not sufficient, you can change the system prompt.
+If above clarifications are not sufficient, you can change the agent's prompt templates.
 
-Let's see how it works. For example, let us check the default system prompt for the [`CodeAgent`] (below version is shortened by skipping zero-shot examples).
+Let's see how it works. For example, let us check the default prompt templates for the [`CodeAgent`] (below version is shortened by skipping zero-shot examples).
 
 ```python
 print(agent.prompt_templates["system_prompt"])
@@ -198,29 +197,26 @@ Here are a few examples using notional tools:
 Task: "Generate an image of the oldest person in this document."
 
 Thought: I will proceed step by step and use the following tools: `document_qa` to find the oldest person in the document, then `image_generator` to generate an image according to the answer.
-Code:
-```py
+<code>
 answer = document_qa(document=document, question="Who is the oldest person mentioned?")
 print(answer)
-```<end_code>
+</code>
 Observation: "The oldest person in the document is John Doe, a 55 year old lumberjack living in Newfoundland."
 
 Thought: I will now generate an image showcasing the oldest person.
-Code:
-```py
+<code>
 image = image_generator("A portrait of John Doe, a 55-year-old man living in Canada.")
 final_answer(image)
-```<end_code>
+</code>
 
 ---
 Task: "What is the result of the following operation: 5 + 3 + 1294.678?"
 
 Thought: I will use python code to compute the result of the operation and then return the final answer using the `final_answer` tool
-Code:
-```py
+<code>
 result = 5 + 3 + 1294.678
 final_answer(result)
-```<end_code>
+</code>
 
 ---
 Task:
@@ -229,13 +225,12 @@ You have been provided with these additional arguments, that you can access usin
 {'question': 'Quel est l'animal sur l'image?', 'image': 'path/to/image.jpg'}"
 
 Thought: I will use the following tools: `translator` to translate the question into English and then `image_qa` to answer the question on the input image.
src/smolagents/prompts/code_agent.yaml
Lines changed: 33 additions & 41 deletions
@@ -1,10 +1,10 @@
 system_prompt: |-
 You are an expert assistant who can solve any task using code blobs. You will be given a task to solve as best you can.
 To do so, you have been given access to a list of tools: these tools are basically Python functions which you can call with code.
-To solve the task, you must plan forward to proceed in a series of steps, in a cycle of 'Thought:', 'Code:', and 'Observation:' sequences.
+To solve the task, you must plan forward to proceed in a series of steps, in a cycle of 'Thought:', '<code>', and 'Observation:' sequences.
 
 At each step, in the 'Thought:' sequence, you should first explain your reasoning towards solving the task and the tools that you want to use.
-Then in the 'Code:' sequence, you should write the code in simple Python. The code sequence must end with '<end_code>' sequence.
+Then in the '<code>' sequence, you should write the code in simple Python. The code sequence must end with '</code>' sequence.
 During each intermediate step, you can use 'print()' to save whatever important information you will then need.
 These print outputs will then appear in the 'Observation:' field, which will be available as input for the next step.
 In the end you have to return a final answer using the `final_answer` tool.
@@ -14,29 +14,26 @@ system_prompt: |-
 Task: "Generate an image of the oldest person in this document."
 
 Thought: I will proceed step by step and use the following tools: `document_qa` to find the oldest person in the document, then `image_generator` to generate an image according to the answer.
-Code:
-```py
+<code>
 answer = document_qa(document=document, question="Who is the oldest person mentioned?")
 print(answer)
-```<end_code>
+</code>
 Observation: "The oldest person in the document is John Doe, a 55 year old lumberjack living in Newfoundland."
 
 Thought: I will now generate an image showcasing the oldest person.
-Code:
-```py
+<code>
 image = image_generator("A portrait of John Doe, a 55-year-old man living in Canada.")
 final_answer(image)
-```<end_code>
+</code>
 
 ---
 Task: "What is the result of the following operation: 5 + 3 + 1294.678?"
 
 Thought: I will use python code to compute the result of the operation and then return the final answer using the `final_answer` tool
-Code:
-```py
+<code>
 result = 5 + 3 + 1294.678
 final_answer(result)
-```<end_code>
+</code>
 
 ---
 Task:
@@ -45,34 +42,31 @@ system_prompt: |-
 {'question': 'Quel est l'animal sur l'image?', 'image': 'path/to/image.jpg'}"
 
 Thought: I will use the following tools: `translator` to translate the question into English and then `image_qa` to answer the question on the input image.
 Thought: I will read the first 2 pages to know more.
-Code:
-```py
+<code>
 for url in ["https://ahf.nuclearmuseum.org/voices/oral-histories/stanislaus-ulams-interview-1979/", "https://ahf.nuclearmuseum.org/manhattan-project/ulam-manhattan-project/"]:
     whole_page = visit_webpage(url)
     print(whole_page)
     print("\n" + "="*80 + "\n") # Print separator between pages
-```<end_code>
+</code>
 Observation:
 Manhattan Project Locations:
 Los Alamos, NM
 Stanislaus Ulam was a Polish-American mathematician. He worked on the Manhattan Project at Los Alamos and later helped design the hydrogen bomb. In this interview, he discusses his work at
 (truncated)
 
 Thought: I now have the final answer: from the webpages visited, Stanislaus Ulam says of Einstein: "He learned too much mathematics and sort of diminished, it seems to me personally, it seems to me his purely physics creativity." Let's answer in one word.
-Code:
-```py
+<code>
 final_answer("diminished")
-```<end_code>
+</code>
 
 ---
 Task: "Which city has the highest population: Guangzhou or Shanghai?"
 
 Thought: I need to get the populations for both cities and compare them: I will use the tool `web_search` to get the population of both cities.
 Population Guangzhou: ['Guangzhou has a population of 15 million inhabitants as of 2021.']
 Population Shanghai: '26 million (2019)'
 
 Thought: Now I know that Shanghai has the highest population.
-Code:
-```py
+<code>
 final_answer("Shanghai")
-```<end_code>
+</code>
 
 ---
 Task: "What is the current age of the pope, raised to the power 0.36?"
 
 Thought: I will use the tool `wikipedia_search` to get the age of the pope, and confirm that with a web search.
-Code:
-```py
+<code>
 pope_age_wiki = wikipedia_search(query="current pope age")
 print("Pope age as per wikipedia:", pope_age_wiki)
 pope_age_search = web_search(query="current pope age")
 print("Pope age as per google search:", pope_age_search)
-```<end_code>
+</code>
 Observation:
 Pope age: "The pope Francis is currently 88 years old."
 
 Thought: I know that the pope is 88 years old. Let's compute the result using python code.
-Code:
-```py
+<code>
 pope_current_age = 88 ** 0.36
 final_answer(pope_current_age)
-```<end_code>
+</code>
 
 Above example were using notional tools that might not exist for you. On top of performing computations in the Python code snippets that you create, you only have access to these tools, behaving like regular python functions:
 ```python
@@ -169,7 +157,7 @@ system_prompt: |-
 {%- endif %}
 
 Here are the rules you should always follow to solve your task:
-1. Always provide a 'Thought:' sequence, and a 'Code:\n```py' sequence ending with '```<end_code>' sequence, else you will fail.
+1. Always provide a 'Thought:' sequence, and a '<code>' sequence ending with '</code>', else you will fail.
 2. Use only variables that you have defined!
 3. Always use the right arguments for the tools. DO NOT pass the arguments as a dict as in 'answer = wikipedia_search({'query': "What is the place where James Bond lives?"})', but use the arguments directly as in 'answer = wikipedia_search(query="What is the place where James Bond lives?")'.
 4. Take care to not chain too many sequential tool calls in the same code block, especially when the output format is unpredictable. For instance, a call to wikipedia_search has an unpredictable return format, so do not have another tool call that depends on its output in the same block: rather output results with print() to use them in the next block.
@@ -180,6 +168,10 @@ system_prompt: |-
 9. The state persists between code executions: so if in one step you've created variables or imported modules, these will all persist.
 10. Don't give up! You're in charge of solving the task, not providing directions to solve it.
 
+{%- if custom_instructions %}
+{{custom_instructions}}
+{%- endif %}
+
 Now Begin!
 planning:
 initial_plan : |-
@@ -205,7 +197,7 @@ planning:
 Then for the given task, develop a step-by-step high-level plan taking into account the above inputs and list of facts.
 This plan should involve individual tasks based on the available tools, that if executed correctly will yield the correct answer.
 Do not skip steps, do not add any superfluous steps. Only write the high-level plan, DO NOT DETAIL INDIVIDUAL TOOL CALLS.
-After writing the final step of the plan, write the '\n<end_plan>' tag and stop there.
+After writing the final step of the plan, write the '<end_plan>' tag and stop there.
 
 You can leverage these tools, behaving like regular python functions:
 ```python
@@ -268,7 +260,7 @@ planning:
 This plan should involve individual tasks based on the available tools, that if executed correctly will yield the correct answer.
 Beware that you have {remaining_steps} steps remaining.
 Do not skip steps, do not add any superfluous steps. Only write the high-level plan, DO NOT DETAIL INDIVIDUAL TOOL CALLS.
-After writing the final step of the plan, write the '\n<end_plan>' tag and stop there.
+After writing the final step of the plan, write the '<end_plan>' tag and stop there.
 
 You can leverage these tools, behaving like regular python functions:
src/smolagents/prompts/structured_code_agent.yaml
Lines changed: 6 additions & 2 deletions
@@ -102,6 +102,10 @@ system_prompt: |-
 ```
 {%- endif %}
 
+{%- if custom_instructions %}
+{{custom_instructions}}
+{%- endif %}
+
 Here are the rules you should always follow to solve your task:
 1. Use only variables that you have defined!
 2. Always use the right arguments for the tools. DO NOT pass the arguments as a dict as in 'answer = wikipedia_search({'query': "What is the place where James Bond lives?"})', but use the arguments directly as in 'answer = wikipedia_search(query="What is the place where James Bond lives?")'.
@@ -138,7 +142,7 @@ planning:
 Then for the given task, develop a step-by-step high-level plan taking into account the above inputs and list of facts.
 This plan should involve individual tasks based on the available tools, that if executed correctly will yield the correct answer.
 Do not skip steps, do not add any superfluous steps. Only write the high-level plan, DO NOT DETAIL INDIVIDUAL TOOL CALLS.
-After writing the final step of the plan, write the '\n<end_plan>' tag and stop there.
+After writing the final step of the plan, write the '<end_plan>' tag and stop there.
 
 You can leverage these tools, behaving like regular python functions:
 ```python
@@ -201,7 +205,7 @@ planning:
 This plan should involve individual tasks based on the available tools, that if executed correctly will yield the correct answer.
 Beware that you have {remaining_steps} steps remaining.
 Do not skip steps, do not add any superfluous steps. Only write the high-level plan, DO NOT DETAIL INDIVIDUAL TOOL CALLS.
-After writing the final step of the plan, write the '\n<end_plan>' tag and stop there.
+After writing the final step of the plan, write the '<end_plan>' tag and stop there.
 
 You can leverage these tools, behaving like regular python functions:
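
The planning prompts in this change ask the model to write the `<end_plan>` tag and stop. One way a caller might consume that tag is to keep only the text that precedes it, in case the model keeps writing anyway. `extract_plan` and `END_PLAN_TAG` are hypothetical names; this is a sketch under that assumption and does not describe how smolagents itself handles the tag.

```python
# Hypothetical constant matching the tag named in the planning prompts.
END_PLAN_TAG = "<end_plan>"


def extract_plan(model_output: str) -> str:
    """Keep only the text before the first <end_plan> tag, if the tag is present."""
    plan, _, _ = model_output.partition(END_PLAN_TAG)
    return plan.strip()


raw_output = (
    "1. Search the web for both populations.\n"
    "2. Compare the two numbers.\n"
    "<end_plan>\n"
    "Extra text the model should not have written."
)
print(extract_plan(raw_output))
```

Because `str.partition` returns the whole string when the separator is missing, the same function also works on outputs where the model never emitted the tag.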