Commits (22)
- `958d0dc` Pass parallel_tool_calls directly and document intended usage in inte… (anastasds, Nov 19, 2025)
- `179d5f7` Merge branch 'main' into parallel-tool-calls-impl (anastasds, Nov 19, 2025)
- `0c2b82b` Merge branch 'main' into parallel-tool-calls-impl (franciscojavierarceo, Nov 21, 2025)
- `9cbb624` Merge branch 'main' into parallel-tool-calls-impl (anastasds, Dec 16, 2025)
- `1f6e095` Delete parallel tool calls section from documentation because it has … (anastasds, Nov 21, 2025)
- `a971d8f` Document behavior of parallel_tool_calls parameter (anastasds, Nov 21, 2025)
- `bbd59b7` Run pre-commit hooks (anastasds, Dec 16, 2025)
- `a6552c0` Vendor images for parallel_tool_calls docs (anastasds, Dec 16, 2025)
- `cf75274` --signoff (anastasds, Dec 16, 2025)
- `d531c57` Default parallel_tool_calls to none (anastasds, Dec 16, 2025)
- `5c48e0a` Run latest pre-commit hooks (anastasds, Dec 16, 2025)
- `7fda120` Merge branch 'main' into parallel-tool-calls-impl (anastasds, Dec 16, 2025)
- `d3acf47` Run latest pre-commit hooks (anastasds, Dec 16, 2025)
- `d14d930` Merge branch 'main' into parallel-tool-calls-impl (anastasds, Dec 16, 2025)
- `e2030f7` Default parallel_tool_calls to None everywhere (anastasds, Dec 16, 2025)
- `983fa48` Rerun pre commit hooks (anastasds, Dec 16, 2025)
- `5214ccf` Merge branch 'main' into parallel-tool-calls-impl (anastasds, Dec 16, 2025)
- `271af12` Merge branch 'pdocs' into parallel-tool-calls-impl (anastasds, Dec 16, 2025)
- `2610b4f` Return parallel_tool_calls default to true (anastasds, Dec 17, 2025)
- `8a924a3` Add missing replays for integration tests (anastasds, Dec 17, 2025)
- `d7f8d6b` Merge branch 'main' into parallel-tool-calls-impl (anastasds, Dec 17, 2025)
- `5514378` Merge branch 'main' into parallel-tool-calls-impl (anastasds, Dec 18, 2025)
3 changes: 0 additions & 3 deletions client-sdks/stainless/openapi.yml
@@ -6921,7 +6921,6 @@ components:
anyOf:
- type: boolean
- type: 'null'
default: true
previous_response_id:
anyOf:
- type: string
@@ -7363,7 +7362,6 @@ components:
anyOf:
- type: boolean
- type: 'null'
default: true
previous_response_id:
anyOf:
- type: string
@@ -7531,7 +7529,6 @@ components:
anyOf:
- type: boolean
- type: 'null'
default: true
previous_response_id:
anyOf:
- type: string
(Two files in this diff could not be displayed.)
24 changes: 16 additions & 8 deletions docs/docs/providers/openai_responses_limitations.mdx
@@ -262,14 +262,6 @@ OpenAI provides a [prompt caching](https://platform.openai.com/docs/guides/promp

---

### Parallel Tool Calls

**Status:** Rumored Issue

There are reports that `parallel_tool_calls` may not work correctly. This needs verification and a ticket should be opened if confirmed.

---

## Resolved Issues

The following limitations have been addressed in recent releases:
@@ -297,3 +289,19 @@ The `require_approval` parameter for MCP tools in the Responses API now works co
**Fixed in:** [#3003](https://github.com/llamastack/llama-stack/pull/3003) (Agent API), [#3602](https://github.com/llamastack/llama-stack/pull/3602) (Responses API)

MCP tools now correctly handle array-type arguments in both the Agent API and Responses API.

---

### Parallel tool calls

**Status:** ✅ Resolved

The [`parallel_tool_calls` parameter](https://platform.openai.com/docs/api-reference/responses/create#responses_create-parallel_tool_calls) controls turn-based function calling workflows, _not_ parallelism or concurrency. See the [related function calling documentation](https://platform.openai.com/docs/guides/function-calling#parallel-function-calling).

If `parallel_tool_calls=false`, the model generates at most one function call per turn; the client is responsible for executing each call and returning its result, in the expected format, before the conversation proceeds.

For example, given a request with a `get_weather` function definition and the input "What is the weather in Tokyo and New York?", the model will by default generate two function calls in a single turn - one `get_weather` call for each of `Tokyo` and `New York`. With `parallel_tool_calls=false`, however, only one of these is generated initially; the client must execute that call and append the result to the message history, after which the model generates the second function call.

| parallel_tool_calls=true | parallel_tool_calls=false |
|------|-------|
| <img width="1134" height="1330" alt="Image" src="img/parallel-tool-calls-true.png" /> | <img width="1236" height="1868" alt="Image" src="img/parallel-tool-calls-false.png" /> |
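
To make the turn-based loop concrete, here is a minimal client-side sketch assuming an OpenAI-compatible Python client; the base URL, API key, model id, and the stubbed `get_weather` result are all placeholders:

```python
from openai import OpenAI

# Placeholder base URL, API key, and model id; point these at your own deployment.
client = OpenAI(base_url="http://localhost:8321/v1", api_key="none")

tools = [
    {
        "type": "function",
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }
]

input_items = [{"role": "user", "content": "What is the weather in Tokyo and New York?"}]

while True:
    response = client.responses.create(
        model="llama-3.3-70b",  # placeholder model id
        input=input_items,
        tools=tools,
        parallel_tool_calls=False,  # at most one function call per turn
    )
    calls = [item for item in response.output if item.type == "function_call"]
    if not calls:
        break  # no more function calls; response.output holds the final answer

    # Append the model's output, execute each generated call, and return its
    # result so the model can proceed (e.g. generate the call for the second city).
    input_items += response.output
    for call in calls:
        # With parallel_tool_calls=False there is at most one call here.
        input_items.append(
            {
                "type": "function_call_output",
                "call_id": call.call_id,
                "output": '{"temperature_c": 22}',  # stubbed get_weather result
            }
        )

print(response.output_text)
```

With `parallel_tool_calls=True` (OpenAI's current default), the first `responses.create` call would instead return both `get_weather` calls in one turn, and the loop would execute and append an output item for each of them before continuing.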
3 changes: 0 additions & 3 deletions docs/static/deprecated-llama-stack-spec.yaml
@@ -3747,7 +3747,6 @@ components:
anyOf:
- type: boolean
- type: 'null'
default: true
previous_response_id:
anyOf:
- type: string
@@ -4189,7 +4188,6 @@ components:
anyOf:
- type: boolean
- type: 'null'
default: true
previous_response_id:
anyOf:
- type: string
@@ -4357,7 +4355,6 @@ components:
anyOf:
- type: boolean
- type: 'null'
default: true
previous_response_id:
anyOf:
- type: string
2 changes: 0 additions & 2 deletions docs/static/experimental-llama-stack-spec.yaml
@@ -3466,7 +3466,6 @@ components:
anyOf:
- type: boolean
- type: 'null'
default: true
previous_response_id:
anyOf:
- type: string
@@ -3904,7 +3903,6 @@ components:
anyOf:
- type: boolean
- type: 'null'
default: true
previous_response_id:
anyOf:
- type: string
3 changes: 0 additions & 3 deletions docs/static/llama-stack-spec.yaml
@@ -5563,7 +5563,6 @@ components:
anyOf:
- type: boolean
- type: 'null'
default: true
previous_response_id:
anyOf:
- type: string
@@ -6005,7 +6004,6 @@ components:
anyOf:
- type: boolean
- type: 'null'
default: true
previous_response_id:
anyOf:
- type: string
@@ -6173,7 +6171,6 @@ components:
anyOf:
- type: boolean
- type: 'null'
default: true
previous_response_id:
anyOf:
- type: string
3 changes: 0 additions & 3 deletions docs/static/stainless-llama-stack-spec.yaml
@@ -6921,7 +6921,6 @@ components:
anyOf:
- type: boolean
- type: 'null'
default: true
previous_response_id:
anyOf:
- type: string
@@ -7363,7 +7362,6 @@ components:
anyOf:
- type: boolean
- type: 'null'
default: true
previous_response_id:
anyOf:
- type: string
@@ -7531,7 +7529,6 @@ components:
anyOf:
- type: boolean
- type: 'null'
default: true
previous_response_id:
anyOf:
- type: string
@@ -99,7 +99,7 @@ async def create_openai_response(
model: str,
prompt: OpenAIResponsePrompt | None = None,
instructions: str | None = None,
parallel_tool_calls: bool | None = True,
parallel_tool_calls: bool | None = None,
previous_response_id: str | None = None,
conversation: str | None = None,
store: bool | None = True,
@@ -450,7 +450,7 @@ async def _create_streaming_response(
tool_choice: OpenAIResponseInputToolChoice | None = None,
max_infer_iters: int | None = 10,
guardrail_ids: list[str] | None = None,
parallel_tool_calls: bool | None = True,
parallel_tool_calls: bool | None = None,
max_tool_calls: int | None = None,
metadata: dict[str, str] | None = None,
include: list[ResponseItemInclude] | None = None,
@@ -315,6 +315,7 @@ async def create_response(self) -> AsyncIterator[OpenAIResponseObjectStream]:
model=self.ctx.model,
messages=messages,
# Pydantic models are dict-compatible but mypy treats them as distinct types
parallel_tool_calls=self.parallel_tool_calls,
tools=effective_tools, # type: ignore[arg-type]
tool_choice=chat_tool_choice,
stream=True,
2 changes: 1 addition & 1 deletion src/llama_stack_api/agents.py
@@ -88,7 +88,7 @@ async def create_openai_response(
model: str,
prompt: OpenAIResponsePrompt | None = None,
instructions: str | None = None,
parallel_tool_calls: bool | None = True,
parallel_tool_calls: bool | None = None,
previous_response_id: str | None = None,
conversation: str | None = None,
store: bool | None = True,
2 changes: 1 addition & 1 deletion src/llama_stack_api/openai_responses.py
@@ -709,7 +709,7 @@ class OpenAIResponseObject(BaseModel):
model: str
object: Literal["response"] = "response"
output: Sequence[OpenAIResponseOutput]
parallel_tool_calls: bool | None = True
parallel_tool_calls: bool | None = None
**Collaborator:**
why the api change?

**Contributor Author:**
This parameter is optional, so when it is passed downstream it should not be set unless the user set it in their initial request. If a downstream provider defaults to `False`, that default should not be overridden.

OpenAI currently defaults to `true` internally, but that may change or the parameter may be removed, so it is best not to set it explicitly. (I originally defined this to be `True` by default and should not have.)
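
As an illustrative sketch (a hypothetical helper, not the actual Llama Stack code), the intended pass-through behavior looks like this:

```python
# Hypothetical sketch: forward parallel_tool_calls only when the caller set it,
# so a downstream provider's own default is never overridden.
def build_chat_params(
    model: str,
    messages: list[dict],
    parallel_tool_calls: bool | None = None,
) -> dict:
    params: dict = {"model": model, "messages": messages}
    if parallel_tool_calls is not None:
        # Set only when the user explicitly requested a value.
        params["parallel_tool_calls"] = parallel_tool_calls
    return params
```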

previous_response_id: str | None = None
prompt: OpenAIResponsePrompt | None = None
status: str