Description
pipecat version
0.0.103
Python version
3.13
Operating System
Docker
Issue description
When using LLMAssistantAggregatorParams with enable_context_summarization = True, an issue occurs when the context window reaches its limit and the older messages are pruned/summarized.
If the cutoff point for the kept window lands exactly between an assistant message that contains tool_calls and the subsequent tool message containing the function's result, the message history array is split improperly. This leaves an orphaned tool message in the history without its preceding assistant request.
When this broken message history is passed to the next LLM generation call, OpenAI rejects the payload because the schema is invalid, crashing the pipeline.
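For illustration, a minimal OpenAI-style message list after a bad cut might look like the following (contents and IDs are hypothetical, not taken from pipecat):

```python
# Hypothetical message history after a bad truncation boundary.
# The assistant message carrying tool_calls was pruned, but the tool
# result answering it was kept, leaving an orphaned "tool" message.
broken_history = [
    {"role": "system", "content": "Summary of earlier conversation..."},
    {"role": "user", "content": "What's the weather in Paris?"},
    # The assistant message with tool_calls (id "call_abc") was cut here.
    {
        "role": "tool",
        "tool_call_id": "call_abc",  # no preceding assistant tool_calls
        "content": '{"temp_c": 18}',
    },
]

# OpenAI rejects such a payload: every "tool" message must follow an
# assistant message whose tool_calls list includes its tool_call_id.
orphaned = [
    m for i, m in enumerate(broken_history)
    if m["role"] == "tool"
    and not (i > 0 and broken_history[i - 1].get("tool_calls"))
]
print(len(orphaned))  # → 1
```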
Proposed Solution: The truncation/summarization logic needs to be aware of tool-call sequences. When cutting the message history, it must never split an assistant message containing tool_calls from its corresponding tool messages: either keep the entire tool-call sequence in the window or truncate the whole sequence.
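A minimal sketch of a boundary-aware truncation helper along these lines (the function name and shape are hypothetical, not the actual pipecat API):

```python
def safe_truncate(messages, keep_from):
    """Adjust a truncation index so it never splits an assistant
    message carrying tool_calls from its following tool results.

    messages: OpenAI-style message dicts.
    keep_from: proposed index of the first message to keep.
    Returns an index <= keep_from that starts on a valid boundary.
    """
    i = keep_from
    # Walk backwards while the first kept message would be an orphaned
    # tool result; the loop stops on the assistant message whose
    # tool_calls these results answer, pulling it back into the window.
    while i > 0 and messages[i]["role"] == "tool":
        i -= 1
    return i


# Hypothetical history: cutting at index 2 would orphan the tool result.
history = [
    {"role": "user", "content": "hi"},
    {"role": "assistant", "tool_calls": [{"id": "call_1"}]},
    {"role": "tool", "tool_call_id": "call_1", "content": "ok"},
    {"role": "assistant", "content": "done"},
]
print(safe_truncate(history, 2))  # → 1 (assistant tool_calls kept)
print(safe_truncate(history, 3))  # → 3 (already a valid boundary)
```

Walking the boundary backwards keeps the whole sequence; an equally valid alternative is to advance it past the last tool result, dropping the whole sequence instead.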
Reproduction steps
- Initialize a Pipecat pipeline using OpenAI as the LLM.
- Configure the aggregator with LLMAssistantAggregatorParams(enable_context_summarization=True).
- Provide the LLM with tools/functions and trigger a conversation where a tool is called.
- Continue the conversation until the context window limit is reached, triggering the summarization and message pruning.
- Ensure the pruning boundary falls immediately after an assistant tool call but before the tool response.
- Trigger the next LLM generation. The application will crash.
Expected behavior
When context summarization reduces the message array, the slicing logic should check the roles and contents of the messages at the boundary. It should preserve the structural integrity of tool calls, ensuring that a tool-role message is always preceded by the assistant message containing the matching tool_calls.
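This invariant could be checked with a small validator like the following sketch (not pipecat code; it also tolerates consecutive tool messages from parallel tool calls):

```python
def history_is_valid(messages):
    """Return True if every 'tool' message directly follows either an
    assistant message with tool_calls or another tool result from the
    same parallel tool-call batch."""
    for i, m in enumerate(messages):
        if m["role"] != "tool":
            continue
        if i == 0:
            return False  # tool result with nothing before it
        prev = messages[i - 1]
        if prev["role"] == "tool":
            continue  # sibling result in the same tool-call batch
        if prev.get("tool_calls"):
            continue  # answers the preceding assistant tool_calls
        return False
    return True


# Hypothetical examples: intact vs. orphaned tool-call sequence.
ok = [
    {"role": "assistant", "tool_calls": [{"id": "call_1"}]},
    {"role": "tool", "tool_call_id": "call_1", "content": "ok"},
]
bad = [{"role": "tool", "tool_call_id": "call_1", "content": "ok"}]
print(history_is_valid(ok), history_is_valid(bad))  # → True False
```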
Actual behavior
The message history is truncated blindly based on window size. This strips away the assistant message that initiated the tool call, but leaves the tool response message in the active array. The subsequent API call to OpenAI fails.
Logs
openai.BadRequestError: Error code: 400 - {'error': {'message': "Invalid parameter: messages with role 'tool' must be a response to a preceeding message with 'tool_calls'.", 'type': 'invalid_request_error', 'param': 'messages.[2].role', 'code': None}}
