Description
pipecat version
0.0.103
Python version
3.13
Operating System
Docker
Issue description
When using LLMAssistantAggregatorParams with enable_context_summarization = True, an issue occurs when the context window reaches its limit and the older messages are pruned/summarized.
If the cutoff point for the kept window lands exactly between an assistant message that contains tool_calls and the subsequent tool message containing the function's result, the message history array is split improperly. This leaves an orphaned tool message in the history without its preceding assistant request.
When this broken message history is passed to the next LLM generation call, OpenAI rejects the payload because the schema is invalid, crashing the pipeline.
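For illustration, a minimal OpenAI-style message list after a bad cut might look like the following (contents and IDs are hypothetical, not taken from pipecat):

```python
# Hypothetical message history after a bad truncation boundary.
# The assistant message carrying tool_calls was pruned, but the tool
# result answering it was kept, leaving an orphaned "tool" message.
broken_history = [
    {"role": "system", "content": "Summary of earlier conversation..."},
    {"role": "user", "content": "What's the weather in Paris?"},
    # The assistant message with tool_calls (id "call_abc") was cut here.
    {
        "role": "tool",
        "tool_call_id": "call_abc",  # no preceding assistant tool_calls
        "content": '{"temp_c": 18}',
    },
]

# OpenAI rejects such a payload: every "tool" message must follow an
# assistant message whose tool_calls list includes its tool_call_id.
orphaned = [
    m for i, m in enumerate(broken_history)
    if m["role"] == "tool"
    and not (i > 0 and broken_history[i - 1].get("tool_calls"))
]
print(len(orphaned))  # → 1
```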
Proposed Solution: The truncation/summarization logic needs to be aware of tool-call sequences. When cutting the message history, it must never split an assistant message containing tool_calls from its corresponding tool messages: either keep the entire tool-call sequence in the window or truncate the whole sequence.
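A minimal sketch of a boundary-aware truncation helper along these lines (the function name and shape are hypothetical, not the actual pipecat API):

```python
def safe_truncate(messages, keep_from):
    """Adjust a truncation index so it never splits an assistant
    message carrying tool_calls from its following tool results.

    messages: OpenAI-style message dicts.
    keep_from: proposed index of the first message to keep.
    Returns an index <= keep_from that starts on a valid boundary.
    """
    i = keep_from
    # Walk backwards while the first kept message would be an orphaned
    # tool result; the loop stops on the assistant message whose
    # tool_calls these results answer, pulling it back into the window.
    while i > 0 and messages[i]["role"] == "tool":
        i -= 1
    return i


# Hypothetical history: cutting at index 2 would orphan the tool result.
history = [
    {"role": "user", "content": "hi"},
    {"role": "assistant", "tool_calls": [{"id": "call_1"}]},
    {"role": "tool", "tool_call_id": "call_1", "content": "ok"},
    {"role": "assistant", "content": "done"},
]
print(safe_truncate(history, 2))  # → 1 (assistant tool_calls kept)
print(safe_truncate(history, 3))  # → 3 (already a valid boundary)
```

Walking the boundary backwards keeps the whole sequence; an equally valid alternative is to advance it past the last tool result, dropping the whole sequence instead.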
Reproduction steps
- Initialize a Pipecat pipeline using OpenAI as the LLM.
- Configure the aggregator with LLMAssistantAggregatorParams(enable_context_summarization=True).
- Provide the LLM with tools/functions and trigger a conversation where a tool is called.
- Continue the conversation until the context window limit is reached, triggering the summarization and message pruning.
- Ensure the pruning boundary falls immediately after an assistant tool call but before the tool response.
- Trigger the next LLM generation. The application will crash.
Expected behavior
When context summarization reduces the message array, the slicing logic should check the roles and contents of the messages at the boundary. It should preserve the structural integrity of tool calls, ensuring that a tool-role message is always preceded by the assistant message containing the matching tool_calls.
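This invariant could be checked with a small validator like the following sketch (not pipecat code; it also tolerates consecutive tool messages from parallel tool calls):

```python
def history_is_valid(messages):
    """Return True if every 'tool' message directly follows either an
    assistant message with tool_calls or another tool result from the
    same parallel tool-call batch."""
    for i, m in enumerate(messages):
        if m["role"] != "tool":
            continue
        if i == 0:
            return False  # tool result with nothing before it
        prev = messages[i - 1]
        if prev["role"] == "tool":
            continue  # sibling result in the same tool-call batch
        if prev.get("tool_calls"):
            continue  # answers the preceding assistant tool_calls
        return False
    return True


# Hypothetical examples: intact vs. orphaned tool-call sequence.
ok = [
    {"role": "assistant", "tool_calls": [{"id": "call_1"}]},
    {"role": "tool", "tool_call_id": "call_1", "content": "ok"},
]
bad = [{"role": "tool", "tool_call_id": "call_1", "content": "ok"}]
print(history_is_valid(ok), history_is_valid(bad))  # → True False
```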
Actual behavior
The message history is truncated blindly based on window size. This strips away the assistant message that initiated the tool call, but leaves the tool response message in the active array. The subsequent API call to OpenAI fails.
Logs
openai.BadRequestError: Error code: 400 - {'error': {'message': "Invalid parameter: messages with role 'tool' must be a response to a preceeding message with 'tool_calls'.", 'type': 'invalid_request_error', 'param': 'messages.[2].role', 'code': None}}
