fix: Gemini turn ordering and SSE streaming parser #723
boxcee wants to merge 2 commits into RightNow-AI:main
Conversation
Three bugs fixed in the Gemini driver:

1. Function call turn ordering: Gemini requires model turns with `functionCall` to be immediately followed by user turns with `functionResponse`. The agent loop could insert text-only turns between them (e.g. "[no response]", "Please continue"), causing INVALID_ARGUMENT 400 errors.
2. First turn must be user: Gemini rejects conversations starting with a model turn. After session trimming or compaction, the first message could be a model turn with `functionCall` parts. Now prepends a synthetic user turn when needed.
3. SSE streaming parser: The parser used `\n\n` as the SSE event delimiter, but Gemini returns `\r\n\r\n` (the HTTP standard line ending). Since `\r\n\r\n` does not contain the substring `\n\n`, no events were ever parsed, causing 0-token responses and crash loops. Fixed by normalizing `\r\n` to `\n` before delimiter matching.

Also adds debug logging for turn structure, request/response bodies, and SSE stream diagnostics.
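The delimiter fix in point 3 can be sketched roughly as follows. This is an illustrative stand-in, not the PR's actual code: the function name `parse_sse_events` and its shape are hypothetical.

```rust
// Illustrative sketch of the SSE fix: normalize CRLF before splitting
// on the blank-line event delimiter. Identifiers are hypothetical.
fn parse_sse_events(buffer: &mut String, chunk: &str) -> Vec<String> {
    // Normalize per chunk so "\r\n\r\n" becomes "\n\n" and the
    // delimiter search below can actually find event boundaries.
    buffer.push_str(&chunk.replace("\r\n", "\n"));

    let mut events = Vec::new();
    // Each complete SSE event ends with a blank line ("\n\n").
    while let Some(pos) = buffer.find("\n\n") {
        let event: String = buffer.drain(..pos + 2).collect();
        // SSE data lines start with "data: "; collect their payloads.
        for line in event.trim().lines() {
            if let Some(data) = line.strip_prefix("data: ") {
                events.push(data.to_string());
            }
        }
    }
    events
}
```

Note that a partial event left at the end of a chunk stays in `buffer` and is completed by the next chunk, which is why normalization must happen before the data enters the buffer.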
jaberjaber23 left a comment:
The bugs being fixed are real and important; the SSE `\r\n` issue and the turn ordering constraint are genuine Gemini API problems.

Blocking issues:

- `truncate_for_log` will panic on multi-byte UTF-8 input. `&s[..max_len]` indexes by byte offset; if that offset falls in the middle of a multi-byte character (Japanese, emoji, non-ASCII error messages from Google), Rust panics. Fix by walking back to a valid char boundary with `s.is_char_boundary` (or `s.floor_char_boundary(max_len)`, though that API is still unstable).
- `buffer.replace("\r\n", "\n")` runs over the entire accumulated buffer on every chunk, creating O(n*m) behavior. Move normalization to the chunk level: `buffer.push_str(&chunk_str.replace("\r\n", "\n"))` instead of appending and then replacing across the full buffer.
- No tests for the 70-line `enforce_function_call_ordering` function. This is critical LLM infrastructure and needs tests covering: a functionCall with intervening text, an orphaned functionCall, a conversation starting with a model turn, and consecutive same-role merging.
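A boundary-safe truncation along the lines of the first blocking issue could look like this. The name mirrors the function under review, but the body is a sketch, not the repository's actual implementation:

```rust
// Sketch of a UTF-8-safe log truncation helper. Walking back with
// is_char_boundary avoids the panic without any unstable APIs.
fn truncate_for_log(s: &str, max_len: usize) -> &str {
    if s.len() <= max_len {
        return s;
    }
    // Back up from max_len until we land on a char boundary, so the
    // byte slice below can never split a multi-byte character.
    let mut end = max_len;
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}
```

The loop terminates because byte offset 0 is always a char boundary, and it runs at most three iterations since UTF-8 code points are at most four bytes.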
The first issue, the turn-by-turn restriction, feels like it could apply to other LLMs as well. Could it be that we need to change something fundamental in the chat behavior: essentially, queue messages to be sent only once the model is ready to accept them (e.g. after the function call response and reply have completed)? This also applies to the heartbeat, which without such constraints will cause "heart failure" :)
…ization, ordering tests

1. `truncate_for_log`: walk back to a valid char boundary to avoid a panic on multi-byte UTF-8 input (no MSRV change needed).
2. SSE buffer: normalize `\r\n` per chunk before appending, instead of re-scanning the entire accumulated buffer each iteration (O(n*m) → O(n)).
3. Add tests for `enforce_function_call_ordering` covering:
   - functionCall with intervening text removed
   - orphaned functionCall stripped
   - conversation starting with a model turn gets a synthetic user turn
   - consecutive same-role merging
   - valid ordering passthrough
4. Add a test for `truncate_for_log` with multi-byte UTF-8.
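A minimal model of the ordering rules these tests cover might look like the sketch below. The `Turn` enum and `enforce_ordering` body are simplified stand-ins for the driver's real types, kept only to illustrate the two constraints from the PR description (drop intervening/orphaned turns around a functionCall; ensure the first turn is a user turn):

```rust
// Simplified stand-in for the driver's turn type and ordering pass.
#[derive(Clone, Debug, PartialEq)]
enum Turn {
    User(String),
    Model(String),
    ModelFunctionCall(String),    // model turn carrying a functionCall
    UserFunctionResponse(String), // user turn carrying a functionResponse
}

fn enforce_ordering(turns: Vec<Turn>) -> Vec<Turn> {
    let mut out: Vec<Turn> = Vec::new();
    let mut i = 0;
    while i < turns.len() {
        match &turns[i] {
            Turn::ModelFunctionCall(name) => {
                // Look ahead for the matching functionResponse, skipping
                // any text-only turns the agent loop may have inserted.
                if let Some(j) = turns[i + 1..]
                    .iter()
                    .position(|t| matches!(t, Turn::UserFunctionResponse(_)))
                {
                    out.push(Turn::ModelFunctionCall(name.clone()));
                    out.push(turns[i + 1 + j].clone());
                    i += j + 2;
                    continue;
                }
                // Orphaned functionCall with no response: strip it.
                i += 1;
            }
            other => {
                out.push(other.clone());
                i += 1;
            }
        }
    }
    // Gemini rejects conversations starting with a model turn.
    if !matches!(out.first(), Some(Turn::User(_) | Turn::UserFunctionResponse(_))) {
        out.insert(0, Turn::User("(context continues)".into()));
    }
    out
}
```

The real function also merges consecutive same-role turns, which this sketch omits for brevity.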
Force-pushed from 6972586 to 0a6f718.
Checked all drivers; this constraint is Gemini-specific. Other APIs (Anthropic, OpenAI) don't have strict function call ordering requirements. The heartbeat concern is handled.
Testing

- `cargo clippy --workspace --all-targets -- -D warnings` passes
- `cargo test --workspace` passes