
Commit 968c567

fix(openai): align reasoning capture with codex official pattern
Three changes to bring fantasy's Responses API streaming into parity with OpenAI's official codex CLI (codex-rs/codex-api/src/sse/responses.rs):

1. **Defer activeReasoning cleanup to end-of-stream.** The previous implementation deleted the entry on `response.output_item.done`, which meant any subsequent event for the same reasoning item (e.g. a late delta or a duplicate done) was silently dropped. The official codex parser keeps items addressable until the full stream completes.

2. **Capture done.Item.Summary on output_item.done.** The streaming summary delta path may already populate state.metadata.Summary via reasoning_summary_text.delta events, but the done event carries the authoritative final list. Prefer it when non-empty so partial-summary streams are corrected to the final shape.

3. **Add a response.reasoning_text.delta handler.** Some gpt-5.x reasoning variants stream reasoning via this event channel (raw reasoning text keyed by ItemID + ContentIndex) instead of, or in addition to, reasoning_summary_text.delta. The official codex parser handles both; fantasy previously handled only the summary path, dropping raw reasoning text for affected models.

Background: empirical lenos session 35dd39ec (codex 5.4 multi-turn audit) showed that turn 1 captured encrypted_content cleanly via the existing output_item.done capture (PR charmbracelet#71's Fix 2), but follow-up turns and gpt-5.5 high sessions (ab022528) showed state.metadata.EncryptedContent stuck at empty despite the API streaming reasoning text. Investigation against the official codex CLI source plus multiple reverse-engineered backend proxies (MetaFARS/codex-relay, hermes-agent issue #5732, satoriweb's protocol notes) confirmed:

- response.completed.output is unreliable on the Codex backend (it can be empty even when output_item.done events delivered the data).
- The reasoning_text.delta event is a separate channel from reasoning_summary_text.delta; both must be handled to capture all thinking text emitted by gpt-5.x reasoning variants.

This commit reverts the Fix 3 attempt (commit 7ce6466, re-emitting ReasoningEnd from response.completed.output), which was based on the incorrect assumption that completed.output is the source of truth.
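The summary-preference rule in change 2 can be sketched in isolation. This is a minimal sketch with a hypothetical `summaryPart` type and `finalSummary` helper; the real fantasy/openai types differ:

```go
package main

import "fmt"

// summaryPart mirrors the shape of a Responses API summary entry
// (hypothetical field name for illustration).
type summaryPart struct {
	Text string
}

// finalSummary applies the merge rule: the streaming
// reasoning_summary_text.delta path may have accumulated a partial
// summary, but when output_item.done carries a non-empty Summary
// list, that list is authoritative and replaces the partial one.
func finalSummary(streamed []string, done []summaryPart) []string {
	if len(done) == 0 {
		return streamed // keep whatever the delta path built up
	}
	out := make([]string, 0, len(done))
	for _, s := range done {
		out = append(out, s.Text)
	}
	return out
}

func main() {
	fmt.Println(finalSummary([]string{"partial"}, []summaryPart{{Text: "final a"}, {Text: "final b"}})) // prints: [final a final b]
	fmt.Println(finalSummary([]string{"partial"}, nil))                                                 // prints: [partial]
}
```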
1 parent 549fffa commit 968c567

1 file changed

providers/openai/responses_language_model.go

Lines changed: 58 additions & 11 deletions
```diff
@@ -1127,19 +1127,35 @@ func (o responsesLanguageModel) Stream(ctx context.Context, call fantasy.Call) (
 			case "reasoning":
 				state := activeReasoning[done.Item.ID]
 				if state != nil {
-					// The output_item.done event carries the FINAL
-					// encrypted_content blob for the reasoning item.
-					// The earlier output_item.added event for reasoning
-					// items typically does not include it (the item is
-					// still being generated). Capture it here so the
-					// blob is available for replay on subsequent turns
-					// (see also: ContentTypeReasoning case in
-					// toResponsesPrompt). Without this, encrypted_content
-					// is silently dropped and reasoning continuity is
-					// lost across requests when store=false.
+					// The output_item.done event for reasoning items is
+					// the SOURCE OF TRUTH for the final reasoning state
+					// (matches OpenAI's official codex CLI pattern: see
+					// codex-rs/codex-api/src/sse/responses.rs which
+					// extracts the full ResponseItem here and ignores
+					// response.completed.output, which is known to be
+					// empty for some Codex backend responses — refs
+					// hermes-agent issue #5732).
+					//
+					// Capture every available field so encrypted_content,
+					// summary, and any other surface the API populates is
+					// preserved for replay on subsequent turns (see
+					// ContentTypeReasoning case in toResponsesPrompt for
+					// the replay path).
 					if done.Item.EncryptedContent != "" {
 						state.metadata.EncryptedContent = &done.Item.EncryptedContent
 					}
+					// Pull final Summary from done.Item — the streaming
+					// reasoning_summary_text.delta path may have populated
+					// state.metadata.Summary already, but the done event
+					// carries the authoritative final list. Prefer it
+					// when non-empty.
+					if len(done.Item.Summary) > 0 {
+						finalSummary := make([]string, 0, len(done.Item.Summary))
+						for _, s := range done.Item.Summary {
+							finalSummary = append(finalSummary, s.Text)
+						}
+						state.metadata.Summary = finalSummary
+					}
 					if !yield(fantasy.StreamPart{
 						Type: fantasy.StreamPartTypeReasoningEnd,
 						ID:   done.Item.ID,
@@ -1149,7 +1165,11 @@ func (o responsesLanguageModel) Stream(ctx context.Context, call fantasy.Call) (
 					}) {
 						return
 					}
-					delete(activeReasoning, done.Item.ID)
+					// Don't delete activeReasoning here — keep it through
+					// response.completed so any stragglers (e.g. items
+					// for which output_item.done never fires) can still be
+					// inspected/finalised. The map is cleared at the end
+					// of the stream when the function returns.
 				}
 			}
 
@@ -1251,6 +1271,33 @@ func (o responsesLanguageModel) Stream(ctx context.Context, call fantasy.Call) (
 				}
 			}
 
+			case "response.reasoning_text.delta":
+				// Some Codex backend models (notably gpt-5.x reasoning
+				// variants under certain conditions) stream reasoning via
+				// reasoning_text.delta events instead of (or in addition to)
+				// reasoning_summary_text.delta. Per the official codex CLI
+				// protocol parser (codex-rs/codex-api/src/sse/responses.rs
+				// case "response.reasoning_text.delta"), this event carries
+				// raw reasoning content keyed by ItemID + ContentIndex.
+				//
+				// We surface the delta as a ReasoningDelta to keep consumers
+				// (lenos and similar bash-protocol agents) seeing thinking
+				// text regardless of which event channel the model uses.
+				rawDelta := event.AsResponseReasoningTextDelta()
+				state := activeReasoning[rawDelta.ItemID]
+				if state != nil {
+					if !yield(fantasy.StreamPart{
+						Type:  fantasy.StreamPartTypeReasoningDelta,
+						ID:    rawDelta.ItemID,
+						Delta: rawDelta.Delta,
+						ProviderMetadata: fantasy.ProviderMetadata{
+							Name: state.metadata,
+						},
+					}) {
+						return
+					}
+				}
+
 			case "response.completed":
 				completed := event.AsResponseCompleted()
 				responseID = completed.Response.ID
```
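The end-of-stream retention rule (change 1) and the dual-channel delta handling (change 3) can be sketched together. This is a minimal sketch with a hypothetical `sseEvent` type and `collectReasoning` helper; the real fantasy/openai event types and dispatch loop differ:

```go
package main

import "fmt"

// sseEvent is a hypothetical, simplified view of a Responses API
// stream event; only the fields this sketch needs.
type sseEvent struct {
	Type   string
	ItemID string
	Delta  string
}

// collectReasoning routes both reasoning delta channels into one
// per-item buffer, mirroring the commit's approach: entries stay in
// the map until the stream ends, so events arriving after
// output_item.done still find their item instead of being dropped.
func collectReasoning(events []sseEvent) map[string]string {
	active := map[string]string{}
	for _, ev := range events {
		switch ev.Type {
		case "response.output_item.added":
			active[ev.ItemID] = "" // item becomes addressable
		case "response.reasoning_text.delta",
			"response.reasoning_summary_text.delta":
			// Both channels accumulate into the same item.
			if _, ok := active[ev.ItemID]; ok {
				active[ev.ItemID] += ev.Delta
			}
		case "response.output_item.done":
			// Deliberately no delete: the item stays addressable
			// until the full stream completes.
		}
	}
	return active
}

func main() {
	out := collectReasoning([]sseEvent{
		{Type: "response.output_item.added", ItemID: "r1"},
		{Type: "response.reasoning_summary_text.delta", ItemID: "r1", Delta: "think "},
		{Type: "response.output_item.done", ItemID: "r1"},
		// A late delta after done would previously have been dropped:
		{Type: "response.reasoning_text.delta", ItemID: "r1", Delta: "more"},
	})
	fmt.Println(out["r1"]) // prints: think more
}
```

With the old delete-on-done behavior, the final delta above would have missed its map entry and been lost, which is exactly the failure mode seen in the multi-turn sessions described in the commit message.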