Skip to content

fix: preserve conversation context when provider fallback activates#13235

Open
rfpassos wants to merge 1 commit intoNousResearch:mainfrom
rfpassos:fix/fallback-preserve-context
Open

fix: preserve conversation context when provider fallback activates#13235
rfpassos wants to merge 1 commit intoNousResearch:mainfrom
rfpassos:fix/fallback-preserve-context

Conversation

@rfpassos
Copy link
Copy Markdown

Summary

  • rebuild api_messages on every retry attempt instead of reusing a payload prepared for the previous provider
  • prevent provider-specific transforms (for example Anthropic/OpenRouter prompt caching wrappers) from leaking into fallback requests
  • add regression coverage for fallback from codex/chat paths and prompt-cached Claude requests
  • make fallback tests self-contained so they do not depend on local provider config

Root cause

When the primary model failed, Hermes switched provider / model / api_mode in-place, but continued using an api_messages payload that had already been prepared for the previous backend.

That meant fallback requests could inherit backend-specific payload mutations such as:

  • Anthropic/OpenRouter cache_control wrappers
  • provider-specific tool-call sanitization state
  • message formatting intended for the old API mode

In practice this could look like context loss or broken continuity after fallback, because the new backend was not receiving a fresh request built from the canonical conversation state.

Fix

  • extracted per-attempt payload construction into AIAgent._build_api_messages_for_attempt(...)
  • moved API message reconstruction inside the retry loop
  • ensured each retry/fallback attempt recalculates request size from the rebuilt payload
  • preserved newer upstream behavior while resolving the cherry-pick conflict (/steer pre-API drain, cache layout selection, tool-argument repair)

Tests

Added regression coverage in:

  • tests/run_agent/test_fallback_context_preservation.py

Validated with:

  • pytest tests/run_agent/test_fallback_context_preservation.py -q
  • pytest tests/run_agent/test_fallback_model.py -q
  • pytest tests/run_agent/test_run_agent_codex_responses.py -q

Why this matters

Fallback should always start from the canonical conversation history, not from a payload already transformed for a different provider. This keeps tool state, prompt context, and conversation continuity intact when Hermes has to fail over mid-turn.

- rebuild api_messages inside each retry attempt
- prevent provider-specific payload transforms from leaking into fallback
- add regression tests for codex/chat and prompt-cache fallback cases
- make fallback model tests self-contained
@alt-glitch alt-glitch added type/bug Something isn't working P1 High — major feature broken, no workaround comp/agent Core agent loop, run_agent.py, prompt builder labels Apr 22, 2026
@alt-glitch
Copy link
Copy Markdown
Collaborator

Likely duplicate of PR #13654 — same root cause: fallback reuses provider-specific payload instead of rebuilding from canonical conversation state.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder P1 High — major feature broken, no workaround type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants