
Project Cornerstone: One Platform Backend Migration

Moving agent execution from the browser to the One Platform backend, with events streamed back over Server-Sent Events (SSE).

Architecture

┌─────────────────────────────────────────────────────────────┐
│  CLIENT (visual-editor, browser)                            │
│                                                             │
│   SCA Actions ──→ AgentService.startRun()                   │
│                       │                                     │
│              ┌────────┴────────┐                            │
│              │ AgentRunHandle  │                             │
│              │   .events      │ (AgentEventConsumer)        │
│              │   .abort()     │                             │
│              └────────┬───────┘                             │
│                       │                                     │
│              SSEAgentEventSource                            │
│                       │ fetch + iteratorFromStream           │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┼ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│                       ▼                                     │
│               OPAL_BACKEND_API_PREFIX                       │
│              (appcatalyst.pa.googleapis.com)                │
└─────────────────────────────────────────────────────────────┘
                        │
                ┌───────┴────────────────────────────────────┐
                │  ONE PLATFORM (production backend)          │
                │                                             │
                │  Wraps opal-backend-shared with             │
                │  One Platform API surface                   │
                │                                             │
                │  ← synced from packages/opal-backend-shared
                └─────────────────────────────────────────────┘

  LOCAL DEV:
  ┌─────────────────────────────────────────────────────────┐
  │  packages/opal-backend-dev (Python, FastAPI)            │
  │    - New APIs → wire to opal-backend-shared directly    │
  │    - Existing APIs → proxy to One Platform              │
  └─────────────────────────────────────────────────────────┘

  INTEGRATION TESTING:
  ┌─────────────────────────────────────────────────────────┐
  │  packages/opal-backend-fake (Python, FastAPI)           │
  │    - Canned scenarios, in-memory state                  │
  │    - No real API calls                                  │
  └─────────────────────────────────────────────────────────┘

Packages

| Package | Language | Purpose |
| --- | --- | --- |
| opal-backend-shared | Python | Shared agent logic (synced to prod) |
| opal-backend-dev | Python | Dev server (proxy + direct wiring) |
| opal-backend-fake | Python | Fake server (canned scenarios) |
| unified-server | TypeScript | Static content + blobs (unchanged) |
| visual-editor | TypeScript | Client (SSE consumer, unchanged) |

Wire Format

21 AgentEvent types defined in agent-event.ts and mirrored as typed Python models in opal-backend-shared (plain dataclasses since Phase 5.3).

| Event | Direction | Purpose |
| --- | --- | --- |
| start | → client | Loop began |
| thought | → client | Model reasoning |
| functionCall | → client | Tool invocation started |
| functionCallUpdate | → client | Tool status update |
| functionResult | → client | Tool result |
| subagentAddJson | → client | Nested progress |
| subagentError | → client | Nested error |
| subagentFinish | → client | Nested progress complete |
| content | → client | Model output |
| turnComplete | → client | Full turn finished |
| sendRequest | → client | Gemini request sent |
| waitForInput | ⇄ suspend | Needs user text |
| waitForChoice | ⇄ suspend | Needs user choice |
| readGraph | ⇄ suspend | Read graph data |
| inspectNode | ⇄ suspend | Inspect a node |
| applyEdits | ⇄ suspend | Confirmed edits |
| queryConsent | ⇄ suspend | User consent |
| graphEdit | → client | Fire-and-forget edits |
| complete | → client | Loop finished |
| error | → client | Loop error |
| finish | → client | Cleanup signal |
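
On the wire, each of these events travels as one SSE `data:` frame carrying a JSON object. A minimal sketch of the framing — the `type` field matches the table above, but the exact payload fields shown here are illustrative, not the real schema:

```python
import json

def to_sse_frame(event: dict) -> str:
    """Serialize one agent event as a Server-Sent Events data frame."""
    # One JSON object per frame; the blank line terminates the frame.
    return f"data: {json.dumps(event)}\n\n"

# e.g. a functionCall event announcing a tool invocation
frame = to_sse_frame({"type": "functionCall", "name": "generate_text"})
```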

Dev Backend Pipeline

How to run end-to-end: npm run dev:backend -w packages/unified-server

This starts both the unified-server (port 3000, serves the frontend) and the Python dev backend (port 8080, runs the agent loop). Open localhost:3000, create or open an Opal, and run it — the full wire protocol will be exercised.

Activation: The frontend enters remote mode when CLIENT_DEPLOYMENT_CONFIG.DEV_BACKEND_MODE is set. This calls agentService.configureRemote(OPAL_BACKEND_API_PREFIX, fetchWithCreds) in packages/visual-editor/src/sca/services/services.ts.

Data flow:

Frontend (browser)                          Dev backend (Python)
─────────────────────                       ────────────────────
1. User runs Opal
2. resolveToSegments(objective, params)
   → segments[] + flags
3. SSEAgentRun POSTs to /api/agent/run
   body: {kind, segments, flags}
                                    ──→  4. to_pidgin(segments, file_system)
                                            → pidgin text + capabilities
                                         5. Wrap: <objective>text</objective>
                                         6. Loop.run(objective) → Gemini
                                    ←──  7. SSE events stream back
8. AgentEventConsumer dispatches
   events to SCA controllers

Key files:

  • packages/visual-editor/src/a2/agent/resolve-to-segments.ts — template → segments
  • packages/visual-editor/src/a2/agent/sse-agent-run.ts — POST body construction
  • packages/opal-backend-dev/opal_backend_dev/main.py — receives and processes body
  • packages/opal-backend-shared/opal_backend_shared/pidgin.py — to_pidgin + from_pidgin_string
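
The step-3 POST body can be sketched from the diagram's `{kind, segments, flags}` shape. This is an assumption-laden illustration — the segment and flag fields shown are examples, not the full wire schema:

```python
import json

def build_run_request(segments: list[dict], flags: dict) -> str:
    """Build the start-run POST body for /api/agent/run (step 3 above)."""
    return json.dumps({"kind": "content", "segments": segments, "flags": flags})

body = build_run_request(
    [{"type": "text", "text": "Make a joke"}],
    {"useNotebookLM": False},
)
```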

Phases

How Objectives Work

Objectives (🎯) are the real milestones — concrete, executable tests that prove the system works. They go at the top of each phase. Everything below them is in service of reaching them.

Plan backward from the objective. Write the objective first as a specific action with an observable result ("run this command, see this output"). Then work backward: what items are needed to make that action succeed? If the items don't add up to a reachable objective, the plan is wrong — restructure it.

A checked-off list is not an objective. Individual items can be correct (code compiles, tests pass) without adding up to the objective. Before marking a phase complete, trace the full path from the user's action to the expected result. If any link is missing, the objective is not reached — add the missing work to the plan rather than redefining the objective.

Restructuring is progress. Discovering that the plan doesn't reach the objective is valuable information. Add new phases, split existing ones, move items — whatever makes the path to the objective honest.

Phase 1: Client-Side Event Coverage ✅

  • Event types, sink, consumer, bridge
  • buildHooksFromSink
  • AgentService + AgentRunHandle
  • Strangler-fig GraphEditingAgentService → Actions

Phase 2: Content Generation Agent ✅

  • Content generation agent uses AgentService
  • ConsoleProgressManager + RunStateManager as consumer handlers
  • Subagent reporter events + proxy ProgressReporter
  • FunctionCallEvent carries args for custom work item titles

Phase 3: Suspend/Resume via Events ✅

  • waitForInput / waitForChoice suspend events
  • Consumer handlers: requestInput(), ChoicePresenter
  • Graph-editing agent: sink.suspend() in chat-functions

Phase 3.5: Generalize Client Calls ✅

  • SuspendEvent union, widened suspend() signature
  • readGraph, inspectNode, applyEdits, queryConsent suspend events
  • Consumer handlers for each new suspend event

Phase 3.75: Client-Side SSE Transport ✅

  • SSEAgentEventSource — fetch + iteratorFromStream
  • SSEAgentRun / LocalAgentRun — split run implementations
  • AgentService.configureRemote(baseUrl, fetchFn)

Phase 4: Python Backend Packages

4.1: Scaffolding ✅

  • packages/opal-backend-shared/ — protocol primitives (events, sink, pending requests)
  • packages/opal-backend-fake/ — canned scenarios + FastAPI endpoints (absorbed mock-agent-server)
  • packages/opal-backend-dev/ — stub with proxy for existing APIs
  • Remove packages/mock-agent-server/
  • Migrate and verify all existing tests (13/13 passing)

4.2: Local Dev Workflow ✅

  • Root npm run setup:python (creates venvs for all Python packages)
  • PIP_INDEX_URL baked into all setup scripts
  • dev:fake starts fake Python backend alongside static server (with venv check)
  • BACKEND_API_ENDPOINT=http://localhost:8000 set in serve:fake env
  • Developer docs in opal-backend-dev/README.md

4.3: Proxy-First Backend ✅

  • opal_backend_shared/local/ — local-only shared API surface
  • api_surface.py — router factory with AgentBackend + ProxyBackend protocols
  • opal-backend-dev reverse proxy via httpx (forwards auth headers)
  • opal-backend-fake refactored to shared API surface (13 tests passing)
  • dev:backend wireit entry (serves at :3000, proxy at :8080)
  • start-dev-backend.sh with venv check

4.4: Port Agent Loop to Python

4.4a: Loop Core ✅
  • gemini_client.py — streaming Gemini API via httpx
  • function_definition.py — FunctionDefinition, FunctionGroup types
  • function_caller.py — async function dispatch + result collection
  • loop.py — while-loop orchestrator with LoopHooks
  • Unit tests with mocked Gemini responses (14 tests)
4.4b: Termination Functions ✅
  • Port system_objective_fulfilled (terminates loop with success)
  • Port system_failed_to_fulfill_objective (terminates loop with failure)
  • System instruction (meta-plan prompt — verbatim port)
  • Unit tests (13 tests)
4.4c: DevAgentBackend + End-to-End ✅
  • AgentEventSink + build_hooks_from_sink in opal-backend-shared
  • DevAgentBackend in opal-backend-dev (implements AgentBackend)
  • Agent endpoint wiring (always active, access token from request headers)
  • Unit tests (15 tests for event sink + hooks)
4.4d: Resumable Stream Protocol ✅

🎯 Objective: Backend pipeline works end-to-end without the frontend.

curl -X POST http://localhost:8080/api/agent/run \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -d '{"kind":"content","objective":{"parts":[{"text":"Make a joke"}],"role":"user"}}' \
  --no-buffer

→ SSE stream of events → ends with system_objective_fulfilled.

Architecture pivot: Single POST /api/agent/run → SSE stream replaces the multi-endpoint pattern.

POST /api/agent/run  →  SSE stream (start or resume)
Body (start):  {kind, objective}
Body (resume): {interactionId, response}
  • Redesign api_surface.py — single POST /run → EventSourceResponse
  • Update DevAgentBackend — POST handler starts loop, streams inline
  • Update frontend SSEAgentEventSource — POST with body instead of GET
  • Update frontend AgentService — pass config into SSEAgentRun
  • Remove SSEAgentRun.resolveInput() side-channel
  • Auth: access token from Authorization header → Loop
  • Wire configureRemote() in app init via BACKEND_API_ENDPOINT
  • Fix proxy Content-Encoding header stripping
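
The start/resume distinction on the single endpoint reduces to a body-shape dispatch: a body with interactionId resumes a suspended run, anything else starts one. A sketch under that assumption (in the real server this feeds a FastAPI handler returning an EventSourceResponse; the return strings here are purely illustrative):

```python
def dispatch(body: dict) -> str:
    """Route a POST /api/agent/run body: resume if interactionId is present."""
    if "interactionId" in body:
        return f"resume:{body['interactionId']}"
    return f"start:{body['kind']}"
```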
4.4e: Wire Content Runs Through AgentService ✅

🎯 Objective: Run an opal through the dev backend and see the result.

npm run dev:backend -w packages/unified-server

Open the app, run an opal with a simple text task like "make a joke". The agent calls Gemini via the Python backend, streams events back, and the result appears in the UI.

  • DEV_BACKEND_MODE deploy-time flag (like FAKE_MODE)
  • configureRemote() gated on flag in services.ts
  • Early instanceof SSEAgentRun branch in invokeRemoteAgent() in main.ts
  • Lightweight ConsoleProgressManager for remote UI reporting
  • complete event carries AgentResult.outcomes (don't break on finish)
  • Tests: SSE event sequence, outcome extraction, error handling (11 tests)
4.4f: Agent File System + System Functions
  • Port AgentFileSystem (in-memory virtual FS)
  • Port TaskTreeManager (task tree schema + status tracking)
  • Port PidginTranslator.fromPidginString (resolve <file> tags → data parts from FS — needed by system_write_file and onSuccess callback)
  • Port remaining system functions (list/read/write files, task tree)
  • Add intermediate / FileData to AgentResult
  • Wire file system + task tree into loop setup

4.5: Wire Protocol + Objective Handling

Design: Structured segments, not raw LLMContent. The client sends semantic intent; the server owns the entire pidgin vocabulary.

toPidgin splits in two: the client resolves templates into typed segments (text, asset, input). The server walks segments, registers data parts in AgentFileSystem, and emits all pidgin tags. pidgin.py is the single source of truth for the pidgin language.

Capabilities (useMemory, useNotebookLM) are discovered by the client during template resolution — they emerge from encountering template chips, not from runtime flags. Custom tools run on the server: the client sends board URLs, the server loads and invokes them.

  • Define segment types: text (literal), asset (titled content group), input (agent-output content group), tool (routes, memory, NLM, custom)
  • Client-side pre-resolution: resolve-to-segments.ts extracts template resolution from toPidgin into a step that runs before startRun()
  • Wire metadata: flags.useNotebookLM sideband; useMemory, routes, and custom tools discovered from tool segments server-side
  • Server-side to_pidgin(segments): walk segments, register data parts in FS, emit <asset>, <input>, <file>, <content>, <objective> tags
  • Server-side onSuccess callback: from_pidgin_string (done ✅) + intermediate file collection (done ✅)
  • End-to-end: npm run dev:backend → frontend resolves segments → POST → to_pidgin → Loop → SSE stream back to browser
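
The server-side walk described above can be sketched as follows. The segment types and tag names come from the bullets; the file-system interface, tag attributes, and the `<objective>` wrapping are assumptions for illustration only:

```python
def to_pidgin(segments: list[dict], register_file) -> str:
    """Walk typed segments, register data parts in the FS, emit pidgin tags."""
    out = []
    for seg in segments:
        if seg["type"] == "text":
            out.append(seg["text"])
        elif seg["type"] == "asset":
            # data part goes into the AgentFileSystem; the tag references it
            path = register_file(seg["data"])
            out.append(f'<asset title="{seg["title"]}"><file src="{path}"/></asset>')
        elif seg["type"] == "input":
            out.append(f'<input name="{seg["name"]}"/>')
    return f"<objective>{''.join(out)}</objective>"
```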

4.6: Data Transform Plumbing

The shared substrate: resolving storedData/fileData references to Gemini-consumable formats. All media generation and content functions depend on this.

Key insight: the D2F (Drive → Gemini File) and B2F (Blob → Gemini File) transforms already go through backend endpoints (/v1beta1/uploadGeminiFile, /v1beta1/uploadBlobFile). The dev backend proxies to the same One Platform server. The Python agent loop can call these endpoints directly.

  • conform_body on the server: conform_body.py walks LLMContent parts, resolves storedData/fileData to Gemini File API URLs via /v1beta1/uploadGeminiFile (HTTP calls to One Platform)
  • json parts → {text: json.dumps()} (inline transform)
  • NotebookLM storedData → {text: url} passthrough
  • _upload_gemini_file helper: authenticated POST to One Platform, BackendClient protocol injected via Loop.__init__
  • Tests: 21 tests covering all 6 transforms, error handling, mixed content
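
The per-part dispatch at the heart of conform_body, sketched with simplified part shapes. `upload_gemini_file` stands in for the BackendClient call; the field names mirror LLMContent but are assumptions, not the verified schema:

```python
import asyncio
import json

async def conform_part(part: dict, upload_gemini_file) -> dict:
    """Resolve one LLMContent part to a Gemini-consumable form."""
    if "storedData" in part:
        # storedData → Gemini File API URI via the backend upload endpoint
        uri = await upload_gemini_file(part["storedData"]["handle"])
        return {"fileData": {"fileUri": uri}}
    if "json" in part:
        return {"text": json.dumps(part["json"])}  # inline transform
    return part  # plain text parts pass through untouched
```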

4.7: Function Groups

With the wire protocol and data plumbing in place, function groups are thin handlers on top.

4.7a: Text Generation

🎯 Objective: Send an image into the agent and ask it to describe it. The image flows through segments → pidgin → from_pidgin_string → conform_body (resolves storedData to Gemini File API) → generate_text → text description streams back over SSE. Full multimodal pipeline end-to-end.

  • Port generate_text function (pidgin → conformBody → streamContent → merge text)
  • Grounding tools: Google Search, Google Maps, URL context
  • Wire get_generate_function_group into dev backend main.py
  • Tests: 20 tests — handler, grounding, model resolution, error handling
4.7b: Image Generation

🎯 Objective: Ask the agent to "generate an image of a cat" through the dev backend. The agent calls generate_imageexecuteStep with ai_image_tool → One Platform returns inline image data → saved to agent FS.

  • Port executeStep client (POST to /v1beta1/executeStep, collect output chunks)
  • Port generate_image function (prompt + optional input images + aspect ratio → executeStep → save to FS)
  • Wire into dev backend main.py
  • Tests
4.7c: Video Generation

🎯 Objective: Ask the agent to "make a short video of waves crashing." The agent calls generate_videoexecuteStep with Veo model → storedData part saved to agent FS.

  • Port generate_video function (prompt + optional reference images → executeStep with generate_video API + Veo model selection)
  • Port expandVeoError safety-code mapping
  • Wire into dev backend main.py
  • Tests
4.7d: Speech Generation

🎯 Objective: Ask the agent to "read this paragraph aloud." The agent calls generate_speech_from_textexecuteStep → audio storedData saved to agent FS.

  • Port generate_speech_from_text function (text + voice selection → executeStep → save audio to FS)
  • Wire into dev backend main.py
  • Tests
4.7e: Music Generation

🎯 Objective: Ask the agent to "compose upbeat background music." The agent calls generate_music_from_textexecuteStep → audio storedData saved to agent FS.

  • Port generate_music_from_text function (prompt → executeStep → save audio to FS)
  • Wire into dev backend main.py
  • Tests
4.7f: Code Generation & Execution

🎯 Objective: Ask the agent to "calculate the first 20 Fibonacci numbers." The agent calls generate_and_execute_codeconformBodystreamContent with codeExecution tool → streams text + inline file results back.

  • Port generate_and_execute_code function (prompt → conformBodystreamContent with code execution tool → merge text + file results)
  • Wire into dev backend main.py
  • Tests

4.8: Suspend/Resume for Interactive Agents

🎯 Objective: Open graph editor, use AI chat to edit a graph through the dev backend. Each interaction round-trips as: POST → stream → suspend → POST → stream → complete.

Design decision: reconnect, not keepalive. The SSE stream closes when the loop suspends. The client POSTs again with {interactionId, response} to resume. This is the only viable approach — suspends can last seconds, hours, or days. A keepalive stream cannot stay open that long, and the production backend is stateless. The dev backend must match this model.

4.8a: Suspend/Resume Protocol ✅
  • SuspendError + SuspendResult in opal-backend-shared
  • InteractionStore — in-memory state store keyed by interactionId
  • Loop catches SuspendError, returns SuspendResult
  • Dev backend: _start() / _resume() / _stream_loop() with state save/load
  • Client SSEAgentEventSource: reconnect loop — suspend → await handler → POST resume → new stream
  • Tests: 11 Python (suspend/resume round-trip) + TS reconnect test
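
The reconnect protocol reduces to a park-and-resume state machine: suspend parks state under an interactionId and closes the stream; the next POST picks it up. A toy model — the function and field names here are illustrative, not the real InteractionStore API:

```python
import uuid

class SuspendError(Exception):
    """Raised by a function to suspend the loop with a suspend event."""
    def __init__(self, event):
        super().__init__(event)
        self.event = event

store: dict[str, dict] = {}  # stands in for InteractionStore

def start_run(run_state: dict, step) -> dict:
    try:
        return {"type": "complete", "result": step(run_state)}
    except SuspendError as err:
        interaction_id = str(uuid.uuid4())
        store[interaction_id] = run_state       # save state, close the stream
        return {"type": err.event, "interactionId": interaction_id}

def resume_run(interaction_id: str, response, step) -> dict:
    run_state = store.pop(interaction_id)       # load state on reconnect
    run_state["response"] = response            # inject the user's answer
    return {"type": "complete", "result": step(run_state)}
```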
4.8b: Chat Suspend Functions
  • Port chat_request_user_input + chat_present_choices to Python (raises SuspendError with waitForInput / waitForChoice)
  • waitForInput handler in invokeRemoteAgent — prompt display + input
  • Loop: suppress on_finish on suspend, suppress on_start on resume
  • End-to-end: Generate step (Agent mode, prompt: "ask user to provide their name") through dev:backend — suspend → user types → resume → complete
4.8c: Unified Suspend Handler Rendering ✅

🎯 Objective: Both waitForInput and waitForChoice work identically through local and remote paths. The rendering logic (prompt display, input collection, choice presentation) is shared — not duplicated between AgentUI and invokeRemoteAgent.

  • Extract #addChatOutput pattern into a shared utility (chat-output.ts) usable by both local (AgentUI) and remote (invokeRemoteAgent) paths
  • waitForInput remote handler uses the shared utility (deduplicate)
  • waitForChoice remote handler with ChoicePresenter + A2UIInteraction (proper choice buttons / checkboxes)
  • A2UIInteraction extracted as shared rendering core — AgentUI delegates to it, adding only pidgin translation on top
  • Server-side pidgin resolution for choice labels (from_pidgin_string in chat.py) — images and file refs render correctly in choices
  • End-to-end: choice-based interaction through dev:backend
  • Port fidelity audit — all function groups (image, video, audio, generate, chat, system) verified against TS source; descriptions restored; shared_schemas.py centralizes statusUpdateSchema, taskIdSchema, fileNameSchema

Phase 5: Python Consolidation & Copybara Prep

🎯 Objective: opal-backend-shared code is pure Python (no httpx, fastapi, or pydantic imports in synced files). The shared code can be copybara'd to third_party/py/opal_backend and the production backend can inject its own HTTP transport.

# Verify: no transport deps in synced code
grep -r "import httpx\|from fastapi\|from sse_starlette" \
  packages/opal-backend/opal_backend/*.py \
  packages/opal-backend/opal_backend/functions/
# → no results

5.1: Package Consolidation ✅

  • Merge opal-backend-shared, opal-backend-dev, opal-backend-fake into single packages/opal-backend/ with one pyproject.toml and one .venv
  • Directory structure: opal_backend/ (synced), opal_backend/local/ (not synced), opal_backend/dev/, opal_backend/fake/
  • Move fake-server artifacts (events.py, sse_sink.py, pending_requests.py) to local/ (not synced to google3)
  • Update all imports across source and test files
  • Update wireit scripts (setup:python, test:python, dev:backend, dev:fake)

5.2: HTTP Transport Abstraction ✅

  • http_client.py — HttpClient protocol (synced, no deps)
  • local/http_client_impl.py — httpx-based implementation (not synced)
  • Update gemini_client.py — replace import httpx with HttpClient
  • Update conform_body.py — replace import httpx with HttpClient
  • Update step_executor.py — replace import httpx with HttpClient
  • Thread HttpClient through Loop and function group factories
  • Update all tests to inject HttpClient
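
The seam above is a structural protocol: synced code depends only on the interface, while the httpx implementation stays in local/. A sketch — the method name `post_json` and its signature are assumptions, not the actual protocol surface:

```python
import asyncio
from typing import Protocol

class HttpClient(Protocol):
    """Transport seam: synced code sees only this protocol."""
    async def post_json(self, url: str, body: dict) -> dict: ...

class FakeHttpClient:
    """Test double — the httpx-backed twin would live in local/ (not synced)."""
    async def post_json(self, url: str, body: dict) -> dict:
        return {"url": url, "echo": body}

result = asyncio.run(FakeHttpClient().post_json("https://example.test", {"a": 1}))
```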

5.3: Typed Event Models ✅

  • opal_backend/events.py — dataclass models for all 22 AgentEvent types, AgentResult, FileData, request/response bodies (StartRunRequest, ResumeRunRequest), segment types — all with to_dict() producing camelCase JSON (no pydantic dependency)
  • agent_events.py — AgentEventSink queue and build_hooks_from_sink emit typed events instead of dict[str, Any]
  • suspend.py — SuspendError takes typed SuspendEvent + explicit function_call_part parameter (moved out of event dict)
  • loop.py — imports wire-format types from events.py
  • functions/chat.py — constructs WaitForInputEvent/WaitForChoiceEvent
  • dev/main.py — SSE serialization via event.to_dict() + json.dumps()
  • local/sse_sink.py — uses to_dict() instead of Pydantic model_dump_json()
  • Delete local/events.py (replaced by synced events.py)
  • Update all tests (test_agent_events.py, test_chat_functions.py, test_suspend_resume.py)
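
The dataclass-plus-to_dict() pattern described above keeps synced code Pydantic-free while still emitting camelCase JSON. A sketch with illustrative field names (the real event fields may differ):

```python
from dataclasses import dataclass

@dataclass
class FunctionCallUpdateEvent:
    """Typed event model: plain dataclass, no pydantic dependency."""
    call_id: str
    status: str

    def to_dict(self) -> dict:
        # snake_case fields map to camelCase JSON keys on the wire
        return {"type": "functionCallUpdate",
                "callId": self.call_id,
                "status": self.status}
```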

5.4: BackendClient Protocol ✅

  • backend_client.py — BackendClient protocol (execute_step, upload_gemini_file, upload_blob_file) + HttpBackendClient impl
  • step_executor.py — delegates to BackendClient, removed HTTP logic
  • conform_body.py — delegates to BackendClient.upload_gemini_file
  • loop.py — upstream_base: str → backend: BackendClient | None
  • All function groups (image, audio, video, generate) updated
  • dev/main.py — per-request HttpBackendClient with origin
  • Tests updated (step_executor, conform_body, suspend_resume)

5.5: High-Level Entry Point ✅

5.5a: Credential Internalization ✅

🎯 Objective: access_token and origin are no longer parameters of Loop, function group factories, or conform_body. Credentials are a transport concern — baked into HttpClient and BackendClient.

  • HttpClient protocol — access_token property
  • HttpxClient — accepts and exposes access_token
  • GeminiClient — reads client.access_token
  • BackendClient protocol — removed access_token from all methods; HttpBackendClient reads from self._client.access_token
  • conform_body — removed access_token param
  • step_executor — removed access_token from functions
  • All 4 function group factories — removed access_token param
  • Loop.__init__ — removed access_token and origin
  • dev/main.py — per-request HttpxClient with baked-in token; removed access_token/origin from InteractionState
  • Tests updated (all 322 pass)
5.5b: run() / resume() Entry Point ✅

🎯 Objective: opal_backend.run() is a single async iterator that takes an objective + injected deps and yields AgentEvents. Consumers provide only what varies by environment. Everything else is internal.

async for event in opal_backend.run(
    objective=objective,
    client=http_client,
    backend=backend_client,
    store=interaction_store,
):
    yield event
  • opal_backend/run.py — run() async generator: creates AgentFileSystem, TaskTreeManager, AgentEventSink, LoopController, builds function groups internally, runs loop, dispatches CompleteEvent/ErrorEvent, closes sink. Accepts extensible flags dict.
  • opal_backend/run.py — resume() async generator: loads state from InteractionStore, injects function response, rebuilds function groups, runs loop, dispatches events
  • dev/main.py — delegates to opal_backend.run() / resume(); deleted _build_function_groups, _stream_loop (~130 lines removed)
  • opal_backend/__init__.py — re-export run, resume
  • Tests: test_run.py — 6 tests (complete, error, suspend, failed, unknown ID, full suspend→resume round-trip). All 328 pass.
5.5c: Protocol Boundary Extraction ✅

🎯 Objective: Core modules depend only on protocols. All transport-specific implementations live in local/.

  • backend_client.py — protocol-only (removed HttpBackendClient, HttpClient import, logging)
  • local/backend_client_impl.py — HttpBackendClient moved here
  • interaction_store.py — converted InteractionStore class to Protocol
  • local/interaction_store_impl.py — InMemoryInteractionStore moved here
  • Updated imports: dev/main.py, test_step_executor.py, test_suspend_resume.py, test_run.py (328 pass)

5.6: Memory & Storage ✅

🎯 Objective: Agents running on the Python backend can use persistent Google Sheets-backed memory. Chat conversations are persisted and recalled across sessions. The graph identity is required and anchors all storage.

  • DriveOperationsClient protocol (9 methods: Drive CRUD + Sheets read/write/batch) + HttpDriveOperationsClient impl (#8089)
  • InteractionStore protocol made async (save, load, has, clear); InMemoryInteractionStore updated; flags/graph folded into stored interaction state (#8089)
  • AgentFileSystem made async (get, list_files, read_text, get_many, from_pidgin_string) — prep for database-backed storage (#8090)
  • SheetManager — ported from TS SheetManager + memorySheetGetter; resolves spreadsheet IDs via DriveOperationsClient.query_files, creates / reads / updates / deletes sheets. 21 tests. (#8091)
  • 5 memory functions (create_sheet, read_sheet, update_sheet, delete_sheet, get_metadata) ported to Python with JSON schemas matching TS exactly. 12 tests. Wired into run(). (#8091)
  • AgentFileSystem/mnt/memory/ path resolution via set_sheet_manager(). 4 tests. (#8091)
  • ChatLogManager — seeds from existing sheets, persists new entries, registers chat log as system file. Wired into run() and resume(). (#8092)
  • graph promoted to required sibling field (out of flags) across full stack; every run anchored to graph identity (#8092)
  • session_id in InteractionState for cross-suspend chat log continuity (#8092)
  • _process_chat_response — transforms {input: LLMContent}{user_input: text} on resume, matching TS handler logic (#8092)
  • .on("error") handler in invokeRemoteAgent — server errors now surface to users (#8092)
  • Tests: 435 Python + 31 TS agent tests pass
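
The /mnt/memory/ routing bullet above amounts to a mount-point check inside the file system. A toy sketch, assuming a `read`-style SheetManager method and a plain dict for regular files (both are illustrative, not the real interfaces):

```python
import asyncio

MEMORY_PREFIX = "/mnt/memory/"

class AgentFileSystem:
    """Toy mount routing: memory paths resolve through the SheetManager."""
    def __init__(self):
        self._files: dict[str, str] = {}
        self._sheets = None

    def set_sheet_manager(self, sheets) -> None:
        self._sheets = sheets

    async def read_text(self, path: str) -> str:
        if path.startswith(MEMORY_PREFIX) and self._sheets is not None:
            # resolved via Sheets instead of the in-memory map
            return await self._sheets.read(path[len(MEMORY_PREFIX):])
        return self._files[path]

class FakeSheetManager:
    async def read(self, name: str) -> str:
        return f"sheet:{name}"
```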

5.7: Port Fidelity & Error Handling ✅

Systematic audit-and-fix pass across all function groups to ensure the Python backend matches TS behavior exactly. Also adds fatal error classification so unrecoverable errors (quota-exhausted) terminate the loop immediately instead of being retried by the LLM.

  • error_classifier.py — to_error_or_response() classifies errors by parsing structured JSON (RESOURCE_EXHAUSTED) or fuzzy keyword matching; promotes quota-exhausted to $error outcomes. 16 tests. (#8071)
  • function_caller.py + loop.py — propagate and handle $error from function results (#8071)
  • All generation handlers (video, image, audio, generate) route errors through to_error_or_response (#8071)
  • generate.py port fidelity — mergeTextParts utility, statusUpdater TODOs, model resolution alignment. 20 tests. (#8070, #8081, #8087)
  • image.py — batch resolution via file_system.get_many(), output-processing drift fix (get_many + 5 tests) (#8082, #8088)
  • chat.py — computeFormat (inputType → icon name mapping) ported; from_pidgin_string resolution for choice labels (#8095)
  • system.py — 3 drift fixes: route resolution via get_original_route(href), intermediate file LLMContent format, href in failed_to_fulfill. 5 tests. (#8096)
  • BackendClient.stream_generate_content — consolidated all Gemini streaming into BackendClient, eliminated HttpClient protocol entirely (#8074, #8075)
  • ENABLE_GEMINI_BACKEND flag — routes Gemini calls through backend (#8076)
  • Tests: 441 Python tests pass
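
The classification order described above — structured JSON first, fuzzy keywords as fallback — can be sketched as below. The payload shapes and the `$error` outcome format are assumptions based on the bullets, not the verified implementation:

```python
import json

def to_error_or_response(raw: str) -> dict:
    """Classify an error string: fatal quota errors vs. retryable responses."""
    # 1. Structured path: parse the JSON error envelope if there is one
    try:
        status = json.loads(raw).get("error", {}).get("status", "")
        if status == "RESOURCE_EXHAUSTED":
            return {"$error": {"kind": "quota-exhausted", "message": raw}}
    except (ValueError, AttributeError):
        pass
    # 2. Fuzzy fallback: keyword match on unstructured text
    if "quota" in raw.lower():
        return {"$error": {"kind": "quota-exhausted", "message": raw}}
    # 3. Recoverable: hand the text back so the LLM can retry
    return {"response": raw}
```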

Phase 6: Production Readiness ← we are here

Full parity with the in-process agent. Everything that works locally works identically through the dev backend.

  • Status metadata plumbing — StatusUpdateOptions (expectedDurationInSec, isThought) flows through StatusUpdateCallback → FunctionCallUpdateEvent.opts → SSE wire format. All 4 TODOs in generate.py resolved.
  • url_context consent flow — FunctionDefinition gains an optional precondition handler, run by FunctionCaller before the main handler. The consent precondition raises SuspendError(QueryConsentEvent) with is_precondition_check=True. On resume, _resume_precondition records the grant and re-dispatches the function call — the model never sees the consent round-trip.
  • Cancel concurrent function caller tasks on suspend — loop.py now cancels sibling asyncio.Tasks before saving state, eliminating the "emit() called on closed sink" warning.
  • content event — investigation result: only consumed in local mode (RunStateManager.pushContent), ignored by remote client. Removed server-side emission from build_hooks_from_sink and loop. The event can be fully removed if LocalAgentRun is removed.
  • Segment data-parts transfer — moved to_pidgin() inside run() so segments are converted using the loop's own AgentFileSystem. Data parts from asset/input segments now survive into the loop. run() accepts segments (not objective) as its primary input; dev/main.py passes segments straight through.
  • Chat resume multimodal parts — _process_chat_response uses content_to_pidgin_string to register binary parts (images, file uploads) in AgentFileSystem and produce pidgin text with <file> tags, matching the TS local path behavior.
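
The precondition flow described above can be sketched as a toy: the caller runs an optional precondition gate before the handler, a consent suspend raised there is marked as a precondition check, and resume records the grant and re-dispatches so the model never sees the round-trip. All names and shapes here are illustrative:

```python
class SuspendError(Exception):
    def __init__(self, event, is_precondition_check=False):
        super().__init__(event)
        self.event = event
        self.is_precondition_check = is_precondition_check

def call(fn: dict, args: dict, granted: set):
    """Run the precondition gate, then the handler."""
    if fn.get("precondition") and fn["name"] not in granted:
        raise SuspendError({"type": "queryConsent", "fn": fn["name"]},
                           is_precondition_check=True)
    return fn["handler"](args)

def resume_precondition(err: SuspendError, fn: dict, args: dict, granted: set):
    granted.add(err.event["fn"])   # record the user's grant
    return call(fn, args, granted) # re-dispatch the original function call
```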