Moving agent execution from the browser to the One Platform backend, connected over Server-Sent Events.
```
┌─────────────────────────────────────────────────────────────┐
│ CLIENT (visual-editor, browser)                             │
│                                                             │
│ SCA Actions ──→ AgentService.startRun()                     │
│                        │                                    │
│               ┌────────┴────────┐                           │
│               │ AgentRunHandle  │                           │
│               │  .events        │ (AgentEventConsumer)      │
│               │  .abort()       │                           │
│               └────────┬────────┘                           │
│                        │                                    │
│              SSEAgentEventSource                            │
│                        │ fetch + iteratorFromStream         │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─┼─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│                        ▼                                    │
│              OPAL_BACKEND_API_PREFIX                        │
│              (appcatalyst.pa.googleapis.com)                │
└────────────────────────┬────────────────────────────────────┘
                         │
        ┌────────────────┴───────────────────────────┐
        │ ONE PLATFORM (production backend)          │
        │                                            │
        │ Wraps opal-backend-shared with             │
        │ One Platform API surface                   │
        │                                            │
        │ ← synced from packages/opal-backend-shared │
        └────────────────────────────────────────────┘
```
LOCAL DEV:

```
┌────────────────────────────────────────────────────────┐
│ packages/opal-backend-dev (Python, FastAPI)            │
│  - New APIs → wire to opal-backend-shared directly     │
│  - Existing APIs → proxy to One Platform               │
└────────────────────────────────────────────────────────┘
```

INTEGRATION TESTING:

```
┌────────────────────────────────────────────────────────┐
│ packages/opal-backend-fake (Python, FastAPI)           │
│  - Canned scenarios, in-memory state                   │
│  - No real API calls                                   │
└────────────────────────────────────────────────────────┘
```
| Package | Language | Purpose |
|---|---|---|
| `opal-backend-shared` | Python | Shared agent logic (synced to prod) |
| `opal-backend-dev` | Python | Dev server (proxy + direct wiring) |
| `opal-backend-fake` | Python | Fake server (canned scenarios) |
| `unified-server` | TypeScript | Static content + blobs (unchanged) |
| `visual-editor` | TypeScript | Client (SSE consumer, unchanged) |
21 `AgentEvent` types are defined in `agent-event.ts` and mirrored as Pydantic models in `opal-backend-shared`.
| Event | Direction | Purpose |
|---|---|---|
| `start` | → client | Loop began |
| `thought` | → client | Model reasoning |
| `functionCall` | → client | Tool invocation started |
| `functionCallUpdate` | → client | Tool status update |
| `functionResult` | → client | Tool result |
| `subagentAddJson` | → client | Nested progress |
| `subagentError` | → client | Nested error |
| `subagentFinish` | → client | Nested progress complete |
| `content` | → client | Model output |
| `turnComplete` | → client | Full turn finished |
| `sendRequest` | → client | Gemini request sent |
| `waitForInput` | ⇄ suspend | Needs user text |
| `waitForChoice` | ⇄ suspend | Needs user choice |
| `readGraph` | ⇄ suspend | Read graph data |
| `inspectNode` | ⇄ suspend | Inspect a node |
| `applyEdits` | ⇄ suspend | Confirmed edits |
| `queryConsent` | ⇄ suspend | User consent |
| `graphEdit` | → client | Fire-and-forget edits |
| `complete` | → client | Loop finished |
| `error` | → client | Loop error |
| `finish` | → client | Cleanup signal |
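Events cross the wire as camelCase JSON. As a minimal sketch of how one such event might be mirrored as a plain Python dataclass — field names here are illustrative, not the actual definitions from `agent-event.ts` or its Python mirror:

```python
from dataclasses import asdict, dataclass


def _camel(name: str) -> str:
    """snake_case → camelCase, for the JSON wire format."""
    head, *rest = name.split("_")
    return head + "".join(word.capitalize() for word in rest)


@dataclass
class FunctionCallUpdateEvent:
    # Hypothetical fields for illustration only.
    function_call_id: str
    status_text: str
    type: str = "functionCallUpdate"

    def to_dict(self) -> dict:
        # Serialize with camelCase keys so the TS client can consume it.
        return {_camel(k): v for k, v in asdict(self).items()}
```

For example, `FunctionCallUpdateEvent("fc-1", "running").to_dict()` produces `{"functionCallId": "fc-1", "statusText": "running", "type": "functionCallUpdate"}`.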
How to run end-to-end:
```
npm run dev:backend -w packages/unified-server
```

This starts both the unified-server (port 3000, serves the frontend) and the Python dev backend (port 8080, runs the agent loop). Open `localhost:3000`, create or open an Opal, and run it — the full wire protocol will be exercised.
Activation: The frontend enters remote mode when `CLIENT_DEPLOYMENT_CONFIG.DEV_BACKEND_MODE` is set. This calls `agentService.configureRemote(OPAL_BACKEND_API_PREFIX, fetchWithCreds)` in `packages/visual-editor/src/sca/services/services.ts`.
Data flow:

```
Frontend (browser)                        Dev backend (Python)
─────────────────────                     ────────────────────
1. User runs Opal
2. resolveToSegments(objective, params)
   → segments[] + flags
3. SSEAgentRun POSTs to /api/agent/run
   body: {kind, segments, flags}
                                   ──→    4. to_pidgin(segments, file_system)
                                             → pidgin text + capabilities
                                          5. Wrap: <objective>text</objective>
                                          6. Loop.run(objective) → Gemini
                                   ←──    7. SSE events stream back
8. AgentEventConsumer dispatches
   events to SCA controllers
```
Key files:

- `packages/visual-editor/src/a2/agent/resolve-to-segments.ts` — template → segments
- `packages/visual-editor/src/a2/agent/sse-agent-run.ts` — POST body construction
- `packages/opal-backend-dev/opal_backend_dev/main.py` — receives and processes body
- `packages/opal-backend-shared/opal_backend_shared/pidgin.py` — `to_pidgin` + `from_pidgin_string`
Objectives (🎯) are the real milestones — concrete, executable tests that prove the system works. They go at the top of each phase. Everything below them is in service of reaching them.
Plan backward from the objective. Write the objective first as a specific action with an observable result ("run this command, see this output"). Then work backward: what items are needed to make that action succeed? If the items don't add up to a reachable objective, the plan is wrong — restructure it.
A checked-off list is not an objective. Individual items can be correct (code compiles, tests pass) without adding up to the objective. Before marking a phase complete, trace the full path from the user's action to the expected result. If any link is missing, the objective is not reached — add the missing work to the plan rather than redefining the objective.
Restructuring is progress. Discovering that the plan doesn't reach the objective is valuable information. Add new phases, split existing ones, move items — whatever makes the path to the objective honest.
- Event types, sink, consumer, bridge
- `buildHooksFromSink`
- `AgentService` + `AgentRunHandle`
- Strangler-fig `GraphEditingAgentService` → Actions
- Content generation agent uses `AgentService`
- `ConsoleProgressManager` + `RunStateManager` as consumer handlers
- Subagent reporter events + proxy `ProgressReporter`
- `FunctionCallEvent` carries `args` for custom work item titles
- `waitForInput` / `waitForChoice` suspend events
- Consumer handlers: `requestInput()`, `ChoicePresenter`
- Graph-editing agent: `sink.suspend()` in chat-functions
- `SuspendEvent` union, widened `suspend()` signature
- `readGraph`, `inspectNode`, `applyEdits`, `queryConsent` suspend events
- Consumer handlers for each new suspend event
- `SSEAgentEventSource` — `fetch` + `iteratorFromStream`
- `SSEAgentRun` / `LocalAgentRun` — split run implementations
- `AgentService.configureRemote(baseUrl, fetchFn)`
- `packages/opal-backend-shared/` — protocol primitives (events, sink, pending requests)
- `packages/opal-backend-fake/` — canned scenarios + FastAPI endpoints (absorbed mock-agent-server)
- `packages/opal-backend-dev/` — stub with proxy for existing APIs
- Remove `packages/mock-agent-server/`
- Migrate and verify all existing tests (13/13 passing)
- Root `npm run setup:python` (creates venvs for all Python packages)
- `PIP_INDEX_URL` baked into all setup scripts
- `dev:fake` starts fake Python backend alongside static server (with venv check)
- `BACKEND_API_ENDPOINT=http://localhost:8000` set in `serve:fake` env
- Developer docs in `opal-backend-dev/README.md`
- `opal_backend_shared/local/` — local-only shared API surface
- `api_surface.py` — router factory with `AgentBackend` + `ProxyBackend` protocols
- `opal-backend-dev` reverse proxy via `httpx` (forwards auth headers)
- `opal-backend-fake` refactored to shared API surface (13 tests passing)
- `dev:backend` wireit entry (serves at `:3000`, proxy at `:8080`)
- `start-dev-backend.sh` with venv check
- `gemini_client.py` — streaming Gemini API via `httpx`
- `function_definition.py` — `FunctionDefinition`, `FunctionGroup` types
- `function_caller.py` — async function dispatch + result collection
- `loop.py` — while-loop orchestrator with `LoopHooks`
- Unit tests with mocked Gemini responses (14 tests)
- Port `system_objective_fulfilled` (terminates loop with success)
- Port `system_failed_to_fulfill_objective` (terminates loop with failure)
- System instruction (meta-plan prompt — verbatim port)
- Unit tests (13 tests)
- `AgentEventSink` + `build_hooks_from_sink` in `opal-backend-shared`
- `DevAgentBackend` in `opal-backend-dev` (implements `AgentBackend`)
- Agent endpoint wiring (always active, access token from request headers)
- Unit tests (15 tests for event sink + hooks)
🎯 Objective: Backend pipeline works end-to-end without the frontend.
```
curl -X POST http://localhost:8080/api/agent/run \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -d '{"kind":"content","objective":{"parts":[{"text":"Make a joke"}],"role":"user"}}' \
  --no-buffer
```

→ SSE stream of events → ends with `system_objective_fulfilled`.
Architecture pivot: A single `POST /api/agent/run` → SSE stream replaces the multi-endpoint pattern.

```
POST /api/agent/run → SSE stream (start or resume)
  Body (start):  {kind, objective}
  Body (resume): {interactionId, response}
```
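As a sketch of the dispatch this single endpoint implies — the real handler lives in `api_surface.py`; the field access below is an assumption based on the body shapes above:

```python
def classify_run_request(body: dict) -> str:
    """Decide whether a POST /api/agent/run body starts a new loop or
    resumes a suspended one: a resume body carries an interactionId,
    a start body carries the kind and objective."""
    if "interactionId" in body:
        return "resume"
    if "kind" in body and "objective" in body:
        return "start"
    raise ValueError("unrecognized run request body")
```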
- Redesign `api_surface.py` — single `POST /run` → `EventSourceResponse`
- Update `DevAgentBackend` — POST handler starts loop, streams inline
- Update frontend `SSEAgentEventSource` — POST with body instead of GET
- Update frontend `AgentService` — pass config into SSEAgentRun
- Remove `SSEAgentRun.resolveInput()` side-channel
- Auth: access token from `Authorization` header → `Loop`
- Wire `configureRemote()` in app init via `BACKEND_API_ENDPOINT`
- Fix proxy `Content-Encoding` header stripping
🎯 Objective: Run an opal through the dev backend and see the result.
```
npm run dev:backend -w packages/unified-server
```

Open the app, run an opal with a simple text task like "make a joke". The agent calls Gemini via the Python backend, streams events back, and the result appears in the UI.
- `DEV_BACKEND_MODE` deploy-time flag (like `FAKE_MODE`)
- `configureRemote()` gated on flag in `services.ts`
- Early `instanceof SSEAgentRun` branch in `main.ts` → `invokeRemoteAgent()`
- Lightweight `ConsoleProgressManager` for remote UI reporting
- `complete` event carries `AgentResult.outcomes` (don't break on `finish`)
- Tests: SSE event sequence, outcome extraction, error handling (11 tests)
- Port `AgentFileSystem` (in-memory virtual FS)
- Port `TaskTreeManager` (task tree schema + status tracking)
- Port `PidginTranslator.fromPidginString` (resolve `<file>` tags → data parts from FS — needed by `system_write_file` and the `onSuccess` callback)
- Port remaining system functions (list/read/write files, task tree)
- Add `intermediate` / `FileData` to `AgentResult`
- Wire file system + task tree into loop setup
Design: Structured segments, not raw `LLMContent`. The client sends semantic intent; the server owns the entire pidgin vocabulary. `toPidgin` splits in two: the client resolves templates into typed segments (`text`, `asset`, `input`). The server walks segments, registers data parts in `AgentFileSystem`, and emits all pidgin tags. `pidgin.py` is the single source of truth for the pidgin language.

Capabilities (`useMemory`, `useNotebookLM`) are discovered by the client during template resolution — they emerge from encountering template chips, not from runtime flags. Custom tools run on the server: the client sends board URLs, the server loads and invokes them.
- Define segment types: `text` (literal), `asset` (titled content group), `input` (agent-output content group), `tool` (routes, memory, NLM, custom)
- Client-side pre-resolution: `resolve-to-segments.ts` extracts template resolution from `toPidgin` into a step that runs before `startRun()`
- Wire metadata: `flags.useNotebookLM` sideband; `useMemory`, routes, and custom tools discovered from `tool` segments server-side
- Server-side `to_pidgin(segments)`: walk segments, register data parts in FS, emit `<asset>`, `<input>`, `<file>`, `<content>`, `<objective>` tags
- Server-side `onSuccess` callback: `from_pidgin_string` (done ✅) + intermediate file collection (done ✅)
- End-to-end: `npm run dev:backend` → frontend resolves segments → POST → `to_pidgin` → Loop → SSE stream back to browser
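The server-side walk can be sketched roughly as follows. The segment dicts and emitted tag shapes here are simplified assumptions — the actual vocabulary is owned by `pidgin.py`, and the real walker also registers data parts in `AgentFileSystem`:

```python
def to_pidgin(segments: list[dict]) -> str:
    """Walk typed segments and emit a pidgin objective string.
    Simplified sketch; tag attributes are illustrative."""
    parts: list[str] = []
    for seg in segments:
        kind = seg["type"]
        if kind == "text":
            # Literal text passes straight through.
            parts.append(seg["text"])
        elif kind == "asset":
            # Titled content group → <asset> wrapping a file reference.
            parts.append(
                f'<asset title="{seg["title"]}"><file>{seg["path"]}</file></asset>'
            )
        elif kind == "input":
            # Agent-output content group → <input> wrapping a file reference.
            parts.append(f'<input><file>{seg["path"]}</file></input>')
    return "<objective>" + "".join(parts) + "</objective>"
```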
The shared substrate: resolving `storedData` / `fileData` references to Gemini-consumable formats. All media generation and content functions depend on this.

Key insight: the D2F (Drive → Gemini File) and B2F (Blob → Gemini File) transforms already go through backend endpoints (`/v1beta1/uploadGeminiFile`, `/v1beta1/uploadBlobFile`). The dev backend proxies to the same One Platform server. The Python agent loop can call these endpoints directly.
- `conform_body` on the server: `conform_body.py` walks `LLMContent` parts, resolves `storedData` / `fileData` to Gemini File API URLs via `/v1beta1/uploadGeminiFile` (HTTP calls to One Platform)
- `json` parts → `{text: json.dumps()}` (inline transform)
- NotebookLM `storedData` → `{text: url}` passthrough
- `_upload_gemini_file` helper: authenticated POST to One Platform, `BackendClient` protocol injected via `Loop.__init__`
- Tests: 21 tests covering all 6 transforms, error handling, mixed content
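The two inline transforms above are simple enough to sketch; the `storedData` / `fileData` upload path is omitted because it needs an authenticated backend call. Part shapes and the NotebookLM URL prefix are assumptions for illustration, not the real `conform_body.py`:

```python
import json


def conform_part(part: dict) -> dict:
    """Apply the inline transforms to one LLMContent part (sketch)."""
    if "json" in part:
        # json parts → {text: json.dumps()}
        return {"text": json.dumps(part["json"])}
    stored = part.get("storedData", {})
    handle = stored.get("handle", "")
    if handle.startswith("https://notebooklm"):
        # NotebookLM storedData → {text: url} passthrough
        return {"text": handle}
    # Other storedData/fileData parts need the upload endpoints.
    return part
```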
With the wire protocol and data plumbing in place, function groups are thin handlers on top.
🎯 Objective: Send an image into the agent and ask it to describe it. The image flows through segments → pidgin → `from_pidgin_string` → `conform_body` (resolves `storedData` to Gemini File API) → `generate_text` → text description streams back over SSE. Full multimodal pipeline end-to-end.
- Port `generate_text` function (pidgin → conformBody → streamContent → merge text)
- Grounding tools: Google Search, Google Maps, URL context
- Wire `get_generate_function_group` into dev backend `main.py`
- Tests: 20 tests — handler, grounding, model resolution, error handling
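The "merge text" step can be sketched as a small utility that collapses adjacent text parts — a hypothetical stand-in for the ported `mergeTextParts`:

```python
def merge_text_parts(parts: list[dict]) -> list[dict]:
    """Collapse runs of adjacent {"text": ...} parts into one part,
    leaving non-text parts (inline data, files) untouched."""
    merged: list[dict] = []
    for part in parts:
        if "text" in part and merged and "text" in merged[-1]:
            # Extend the previous text part instead of appending a new one.
            merged[-1] = {"text": merged[-1]["text"] + part["text"]}
        else:
            merged.append(part)
    return merged
```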
🎯 Objective: Ask the agent to "generate an image of a cat" through the dev backend. The agent calls `generate_image` → `executeStep` with `ai_image_tool` → One Platform returns inline image data → saved to agent FS.
- Port `executeStep` client (POST to `/v1beta1/executeStep`, collect output chunks)
- Port `generate_image` function (prompt + optional input images + aspect ratio → `executeStep` → save to FS)
- Wire into dev backend `main.py`
- Tests
🎯 Objective: Ask the agent to "make a short video of waves crashing." The agent calls `generate_video` → `executeStep` with Veo model → `storedData` part saved to agent FS.
- Port `generate_video` function (prompt + optional reference images → `executeStep` with `generate_video` API + Veo model selection)
- Port `expandVeoError` safety-code mapping
- Wire into dev backend `main.py`
- Tests
🎯 Objective: Ask the agent to "read this paragraph aloud." The agent calls `generate_speech_from_text` → `executeStep` → audio `storedData` saved to agent FS.
- Port `generate_speech_from_text` function (text + voice selection → `executeStep` → save audio to FS)
- Wire into dev backend `main.py`
- Tests
🎯 Objective: Ask the agent to "compose upbeat background music." The agent calls `generate_music_from_text` → `executeStep` → audio `storedData` saved to agent FS.
- Port `generate_music_from_text` function (prompt → `executeStep` → save audio to FS)
- Wire into dev backend `main.py`
- Tests
🎯 Objective: Ask the agent to "calculate the first 20 Fibonacci numbers." The agent calls `generate_and_execute_code` → `conformBody` → `streamContent` with `codeExecution` tool → streams text + inline file results back.
- Port `generate_and_execute_code` function (prompt → `conformBody` → `streamContent` with code execution tool → merge text + file results)
- Wire into dev backend `main.py`
- Tests
🎯 Objective: Open graph editor, use AI chat to edit a graph through the dev backend. Each interaction round-trips as: POST → stream → suspend → POST → stream → complete.
Design decision: reconnect, not keepalive. The SSE stream closes when the loop suspends. The client POSTs again with `{interactionId, response}` to resume. This is the only viable approach — suspends can last seconds, hours, or days. A keepalive stream cannot stay open that long, and the production backend is stateless. The dev backend must match this model.
- `SuspendError` + `SuspendResult` in `opal-backend-shared`
- `InteractionStore` — in-memory state store keyed by `interactionId`
- Loop catches `SuspendError`, returns `SuspendResult`
- Dev backend: `_start()` / `_resume()` / `_stream_loop()` with state save/load
- Client `SSEAgentEventSource`: reconnect loop — suspend → await handler → POST resume → new stream
- Tests: 11 Python (suspend/resume round-trip) + TS reconnect test
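In outline, the suspend side of the items above can be sketched like this. The names match the list; the method signatures and the one-shot `load` semantics are assumptions:

```python
import uuid


class SuspendError(Exception):
    """Raised inside a function handler when the loop must pause for the
    user; carries the suspend event that is streamed to the client."""

    def __init__(self, event: dict):
        super().__init__(event.get("type", "suspend"))
        self.event = event


class InMemoryInteractionStore:
    """Saved loop state keyed by interactionId. The client POSTs the id
    back with {interactionId, response} to resume the loop later."""

    def __init__(self) -> None:
        self._states: dict[str, dict] = {}

    def save(self, state: dict) -> str:
        interaction_id = str(uuid.uuid4())
        self._states[interaction_id] = state
        return interaction_id

    def load(self, interaction_id: str) -> dict:
        # One-shot: a resumed interaction is consumed.
        return self._states.pop(interaction_id)
```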
- Port `chat_request_user_input` + `chat_present_choices` to Python (raises `SuspendError` with `waitForInput` / `waitForChoice`)
- `waitForInput` handler in `invokeRemoteAgent` — prompt display + input
- Loop: suppress `on_finish` on suspend, suppress `on_start` on resume
- End-to-end: Generate step (Agent mode, prompt: "ask user to provide their name") through `dev:backend` — suspend → user types → resume → complete
🎯 Objective: Both `waitForInput` and `waitForChoice` work identically through local and remote paths. The rendering logic (prompt display, input collection, choice presentation) is shared — not duplicated between `AgentUI` and `invokeRemoteAgent`.
- Extract `#addChatOutput` pattern into a shared utility (`chat-output.ts`) usable by both local (`AgentUI`) and remote (`invokeRemoteAgent`) paths
- `waitForInput` remote handler uses the shared utility (deduplicate)
- `waitForChoice` remote handler with `ChoicePresenter` + `A2UIInteraction` (proper choice buttons / checkboxes)
- `A2UIInteraction` extracted as shared rendering core — `AgentUI` delegates to it, adding only pidgin translation on top
- Server-side pidgin resolution for choice labels (`from_pidgin_string` in `chat.py`) — images and file refs render correctly in choices
- End-to-end: choice-based interaction through `dev:backend`
- Port fidelity audit — all function groups (`image`, `video`, `audio`, `generate`, `chat`, `system`) verified against TS source; descriptions restored; `shared_schemas.py` centralizes `statusUpdateSchema`, `taskIdSchema`, `fileNameSchema`
🎯 Objective: `opal-backend-shared` code is pure Python (no `httpx`, `fastapi`, or `pydantic` imports in synced files). The shared code can be copybara'd to `third_party/py/opal_backend` and the production backend can inject its own HTTP transport.

```
# Verify: no transport deps in synced code
grep -r "import httpx\|from fastapi\|from sse_starlette" \
  packages/opal-backend/opal_backend/*.py \
  packages/opal-backend/opal_backend/functions/
# → no results
```
- Merge `opal-backend-shared`, `opal-backend-dev`, `opal-backend-fake` into a single `packages/opal-backend/` with one `pyproject.toml` and one `.venv`
- Directory structure: `opal_backend/` (synced), `opal_backend/local/` (not synced), `opal_backend/dev/`, `opal_backend/fake/`
- Move fake-server artifacts (`events.py`, `sse_sink.py`, `pending_requests.py`) to `local/` (not synced to google3)
- Update all imports across source and test files
- Update wireit scripts (`setup:python`, `test:python`, `dev:backend`, `dev:fake`)
- `http_client.py` — `HttpClient` protocol (synced, no deps)
- `local/http_client_impl.py` — `httpx`-based implementation (not synced)
- Update `gemini_client.py` — replace `import httpx` with `HttpClient`
- Update `conform_body.py` — replace `import httpx` with `HttpClient`
- Update `step_executor.py` — replace `import httpx` with `HttpClient`
- Thread `HttpClient` through `Loop` and function group factories
- Update all tests to inject `HttpClient`
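The shape of this split might look like the following — the method name and signature are illustrative, not the real `http_client.py` protocol:

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class HttpClient(Protocol):
    """Transport protocol the synced code depends on. The httpx-backed
    implementation lives in local/ and is never imported by synced files."""

    async def post_json(self, url: str, body: dict[str, Any]) -> dict[str, Any]: ...


class EchoHttpClient:
    """A dependency-free stand-in, e.g. for tests: structurally satisfies
    HttpClient without importing httpx."""

    async def post_json(self, url: str, body: dict[str, Any]) -> dict[str, Any]:
        return {"url": url, "echo": body}
```

Because core modules type against the protocol, swapping transports (httpx locally, something else in production) requires no change to the synced code.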
- `opal_backend/events.py` — dataclass models for all 22 `AgentEvent` types, `AgentResult`, `FileData`, request/response bodies (`StartRunRequest`, `ResumeRunRequest`), segment types — all with `to_dict()` producing camelCase JSON (no pydantic dependency)
- `agent_events.py` — `AgentEventSink` queue and `build_hooks_from_sink` emit typed events instead of `dict[str, Any]`
- `suspend.py` — `SuspendError` takes a typed `SuspendEvent` + explicit `function_call_part` parameter (moved out of the event dict)
- `loop.py` — imports wire-format types from `events.py`
- `functions/chat.py` — constructs `WaitForInputEvent` / `WaitForChoiceEvent`
- `dev/main.py` — SSE serialization via `event.to_dict()` + `json.dumps()`
- `local/sse_sink.py` — uses `to_dict()` instead of Pydantic `model_dump_json()`
- Delete `local/events.py` (replaced by synced `events.py`)
- Update all tests (`test_agent_events.py`, `test_chat_functions.py`, `test_suspend_resume.py`)
- `backend_client.py` — `BackendClient` protocol (`execute_step`, `upload_gemini_file`, `upload_blob_file`) + `HttpBackendClient` impl
- `step_executor.py` — delegates to `BackendClient`, removed HTTP logic
- `conform_body.py` — delegates to `BackendClient.upload_gemini_file`
- `loop.py` — `upstream_base: str` → `backend: BackendClient | None`
- All function groups (`image`, `audio`, `video`, `generate`) updated
- `dev/main.py` — per-request `HttpBackendClient` with `origin`
- Tests updated (step_executor, conform_body, suspend_resume)
🎯 Objective: `access_token` and `origin` are no longer parameters of `Loop`, function group factories, or `conform_body`. Credentials are a transport concern — baked into `HttpClient` and `BackendClient`.
- `HttpClient` protocol — `access_token` property
- `HttpxClient` — accepts and exposes `access_token`
- `GeminiClient` — reads `client.access_token`
- `BackendClient` protocol — removed `access_token` from all methods; `HttpBackendClient` reads from `self._client.access_token`
- `conform_body` — removed `access_token` param
- `step_executor` — removed `access_token` from functions
- All 4 function group factories — removed `access_token` param
- `Loop.__init__` — removed `access_token` and `origin`
- `dev/main.py` — per-request `HttpxClient` with baked-in token; removed `access_token` / `origin` from `InteractionState`
- Tests updated (all 322 pass)
🎯 Objective: `opal_backend.run()` is a single async iterator that takes an objective + injected deps and yields `AgentEvent`s. Consumers provide only what varies by environment. Everything else is internal.

```python
async for event in opal_backend.run(
    objective=objective,
    client=http_client,
    backend=backend_client,
    store=interaction_store,
):
    yield event
```
- `opal_backend/run.py` — `run()` async generator: creates `AgentFileSystem`, `TaskTreeManager`, `AgentEventSink`, `LoopController`, builds function groups internally, runs loop, dispatches `CompleteEvent` / `ErrorEvent`, closes sink. Accepts an extensible `flags` dict.
- `opal_backend/run.py` — `resume()` async generator: loads state from `InteractionStore`, injects function response, rebuilds function groups, runs loop, dispatches events
- `dev/main.py` — delegates to `opal_backend.run()` / `resume()`; deleted `_build_function_groups`, `_stream_loop` (~130 lines removed)
- `opal_backend/__init__.py` — re-export `run`, `resume`
- Tests: `test_run.py` — 6 tests (complete, error, suspend, failed, unknown ID, full suspend→resume round-trip). All 328 pass.
🎯 Objective: Core modules depend only on protocols. All transport-specific implementations live in `local/`.
- `backend_client.py` — protocol-only (removed `HttpBackendClient`, `HttpClient` import, logging)
- `local/backend_client_impl.py` — `HttpBackendClient` moved here
- `interaction_store.py` — converted `InteractionStore` class to a `Protocol`
- `local/interaction_store_impl.py` — `InMemoryInteractionStore` moved here
- Updated imports: `dev/main.py`, `test_step_executor.py`, `test_suspend_resume.py`, `test_run.py` (328 pass)
🎯 Objective: Agents running on the Python backend can use persistent Google Sheets-backed memory. Chat conversations are persisted and recalled across sessions. The `graph` identity is required and anchors all storage.
- `DriveOperationsClient` protocol (9 methods: Drive CRUD + Sheets read/write/batch) + `HttpDriveOperationsClient` impl (#8089)
- `InteractionStore` protocol made async (`save`, `load`, `has`, `clear`); `InMemoryInteractionStore` updated; flags/graph folded into stored interaction state (#8089)
- `AgentFileSystem` made async (`get`, `list_files`, `read_text`, `get_many`, `from_pidgin_string`) — prep for database-backed storage (#8090)
- `SheetManager` — ported from TS `SheetManager` + `memorySheetGetter`; resolves spreadsheet IDs via `DriveOperationsClient.query_files`, creates / reads / updates / deletes sheets. 21 tests. (#8091)
- 5 memory functions (`create_sheet`, `read_sheet`, `update_sheet`, `delete_sheet`, `get_metadata`) ported to Python with JSON schemas matching TS exactly. 12 tests. Wired into `run()`. (#8091)
- `AgentFileSystem` — `/mnt/memory/` path resolution via `set_sheet_manager()`. 4 tests. (#8091)
- `ChatLogManager` — seeds from existing sheets, persists new entries, registers chat log as system file. Wired into `run()` and `resume()`. (#8092)
- `graph` promoted to required sibling field (out of `flags`) across full stack; every run anchored to graph identity (#8092)
- `session_id` in `InteractionState` for cross-suspend chat log continuity (#8092)
- `_process_chat_response` — transforms `{input: LLMContent}` → `{user_input: text}` on resume, matching TS handler logic (#8092)
- `.on("error")` handler in `invokeRemoteAgent` — server errors now surface to users (#8092)
- Tests: 435 Python + 31 TS agent tests pass
Systematic audit-and-fix pass across all function groups to ensure the Python backend matches TS behavior exactly. Also adds fatal error classification so unrecoverable errors (quota-exhausted) terminate the loop immediately instead of being retried by the LLM.
- `error_classifier.py` — `to_error_or_response()` classifies errors by parsing structured JSON (`RESOURCE_EXHAUSTED`) or fuzzy keyword matching; promotes quota-exhausted to `$error` outcomes. 16 tests. (#8071)
- `function_caller.py` + `loop.py` — propagate and handle `$error` from function results (#8071)
- All generation handlers (`video`, `image`, `audio`, `generate`) route errors through `to_error_or_response` (#8071)
- `generate.py` port fidelity — `mergeTextParts` utility, `statusUpdater` TODOs, model resolution alignment. 20 tests. (#8070, #8081, #8087)
- `image.py` — batch resolution via `file_system.get_many()`, output-processing drift fix (`get_many` + 5 tests) (#8082, #8088)
- `chat.py` — `computeFormat` (`inputType` → icon name mapping) ported; `from_pidgin_string` resolution for choice labels (#8095)
- `system.py` — 3 drift fixes: route resolution via `get_original_route(href)`, intermediate file LLMContent format, `href` in `failed_to_fulfill`. 5 tests. (#8096)
- `BackendClient.stream_generate_content` — consolidated all Gemini streaming into `BackendClient`, eliminated the `HttpClient` protocol entirely (#8074, #8075)
- `ENABLE_GEMINI_BACKEND` flag — routes Gemini calls through backend (#8076)
- Tests: 441 Python tests pass
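A rough sketch of the classification logic described above — the JSON error shape and the keyword list are assumptions; the real rules live in `error_classifier.py`:

```python
import json


def to_error_or_response(message: str) -> dict:
    """Classify a failed function call: quota exhaustion becomes a fatal
    $error outcome that terminates the loop immediately; anything else is
    returned to the model as a retryable response."""
    status = None
    try:
        parsed = json.loads(message)
        if isinstance(parsed, dict):
            err = parsed.get("error")
            if isinstance(err, dict):
                status = err.get("status")
    except ValueError:
        # Not structured JSON; fall back to keyword matching.
        pass
    fatal = status == "RESOURCE_EXHAUSTED" or "quota" in message.lower()
    return {"$error": message} if fatal else {"response": message}
```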
Full parity with the in-process agent. Everything that works locally works identically through the dev backend.
- Status metadata plumbing — `StatusUpdateOptions` (`expectedDurationInSec`, `isThought`) flows through `StatusUpdateCallback` → `FunctionCallUpdateEvent.opts` → SSE wire format. All 4 TODOs in `generate.py` resolved.
- `url_context` consent flow — `FunctionDefinition` gains an optional `precondition` handler, run by `FunctionCaller` before the main handler. The consent precondition raises `SuspendError(QueryConsentEvent)` with `is_precondition_check=True`. On resume, `_resume_precondition` records the grant and re-dispatches the function call — the model never sees the consent round-trip.
- Cancel concurrent function caller tasks on suspend — `loop.py` now cancels sibling `asyncio.Task`s before saving state, eliminating the "emit() called on closed sink" warning.
- `content` event — investigation result: only consumed in local mode (`RunStateManager.pushContent`), ignored by the remote client. Removed server-side emission from `build_hooks_from_sink` and loop. The event can be fully removed if `LocalAgentRun` is removed.
- Segment data-parts transfer — moved `to_pidgin()` inside `run()` so segments are converted using the loop's own `AgentFileSystem`. Data parts from `asset` / `input` segments now survive into the loop. `run()` accepts `segments` (not `objective`) as its primary input; `dev/main.py` passes segments straight through.
- Chat resume multimodal parts — `_process_chat_response` uses `content_to_pidgin_string` to register binary parts (images, file uploads) in `AgentFileSystem` and produce pidgin text with `<file>` tags, matching the TS local path behavior.