perf: prompt loop loads entire conversation history into memory on every step #18136
Open
Labels
core: Anything pertaining to core functionality of the application (opencode server stuff)
perf: Indicates a performance issue or need for optimization
Description
The prompt loop in prompt.ts calls filterCompacted(stream(sessionID)) on every iteration of its while(true) loop. For long-running sessions (e.g., 7,704 messages, 27,895 parts, ~91MB of data), this loads the entire conversation history into the JS heap on each tool-call step.
The loaded WithParts[] array (~300MB after V8 object expansion) is then passed through toModelMessages → convertToModelMessages → ProviderTransform.message → convertToLanguageModelPrompt: 4-5 copy layers, each creating ~60MB of wrapper objects. With 10-50 tool-call steps per prompt, peak RSS reaches 4-8GB.
Two issues compound this:
- No context windowing: all messages are converted to ModelMessage format even though only ~200 fit in the LLM context window (~200K tokens ≈ 800KB of text)
- No compaction boundary optimization: filterCompacted streams through all messages, loading parts eagerly, even for compacted sessions where only messages after the boundary are needed
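One possible direction for both points, as a minimal sketch rather than the actual opencode code: read messages newest-first and stop as soon as either a token budget or a compaction boundary is reached, so older messages are never loaded. The WithParts shape below, the compactionBoundary flag, the loadFromNewest callback, and the 4-characters-per-token estimate (taken from the ~200K tokens ≈ 800KB figure above) are all assumptions made for illustration. Treating the boundary message itself as part of the window is also an assumption.

```typescript
// Hypothetical, simplified shapes; the real WithParts type in opencode is richer.
interface Part { text: string }
interface WithParts { id: string; compactionBoundary?: boolean; parts: Part[] }

// Rough heuristic derived from the issue text: ~200K tokens ≈ 800KB, so ~4 chars per token.
const CHARS_PER_TOKEN = 4;

function estimateTokens(msg: WithParts): number {
  let chars = 0;
  for (let i = 0; i < msg.parts.length; i++) chars += msg.parts[i].text.length;
  return Math.ceil(chars / CHARS_PER_TOKEN);
}

// loadFromNewest(0) returns the newest message, loadFromNewest(1) the one before it,
// and undefined once the history is exhausted. Loading is lazy: messages older than
// the window are never requested.
function windowMessages(
  loadFromNewest: (offset: number) => WithParts | undefined,
  tokenBudget: number,
): WithParts[] {
  const kept: WithParts[] = [];
  let used = 0;
  for (let offset = 0; ; offset++) {
    const msg = loadFromNewest(offset);
    if (msg === undefined) break;
    const cost = estimateTokens(msg);
    // Always keep the newest message, even if it alone exceeds the budget.
    if (kept.length > 0 && used + cost > tokenBudget) break;
    kept.push(msg);
    used += cost;
    // Nothing older than a compaction boundary is needed.
    if (msg.compactionBoundary) break;
  }
  return kept.reverse(); // back to chronological order
}

// Usage: five 40-character messages (~10 tokens each), newest first.
const FORTY = "0123456789" + "0123456789" + "0123456789" + "0123456789";
const sessionLog: WithParts[] = [
  { id: "m5", parts: [{ text: FORTY }] },
  { id: "m4", parts: [{ text: FORTY }] },
  { id: "m3", compactionBoundary: true, parts: [{ text: FORTY }] },
  { id: "m2", parts: [{ text: FORTY }] },
  { id: "m1", parts: [{ text: FORTY }] },
];
const recent = windowMessages((offset) => sessionLog[offset], 25);
console.log(recent.map((m) => m.id)); // [ 'm4', 'm5' ]
```

With a larger budget the walk would instead stop at m3, the boundary message, so m2 and m1 are never requested in either case. A real implementation would back loadFromNewest with a reverse-ordered storage query rather than an in-memory array.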
Steps to reproduce
- Use opencode for several days with active sessions (1000+ messages)
- Start a prompt in a large session
- Monitor RSS: watch -n1 'grep VmRSS /proc/$(pgrep -f "opencode serve")/status'
- Observe RSS climbing to 4-8GB during tool-call loops
OpenCode version
0.1.35
OS
Linux (Ubuntu 24.04)