
perf: prompt loop loads entire conversation history into memory on every step #18136

@BYK

Description

The prompt loop in prompt.ts calls filterCompacted(stream(sessionID)) on every iteration of its while(true) loop. For long-running sessions (e.g., 7,704 messages, 27,895 parts, ~91MB of data), this loads the entire conversation history into the JS heap on each tool-call step.
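A minimal sketch of the pattern described above (all identifiers besides filterCompacted and stream are illustrative stand-ins, not the actual opencode code): because the filter drains the stream into an array inside the loop, the full history is re-materialized on every tool-call step.

```typescript
// Illustrative types standing in for opencode's message shapes.
type Part = { text: string };
type WithParts = { id: number; parts: Part[] };

// Stand-in for stream(sessionID): yields every message in the session.
function* stream(history: WithParts[]): Generator<WithParts> {
  for (const msg of history) yield msg;
}

// Stand-in for filterCompacted: drains the stream into an array,
// loading all parts eagerly.
function filterCompacted(msgs: Iterable<WithParts>): WithParts[] {
  return [...msgs];
}

// The loop shape at issue: each tool-call step reloads the whole history.
// Returns the total number of messages materialized across all steps.
function promptLoop(history: WithParts[], steps: number): number {
  let loaded = 0;
  for (let step = 0; step < steps; step++) {
    const msgs = filterCompacted(stream(history)); // full copy, every step
    loaded += msgs.length;
  }
  return loaded; // steps × history.length
}
```

With 7,704 messages and 10-50 steps, this shape materializes the history tens of times per prompt, which matches the observed RSS growth.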

The loaded WithParts[] array (~300MB after V8 object expansion) is then passed through toModelMessages → convertToModelMessages → ProviderTransform.message → convertToLanguageModelPrompt, a chain of 4-5 copy layers that each create ~60MB of wrapper objects. With 10-50 tool-call steps per prompt, peak RSS reaches 4-8GB.

Two issues compound this:

  1. No context windowing: all messages are converted to ModelMessage format even though only ~200 fit in the LLM context window (~200K tokens ≈ 800KB of text)
  2. No compaction boundary optimization: filterCompacted streams through all messages loading parts eagerly, even for compacted sessions where only messages after the boundary are needed
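Both mitigations could be combined in a single load path. A hedged sketch (loadForPrompt, the compacted flag, and the Msg shape are hypothetical names for illustration, not existing opencode APIs): skip everything before the last compaction boundary without touching parts, then keep only the most recent messages that fit the context window.

```typescript
// Hypothetical message shape: `compacted` marks a compaction boundary.
type Msg = { id: number; compacted?: boolean; parts?: string[] };

// Sketch of a boundary-aware, windowed loader (assumed design, not
// the actual fix): returns only the messages the LLM call can use.
function loadForPrompt(history: Msg[], window: number): Msg[] {
  // (2) Compaction boundary: start just after the last compacted marker,
  // scanning indices only, without loading parts for skipped messages.
  let start = 0;
  for (let i = history.length - 1; i >= 0; i--) {
    if (history[i].compacted) {
      start = i + 1;
      break;
    }
  }
  // (1) Context windowing: at most `window` messages from the tail.
  return history.slice(Math.max(start, history.length - window));
}
```

Under this sketch a 7,704-message session with a recent compaction boundary would load only the post-boundary tail (capped at the window size) instead of the full ~91MB history on every step.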

Steps to reproduce

  1. Use opencode for several days with active sessions (1000+ messages)
  2. Start a prompt in a large session
  3. Monitor RSS: watch -n1 'grep VmRSS /proc/$(pgrep -f "opencode serve")/status'
  4. Observe RSS climbing to 4-8GB during tool-call loops

OpenCode version

0.1.35

OS

Linux (Ubuntu 24.04)

Metadata

Labels

core: Anything pertaining to core functionality of the application (opencode server stuff)
perf: Indicates a performance issue or need for optimization
