Letta Server 0.16.7 Release Notes

173 commits since 0.16.6 | Released March 31, 2026

Highlights

Self-hosted users: this is a big upgrade. The default global context window is raised from 32k to 128k, the context window reset bug (LET-7991) is fixed, and compaction has been overhauled. If you've been running curl commands to patch your config after every ADE load, most of that pain should be gone.

Breaking Changes

Block limits are no longer enforced -- block limit validation has been deprecated and removed from the git memory sync path (#9977, #9983). Blocks can now grow freely. If you were relying on limits to cap per-turn cost, you'll need to manage block size via other means.
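
If you previously leaned on block limits as a cost cap, one option is to trim block values on the client before writing them. The following is a minimal sketch of that idea, not part of Letta: the helper name `cap_block_value`, the character-based budget, and the truncation marker are all hypothetical choices made for illustration.

```python
def cap_block_value(value: str, max_chars: int, marker: str = "\n[truncated]") -> str:
    """Return value unchanged if it fits, else keep its head and append a marker.

    Hypothetical client-side helper: the server no longer enforces a limit,
    so the caller decides on max_chars and applies it before updating a block.
    """
    if max_chars < len(marker):
        raise ValueError("max_chars must be at least len(marker)")
    if len(value) <= max_chars:
        return value
    # Reserve room for the marker so the result is exactly max_chars long.
    return value[: max_chars - len(marker)] + marker
```

A character budget is only a rough proxy for tokens; if cost control matters, a tokenizer-based budget would be more precise.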

Context Window & Compaction (21 fixes)

This is the largest category of fixes, and self-hosted users were hit hardest by the underlying bugs.

Global context window default raised from 32k to 128k (#9993) -- self-hosted servers no longer default to 32k for unknown models
Context window preserved on conversation model override (LET-7991, #9986) -- the bug where non-default conversations fell back to 32k is fixed
Compaction overflow fixes (#9897) -- addresses the double-compaction and runaway compaction loops
Compaction model resets on agent model change (#10031) -- switching your agent's model no longer leaves the old summarizer model behind
Summarizer prompt improved (#10314) -- now remembers plan files, GitHub PRs, and other structured content during summarization
BYOK summarization fixed (#10152) -- summarizer provider fallback no longer fires for BYOK requests
Better error surfacing -- context window exceeded errors now have descriptive messages (#10135, #10171), and system prompt size warnings are now emitted during compaction (#10058)

Gemini (2 fixes)

thought_signature preserved on function calls without reasoning (LET-8166, #10237) -- the bug blocking all Gemini 2.5+/3.x multi-turn tool calling is fixed
Streaming interface crash fixed (#10306) -- self.model now initialized in SimpleGeminiStreamingInterface constructor (LET-8129)

Memory & memfs (10 fixes, 4 features)

available_skills block no longer duplicates in system prompt (#10006, #10011, #10021) -- three separate fixes for the skills block multiplying and inflating context (LET-8013)
Git memory sync deferred until stream close (#9951) -- reduces mid-stream sync failures
System prompt recompiles on agent creation with git memory (#9950) -- new git-enabled agents no longer start with empty compiled context
Projection-style git memory rendering (#10211) -- new rendering approach for memfs content in system prompts
Manual block edits via API trigger recompile (#9775) -- no more stale context after API block updates
Conversation recompile endpoint (#9848) -- POST /v1/conversations/{id}/recompile is now available
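
As a rough illustration of calling the recompile endpoint listed above, the sketch below builds the request with Python's standard library. Only the path `POST /v1/conversations/{id}/recompile` comes from these notes; the base URL, the bearer-token header, and the empty JSON body are assumptions and may not match your deployment.

```python
import urllib.request
from typing import Optional
from urllib.parse import quote


def build_recompile_request(
    base_url: str, conversation_id: str, token: Optional[str] = None
) -> urllib.request.Request:
    """Build (but do not send) a POST to /v1/conversations/{id}/recompile."""
    # Percent-encode the id so unusual characters cannot alter the path.
    path = f"/v1/conversations/{quote(conversation_id, safe='')}/recompile"
    headers = {"Content-Type": "application/json"}
    if token:
        # Assumption: the server accepts a bearer token in this header.
        headers["Authorization"] = f"Bearer {token}"
    # Assumption: an empty JSON object is an acceptable request body.
    return urllib.request.Request(
        base_url.rstrip("/") + path, data=b"{}", headers=headers, method="POST"
    )
```

To send it, pass the result to `urllib.request.urlopen(...)`, for example with a base URL such as `http://localhost:8283` if that is where your server listens.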

Conversations (7 features)

Conversation forking (#10234, #10263) -- fork conversations with shared message history, including the default conversation
Sort conversations by last_message_at (#10190)
Idempotent conversation streaming (#10147) -- OTID-based retry safety
Request-scoped system overrides (#10227) -- per-request system prompt modifications

Streaming & Reliability (13 fixes)

OTID retry hardening (#10229, #10209) -- stream resume with backoff, race condition fixes
Conversation lock released earlier (#10203) -- reduces contention on concurrent requests
Better error messages (#10207) -- known LLM errors now surface descriptive messages instead of generic failures
BYOK error tagging (#10204, #10311) -- errors now include is_byok flag for debugging
stream_incomplete diagnosis (#10033) -- BaseException catching to identify root causes
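
On the client side, OTID-based retry safety pairs naturally with a retry loop that reuses the same identifier on every attempt. The sketch below shows that general pattern with exponential backoff. It is not the server's implementation: `send_with_retry` and the `send` callable are hypothetical, and how the OTID is actually attached to a request is left to `send` because these notes do not specify it.

```python
import time
import uuid
from typing import Callable, TypeVar

T = TypeVar("T")


def send_with_retry(
    send: Callable[[str], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call send(otid) until it succeeds, reusing one OTID across attempts."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be >= 1")
    otid = str(uuid.uuid4())  # generated once so every retry carries the same id
    for attempt in range(max_attempts):
        try:
            return send(otid)
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    raise AssertionError("unreachable")
```

Generating the identifier outside the loop is the important part: a fresh id per attempt would defeat deduplication on the server.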

Model Support (14 features, 20 fixes)

GPT-5.4 -- full support including mini, nano, and fast variants (#9798, #10043)
GLM-5 -- GLM-5, GLM-5.1, GLM-5 Turbo, GLM-4.7 (#10317, #10285, #9994)
MiniMax M2.7 (#10093)
Baseten -- added as provider with full frontend integration, serverless auto mode, reasoning support (#10250, #9846, #9998)
Fireworks (#9780) and zAI coding provider (#10064)
Opus 4.6 / Sonnet 4.6 -- adaptive thinking tokens no longer incorrectly capped (#9795)
OpenAI proxy cleanup -- extra fields removed (#9949), parallel tool calling supported (#9879)

Security

Local filesystem access through the ImageContent bypass is now blocked (#3256, #10329) -- file:/// URLs in images are rejected
Internal MCP server targets blocked (#10009)
SECURITY.md added (#3228)
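
If you proxy or pre-validate image inputs yourself, a scheme allowlist is one simple way to mirror the file:/// rejection described above. This is a sketch under assumptions, not the server's actual check: the function name and the allowed set (`http`, `https`, `data`) are hypothetical, and the notes only state that file:/// URLs are rejected.

```python
from urllib.parse import urlparse

# Assumption: only these schemes are acceptable for image sources.
ALLOWED_IMAGE_SCHEMES = {"http", "https", "data"}


def is_allowed_image_url(url: str) -> bool:
    """Return True only if the URL uses an explicitly allowed scheme."""
    scheme = urlparse(url.strip()).scheme.lower()
    # Bare paths have an empty scheme and are rejected along with file://.
    return scheme in ALLOWED_IMAGE_SCHEMES
```

An allowlist is used rather than a blocklist of `file` so that unexpected schemes fail closed.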

Infrastructure

Readiness enforcement scaffold (M1-M3 metrics pipeline) -- request pressure, DB pool, SSE lifecycle, event loop lag monitoring
Multi-agent tools moved to less privileged execution environment (#9779)
Subagents auto-hidden on create (#10096)
WebSocket transport for OpenAI Responses API (#9841)

For self-hosted users upgrading from 0.16.6: This release addresses the majority of issues reported in the community over the past month. The context window default change alone (#9993) eliminates the most common source of "everything breaks when I open ADE" complaints.

Full Changelog: 0.16.6...0.16.7
