Letta Server 0.16.7 Release Notes
173 commits since 0.16.6 | Released March 31, 2026
Highlights
Self-hosted users: this is a big upgrade. The default global context window is raised from 32k to 128k, the context window reset bug (LET-7991) is fixed, and compaction has been overhauled. If you've been running curl commands to patch your config after every ADE load, most of that pain should be gone.
Breaking Changes
- Block limits are no longer enforced -- block limit validation has been deprecated and removed from the git memory sync path (#9977, #9983). Blocks can now grow freely. If you were relying on limits to cap per-turn cost, you'll need to manage block size via other means.
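Since the server no longer caps block growth, one option is to enforce a budget on the client before writing a block value. The following is a minimal sketch under that assumption; `MAX_BLOCK_CHARS` and `truncate_block_value` are hypothetical names for illustration and are not part of Letta.

```python
# Hypothetical client-side guard; not a Letta API.
MAX_BLOCK_CHARS = 5000  # assumed budget, tune to your per-turn cost target

TRUNCATION_MARKER = "[...truncated...]\n"


def truncate_block_value(value: str, limit: int = MAX_BLOCK_CHARS) -> str:
    """Return value unchanged if it fits, otherwise keep only the newest tail."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    if len(value) <= limit:
        return value
    if limit <= len(TRUNCATION_MARKER):
        # Not enough room for the marker; keep the raw tail only.
        return value[-limit:]
    # Keep the most recent content and flag that older text was dropped.
    tail_len = limit - len(TRUNCATION_MARKER)
    return TRUNCATION_MARKER + value[-tail_len:]
```

Keeping the tail rather than the head is a design choice that favors recent content; a summarizing strategy may suit long-lived blocks better.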
Context Window & Compaction (21 fixes)
The biggest category of fixes. Self-hosted users were hit hardest by these.
- Global context window default raised from 32k to 128k (#9993) -- self-hosted servers no longer default to 32k for unknown models
- Context window preserved on conversation model override (LET-7991, #9986) -- the bug where non-default conversations fell back to 32k is fixed
- Compaction overflow fixes (#9897) -- addresses the double-compaction and runaway compaction loops
- Compaction model resets on agent model change (#10031) -- switching your agent's model no longer leaves the old summarizer model behind
- Summarizer prompt improved (#10314) -- now remembers plan files, GitHub PRs, and other structured content during summarization
- BYOK summarization fixed (#10152) -- summarizer provider fallback no longer fires for BYOK requests
- Better error surfacing -- context window exceeded errors now have descriptive messages (#10135, #10171), and compaction now warns about system prompt size (#10058)
Gemini (2 fixes)
- thought_signature preserved on function calls without reasoning (LET-8166, #10237) -- the bug blocking all Gemini 2.5+/3.x multi-turn tool calling is fixed
- Streaming interface crash fixed (#10306) -- self.model now initialized in SimpleGeminiStreamingInterface constructor (LET-8129)
Memory & memfs (10 fixes, 4 features)
- available_skills block no longer duplicates in system prompt (#10006, #10011, #10021) -- three separate fixes for the skills block multiplying and inflating context (LET-8013)
- Git memory sync deferred until stream close (#9951) -- reduces mid-stream sync failures
- System prompt recompiles on agent creation with git memory (#9950) -- new git-enabled agents no longer start with empty compiled context
- Projection-style git memory rendering (#10211) -- new rendering approach for memfs content in system prompts
- Manual block edits via API trigger recompile (#9775) -- no more stale context after API block updates
- Conversation recompile endpoint (#9848) -- POST /v1/conversations/{id}/recompile is now available
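A rough sketch of how the recompile endpoint might be called from a script. Only the method and path come from these notes; the base URL, the bearer-token header, the empty JSON body, and the helper name `build_recompile_request` are assumptions for illustration, so check your server's API reference before relying on them.

```python
import urllib.parse
import urllib.request


def build_recompile_request(
    base_url: str, conversation_id: str, token: str = ""
) -> urllib.request.Request:
    """Build (but do not send) a POST to /v1/conversations/{id}/recompile."""
    # Quote the ID so unexpected characters cannot alter the path.
    safe_id = urllib.parse.quote(conversation_id, safe="")
    url = f"{base_url.rstrip('/')}/v1/conversations/{safe_id}/recompile"
    headers = {"Content-Type": "application/json"}
    if token:
        # Assumption: the server accepts a bearer token.
        headers["Authorization"] = f"Bearer {token}"
    # Assumption: an empty JSON object is an acceptable request body.
    return urllib.request.Request(url, data=b"{}", headers=headers, method="POST")


# Example (assumed local server address; uncomment to actually send):
# req = build_recompile_request("http://localhost:8283", "conv-123")
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)
```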
Conversations (7 features)
- Conversation forking (#10234, #10263) -- fork conversations with shared message history, including the default conversation
- Sort conversations by last_message_at (#10190)
- Idempotent conversation streaming (#10147) -- OTID-based retry safety
- Request-scoped system overrides (#10227) -- per-request system prompt modifications
Streaming & Reliability (13 fixes)
- OTID retry hardening (#10229, #10209) -- stream resume with backoff, race condition fixes
- Conversation lock released earlier (#10203) -- reduces contention on concurrent requests
- Better error messages (#10207) -- known LLM errors now surface descriptive messages instead of generic failures
- BYOK error tagging (#10204, #10311) -- errors now include is_byok flag for debugging
- stream_incomplete diagnosis (#10033) -- BaseException catching to identify root causes
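The OTID-based retry safety described above depends on the client reusing the same identifier across attempts. Below is a minimal client-side sketch of that pattern with exponential backoff. It is not Letta's implementation: `send_with_retry` and the `send` callable are hypothetical, and how the OTID is attached to a request depends on your client.

```python
import time
import uuid
from typing import Callable, TypeVar

T = TypeVar("T")


def send_with_retry(
    send: Callable[[str], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call send(otid), retrying on ConnectionError with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    # Generated once and reused, so the server can recognize a retry
    # of the same logical request instead of treating it as a new one.
    otid = str(uuid.uuid4())
    for attempt in range(max_attempts):
        try:
            return send(otid)
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the last error
            sleep(base_delay * 2**attempt)  # 0.5s, 1s, 2s, ...
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the backoff schedule can be tested without waiting.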
Model Support (14 features, 20 fixes)
- GPT-5.4 -- full support including mini, nano, and fast variants (#9798, #10043)
- GLM-5 -- GLM-5, GLM-5.1, GLM-5 Turbo, GLM-4.7 (#10317, #10285, #9994)
- MiniMax M2.7 (#10093)
- Baseten -- added as provider with full frontend integration, serverless auto mode, reasoning support (#10250, #9846, #9998)
- Fireworks (#9780) and zAI coding provider (#10064)
- Opus 4.6 / Sonnet 4.6 -- adaptive thinking tokens no longer incorrectly capped (#9795)
- OpenAI proxy cleanup -- extra fields removed (#9949), parallel tool calling supported (#9879)
Security
- Local filesystem access blocked via ImageContent bypass (#3256, #10329) -- file:/// URLs in images are now rejected
- Internal MCP server targets blocked (#10009)
- SECURITY.md added (#3228)
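If you accept image URLs in your own code before forwarding them to the server, a similar scheme check can be applied upstream. This is a sketch of the general allowlist technique, not the check Letta performs; `ALLOWED_IMAGE_SCHEMES` and `is_allowed_image_url` are hypothetical names, and the allowed set is an assumption.

```python
from urllib.parse import urlparse

# Assumed allowlist; anything else (file, ftp, no scheme at all) is rejected.
ALLOWED_IMAGE_SCHEMES = {"http", "https", "data"}


def is_allowed_image_url(url: str) -> bool:
    """Return True only if the URL's scheme is on the allowlist."""
    scheme = urlparse(url.strip()).scheme.lower()
    return scheme in ALLOWED_IMAGE_SCHEMES
```

An allowlist is used instead of a blocklist so that schemes nobody thought to forbid are rejected by default.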
Infrastructure
- Readiness enforcement scaffold (M1-M3 metrics pipeline) -- request pressure, DB pool, SSE lifecycle, event loop lag monitoring
- Multi-agent tools moved to less privileged execution environment (#9779)
- Subagent agents auto-hidden on create (#10096)
- WebSocket transport for OpenAI Responses API (#9841)
For self-hosted users upgrading from 0.16.6: This release addresses the majority of issues reported in the community over the past month. The context window default change alone (#9993) eliminates the most common source of "everything breaks when I open ADE" complaints.
Full Changelog: 0.16.6...0.16.7