feat(dream): add SQLite backend for dream version control by JiajunBernoulli · Pull Request #3015 · HKUDS/nanobot

JiajunBernoulli · 2026-04-10T13:11:35Z

Replace git-based version control with SQLite to avoid conflicts with users' own git repositories. The SQLite backend provides the same API as GitStore, ensuring seamless migration.

Changes

Add SQLiteStore class with git-compatible API
Add version_backend config option (default: sqlite)
Update MemoryStore to support both backends
Add comprehensive tests for SQLiteStore
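
To make the idea of a git-compatible store on SQLite concrete, here is a minimal sketch. It is not the PR's SQLiteStore: the class name, table layout, the `commit`/`show` methods, and the `CommitInfo` fields are assumptions for illustration. Only the existence of a `log()` method and a `CommitInfo` type is taken from the review discussion.

```python
import sqlite3
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class CommitInfo:
    """Hypothetical commit record; the real fields may differ."""
    sha: str
    message: str
    timestamp: float


class SketchSQLiteStore:
    """Illustrative snapshot store (assumed design, not the PR's class)."""

    def __init__(self, db_path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(db_path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS commits ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "message TEXT NOT NULL, ts REAL NOT NULL, content TEXT NOT NULL)"
        )

    def commit(self, content: str, message: str) -> str:
        """Store a full snapshot and return its identifier."""
        with self._conn:  # commits the transaction on success
            cur = self._conn.execute(
                "INSERT INTO commits (message, ts, content) VALUES (?, ?, ?)",
                (message, time.time(), content),
            )
        return str(cur.lastrowid)

    def log(self, limit: int = 20) -> list:
        """Return the newest commits first, similar to `git log`."""
        rows = self._conn.execute(
            "SELECT id, message, ts FROM commits ORDER BY id DESC LIMIT ?",
            (limit,),
        ).fetchall()
        return [CommitInfo(str(r[0]), r[1], r[2]) for r in rows]

    def show(self, sha: str) -> Optional[str]:
        """Return the snapshot stored under `sha`, or None if unknown."""
        row = self._conn.execute(
            "SELECT content FROM commits WHERE id = ?", (int(sha),)
        ).fetchone()
        return row[0] if row else None
```

Because the history lives in a single database file rather than a `.git/` directory, a store of this shape cannot collide with a repository the user already keeps in the workspace.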

Motivation

The current dream mechanism uses git for version control, which may conflict with the user's own git repository in the workspace. This PR introduces SQLite as the default backend, avoiding such conflicts while maintaining full backward compatibility.

Configuration

agents:
  defaults:
    dream:
      version_backend: sqlite  # default, can omit
      # version_backend: git   # use legacy git backend

Closes #2980

…diately

Dream Phase 2 uses fail_on_tool_error=True, which terminates the entire run on the first tool error (e.g. old_text not found in edit_file). Normal agent runs default to False so the LLM can self-correct and retry. Dream should behave the same way.

PyJWT and cryptography are optional msteams deps; they should not be bundled into the generic dev install. Tests now skip the entire file when the deps are missing, following the dingtalk pattern.

* feat(dream): enhance memory cleanup with staleness detection
  - Phase 1: add [FILE-REMOVE] directive and staleness patterns (14-day threshold, completed tasks, superseded info, resolved tracking)
  - Phase 2: add explicit cleanup rules, file paths section, and deletion guidance to prevent LLM path confusion
  - Inject current date and file sizes into Phase 1 context for age-aware analysis
  - Add _dream_debug() helper for observability (dream-debug.log in workspace)
  - Log Phase 1 analysis output and Phase 2 tool events for debugging

  Tested with glm-5-turbo: MEMORY.md reduced from 149 to 108-129 lines across two rounds, correctly identifying and removing weather data, detailed incident info, completed research, and stale discussions.
* refactor(dream): replace _dream_debug file logger with loguru
  Remove the custom _dream_debug() helper that wrote to dream-debug.log and use the existing loguru logger instead. Phase 1 analysis is logged at debug level, tool events at info level, consistent with the rest of the codebase and with no extra log file to manage.
* fix(dream): make stale scan independent of conversation history
  Reframe Phase 1 from a single comparison task to two independent tasks: history diff AND proactive stale scan. The LLM was skipping stale content that wasn't referenced in conversation history (e.g. old triage snapshots). Now explicitly requires scanning memory files for staleness patterns on every run.
* fix(dream): correct old_text param name and truncate debug log
  - Phase 2 prompt: old_string -> old_text to match EditFileTool interface
  - Phase 1 debug log: truncate analysis to 500 chars to avoid oversized lines
* refactor(dream): streamline prompts by separating concerns
  Phase 1 owns all staleness judgment logic; Phase 2 is pure execution guidance. Remove duplicated cleanup rules from Phase 2 since Phase 1 already determines what to add/remove. Fix remaining old_string -> old_text. Total prompt size reduced ~45% (870 -> 480 tokens).
* fix(dream): add FILE-REMOVE execution guidance to Phase 2 prompt
  Phase 2 was only processing [FILE] additions and ignoring [FILE-REMOVE] deletions after the cleanup rules were removed. Add explicit mapping: [FILE] → add content, [FILE-REMOVE] → delete content.

ExecTool hardcoded bash, breaking exec on Windows. Now uses cmd.exe via COMSPEC on Windows with a curated minimal env (PATH, SYSTEMROOT, etc.) that excludes secrets. bwrap sandbox gracefully skips on Windows.
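
The shell selection described above could look roughly like the following sketch. The function name and the exact allowlist are assumptions; the commit message only names PATH and SYSTEMROOT explicitly. Passing the environment through unchanged on non-Windows platforms is also an assumption made for the example.

```python
import os
import sys

# Hypothetical allowlist; only PATH and SYSTEMROOT are named in the source.
_WINDOWS_ENV_KEYS = ("PATH", "SYSTEMROOT", "COMSPEC", "TEMP", "TMP")


def build_shell_invocation(command, platform=sys.platform, environ=None):
    """Return (argv, env) for running `command` in the platform's shell."""
    environ = os.environ if environ is None else environ
    if platform == "win32":
        # COMSPEC normally points at cmd.exe; fall back to the bare name.
        shell = environ.get("COMSPEC", "cmd.exe")
        # Copy only allowlisted variables so secrets are not inherited.
        env = {k: environ[k] for k in _WINDOWS_ENV_KEYS if k in environ}
        return [shell, "/c", command], env
    # Assumed POSIX behavior for this sketch: bash with the full environment.
    return ["bash", "-c", command], dict(environ)
```

The returned pair can be handed to `subprocess.run(argv, env=env)`; building the environment from an allowlist rather than a denylist means newly introduced secret variables are excluded by default.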

- test_exec_head_tail_truncation: use temp script file instead of python -c to avoid cmd.exe quote-parsing issues after PR HKUDS#2893
- test_grep_files_with_matches_supports_head_limit_and_offset: query full result set first to avoid mtime-dependent sort assumption

* feat(feishu): add done emoji support for reaction lifecycle
* feat(feishu): add done emoji support and update documentation

exec tool hints previously used val[:40] which cut paths mid-segment (e.g. "D:\Documents\GitHub\nanobot.worktree…"). Now uses regex to detect file paths in commands and abbreviates them properly, with smart truncation at chain separators (&&, |, ;) as fallback.
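
As an illustration of regex-based path abbreviation, the sketch below shortens long paths at segment boundaries instead of cutting at a fixed character count. The pattern, the helper names, and the choice to keep the root plus the last two segments are assumptions; the chain-separator fallback mentioned above is not shown.

```python
import re

# Assumed pattern: optional drive letter, then at least two path segments.
_PATH_RE = re.compile(r"(?:[A-Za-z]:)?(?:[\\/][^\\/\s]+){2,}")


def abbreviate_path(path, keep=2):
    """Keep the root and the last `keep` segments, eliding the middle."""
    sep = "\\" if "\\" in path else "/"
    parts = path.split(sep)
    if len(parts) <= keep + 2:
        return path  # already short enough to show in full
    return sep.join([parts[0], "…", *parts[-keep:]])


def abbreviate_command(command):
    """Abbreviate every path-like token found in a shell command."""
    return _PATH_RE.sub(lambda m: abbreviate_path(m.group(0)), command)
```

Splitting on the separator guarantees that a segment is either shown whole or elided, which avoids the mid-segment cuts produced by slicing such as `val[:40]`.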

Two improvements to Feishu streaming card experience:

1. Handle _resuming in send_delta: when a mid-turn _stream_end arrives with resuming=True (tool call between segments), flush current text to the card but keep the buffer alive so subsequent segments append to the same card instead of creating a new one.
2. Inline tool hints into streaming cards: when a tool hint arrives while a streaming card is active, append it to the card content (e.g. "🔧 web_fetch(...)") instead of sending a separate card. The hint is automatically stripped when the next delta arrives.

Made-with: Cursor

Three fixes for inline tool hints:

1. Consecutive tool hints now replace the previous one instead of stacking: the old suffix is stripped before appending the new one.
2. When _resuming flushes the buffer, any trailing tool hint suffix is removed so it doesn't persist into the next streaming segment.
3. When final _stream_end closes the card, tool hint suffix is cleaned from the text before the final card update.

Adds 3 regression tests covering all three scenarios.

Made-with: Cursor

Tool hints should be kept as permanent content in the streaming card so users can see which tools were called (matching the standalone card behavior). Previously, hints were stripped when new deltas arrived or when the stream ended, causing tool call information to disappear.

Now:

- New delta: hint becomes permanent content, delta appends after it
- New tool hint: replaces the previous hint (unchanged)
- Resuming/stream_end: hint is preserved in the final text

Updated 3 tests to verify hint preservation semantics.

Made-with: Cursor

…splay

Two display fixes based on real-world Feishu testing:

1. tool_hints.py: format_tool_hints now deduplicates by comparing the fully formatted hint string instead of tool name alone. This fixes `ls /Desktop` and `ls /Downloads` being incorrectly merged as `ls /Desktop × 2`. Truly identical calls still fold correctly. (_group_consecutive and all abbreviation logic preserved unchanged.)
2. feishu.py: inline tool hints now display one tool per line with 🔧 prefix, and use double-newline trailing to prevent Setext heading rendering when followed by markdown `---`.

Made-with: Cursor
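
The deduplication fix can be sketched as follows. This is an assumed reconstruction, not the code in tool_hints.py: `fold_hints` is a hypothetical name, and it folds only consecutive hints whose fully formatted strings are equal.

```python
from itertools import groupby


def fold_hints(formatted_hints):
    """Collapse consecutive identical hint strings into 'hint × N'."""
    folded = []
    # groupby compares whole strings, so different arguments never merge.
    for hint, group in groupby(formatted_hints):
        count = sum(1 for _ in group)
        folded.append(hint if count == 1 else f"{hint} × {count}")
    return folded
```

With this comparison key, `ls /Desktop` followed by `ls /Downloads` stays as two hints, while two identical `ls /Desktop` calls still fold into one.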

…_delta for throttling

- Make tool_hint_prefix configurable in FeishuConfig (default: 🔧)
- Delegate tool hint card updates from send() to send_delta() so hints automatically benefit from _STREAM_EDIT_INTERVAL throttling
- Fix staticmethod calls to use self.__class__ instead of self
- Document all supported metadata keys in send_delta docstring
- Add test for empty/whitespace-only tool hint with active stream buffer

- Add explicit error logging for missing file_key and message_id
- Add logging for download failures
- Change audio extension from .opus to .ogg for better Whisper compatibility
- Feishu voice messages are opus in OGG container; .ogg is more widely recognized

Port Python implementation from a1ec7b1 (websocket channel module and channel tests; excludes webui debug app).

- Use hmac.compare_digest for timing-safe static token comparison
- Add issued token capacity limit (_MAX_ISSUED_TOKENS=10000) with 429 response
- Use atomic pop in _take_issued_token_if_valid to eliminate TOCTOU window
- Enforce TLSv1.2 minimum version for SSL connections
- Extract _safe_send helper for consistent ConnectionClosed handling
- Move connection registration after ready send to prevent out-of-order delivery
- Add HTTP-level allow_from check and client_id truncation in process_request
- Make stop() idempotent with graceful shutdown error handling
- Normalize path via validator instead of leaving raw value
- Default websocket_requires_token to True for secure-by-default behavior
- Add integration tests and ws_test_client helper
- Refactor tests to use shared _ch factory and bus fixture
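
The first three hardening items can be illustrated with a small sketch. The `TokenRegistry` class, its method names, and the expiry-based validity rule are assumptions for the example; only `hmac.compare_digest`, the capacity constant, and the use of a single `pop` come from the commit message.

```python
import hmac
import time

_MAX_ISSUED_TOKENS = 10000  # value quoted in the commit message


class TokenRegistry:
    """Hypothetical holder for a static token and one-time issued tokens."""

    def __init__(self, static_token: str) -> None:
        self._static_token = static_token
        self._issued = {}  # token -> expiry timestamp (assumed semantics)

    def static_token_matches(self, candidate: str) -> bool:
        # Constant-time comparison avoids leaking match length via timing.
        return hmac.compare_digest(
            candidate.encode(), self._static_token.encode()
        )

    def issue(self, token: str, ttl: float = 60.0) -> bool:
        if len(self._issued) >= _MAX_ISSUED_TOKENS:
            return False  # the caller would answer with HTTP 429
        self._issued[token] = time.monotonic() + ttl
        return True

    def take_if_valid(self, token: str) -> bool:
        # One pop both checks and consumes the token, so there is no gap
        # between "is it present?" and "remove it" for a second caller.
        expiry = self._issued.pop(token, None)
        return expiry is not None and expiry > time.monotonic()
```

A check-then-delete sequence would let two concurrent requests both pass the check before either deletes the token; the single `pop` closes that window.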

Comprehensive guide covering wire protocol, configuration reference, token issuance, security notes, and common deployment patterns.

QQ channel improvements (on top of nightly):

- Add top-level try/except in _on_message and send() for resilience
- Use defensive getattr() for attachment attributes (botpy version compat)
- Skip file_name for image uploads to avoid QQ rendering as file attachment
- Extract only file_info from upload response to avoid extra fields
- Handle protocol-relative URLs (//...) in attachment downloads

WeCom channel improvements:

- Add _upload_media_ws() for WebSocket 3-step media upload protocol
- Send media files (image/video/voice/file) via WeCom rich media API
- Support progress messages (plain reply) vs final response (streaming)
- Support proactive send when no frame available (cron push)
- Pass media_paths to message bus for downstream processing

- Use asyncio.to_thread for file I/O to avoid blocking event loop
- Add 200MB upload size limit with early rejection
- Fix file handle leak by using context manager
- Free raw bytes early after chunking to reduce memory pressure
- Add file attachments to media_paths (was text-only, inconsistent with image)
- Use robust _sanitize_filename() instead of os.path.basename() for path safety
- Remove re-raise in send() for consistency with QQ channel
- Fix truncated media_id logging for short IDs

- Use asyncio.to_thread for file I/O to avoid blocking event loop
- Add 200MB upload size limit with early rejection
- Fix file handle leak by using context manager
- Use memoryview for upload chunking to reduce peak memory
- Add inbound download size check to prevent OOM
- Use asyncio.to_thread for write_bytes in download path
- Extract inline media_type detection to _guess_wecom_media_type()
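
A rough sketch of the upload-side ideas in this list follows: an early size check, file reads moved off the event loop with `asyncio.to_thread`, and zero-copy chunking via `memoryview`. The function names and the chunking interface are assumptions and do not reflect the channel's actual helpers.

```python
import asyncio
from pathlib import Path

MAX_UPLOAD_BYTES = 200 * 1024 * 1024  # 200MB limit from the commit message


def iter_chunks(data: bytes, chunk_size: int):
    """Yield slices of `data` without copying the underlying buffer."""
    view = memoryview(data)
    for start in range(0, len(view), chunk_size):
        yield view[start:start + chunk_size]


async def read_for_upload(path: Path) -> bytes:
    """Reject oversized files early, then read without blocking the loop."""
    size = await asyncio.to_thread(lambda: path.stat().st_size)
    if size > MAX_UPLOAD_BYTES:
        raise ValueError(f"{path.name} exceeds the upload limit ({size} bytes)")
    # Path.read_bytes opens and closes the file itself, so no handle leaks.
    return await asyncio.to_thread(path.read_bytes)
```

Slicing a `bytes` object copies each chunk, whereas slicing a `memoryview` only creates a new view, which keeps peak memory close to the size of the original payload.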

Cover helpers (sanitize_filename, guess media type), outbound send (exception handling, media-then-text order, fallback), inbound message processing (attachments, dedup, empty content), _post_base64file payload filtering, and WeCom upload/download flows.

…oken cost and latency (HKUDS#2982)

When a user is idle for longer than a configured TTL, nanobot **proactively** compresses the session context into a summary. This reduces token cost and first-token latency when the user returns: instead of re-processing a long stale context with an expired KV cache, the model receives a compact summary and fresh input.
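
The idle-compaction idea can be sketched under some assumptions: the `Session` shape, the `compact_if_idle` name, and the injected `summarize` callable are hypothetical, and a real implementation would call the model to produce the summary.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Session:
    """Hypothetical session record used only for this sketch."""
    messages: list = field(default_factory=list)
    last_active: float = field(default_factory=time.monotonic)


def compact_if_idle(session, ttl_seconds, summarize, now=None):
    """Replace the history with one summary once the session is idle."""
    now = time.monotonic() if now is None else now
    idle_for = now - session.last_active
    if idle_for < ttl_seconds or len(session.messages) <= 1:
        return False  # still active, or nothing worth compressing
    session.messages = [summarize(session.messages)]
    return True
```

Running this check from a periodic task, rather than when the next message arrives, is what makes the compression proactive: the summary is already in place by the time the user returns.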

chengyongru · 2026-04-11T14:29:16Z

Overall: The SQLite backend itself looks clean and well-tested. The API compatibility with GitStore is solid. However, the backward compatibility story needs work before this can safely merge to nightly.

1. Default value `"sqlite"` silently migrates all existing users [CRITICAL]

# schema.py — new field, defaults to sqlite
version_backend: str = "sqlite"

Every existing user who upgrades will:

Lose access to their existing dream git commit history (it stays in .git/ but the runtime stops reading it)
Start fresh with an empty SQLite DB at memory/.dream_history.db

This is a silent behavior change. The default should remain "git" so existing users keep their current behavior, and only new installs or explicit opt-ins get SQLite:

version_backend: str = "git"  # preserve existing behavior

2. `helpers.py:init_workspace()` still hardcodes GitStore [MAJOR]

nanobot/utils/helpers.py:473-474 creates a GitStore unconditionally during onboard / workspace init. After this PR, the onboard flow initializes git, but the runtime agent uses SQLite by default — you end up with both .git/ and memory/.dream_history.db in the same workspace. This should either:

Read from the same config to decide which backend to init, or
Be updated to initialize through MemoryStore instead
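
Either option amounts to routing backend creation through one place. A minimal sketch, assuming a hypothetical `create_version_store` factory and stand-in classes in place of the real GitStore and SQLiteStore:

```python
class GitStoreStub:
    """Stand-in for GitStore (hypothetical, for illustration only)."""


class SQLiteStoreStub:
    """Stand-in for SQLiteStore (hypothetical, for illustration only)."""


# Single registry consulted by both workspace init and the runtime.
_BACKENDS = {"git": GitStoreStub, "sqlite": SQLiteStoreStub}


def create_version_store(version_backend: str = "git"):
    """Build the store named by the `version_backend` config value."""
    try:
        factory = _BACKENDS[version_backend]
    except KeyError:
        raise ValueError(
            f"unknown version_backend: {version_backend!r}"
        ) from None
    return factory()
```

If onboarding and the agent runtime both call the same factory with the same config value, a workspace cannot end up with one backend initialized and the other in use.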

3. No migration path for existing git history [MAJOR]

Users with existing dream commit histories have no way to bring them into SQLite. A one-time migration would be straightforward — read all commits from GitStore via log(), replay snapshots into SQLiteStore. Even a simple nanobot migrate-dream CLI command or a startup detection (.git exists with dream commits but no .dream_history.db → offer migration) would help.
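
The replay described here could be sketched as below. The source only names `log()`; the `show()` and `commit()` methods, the newest-first ordering of `log()`, and the in-memory stand-in store are assumptions made so the example is self-contained.

```python
from dataclasses import dataclass


@dataclass
class Commit:
    sha: str
    message: str


class InMemoryStore:
    """Stand-in for either backend; the API shape is assumed."""

    def __init__(self):
        self._snapshots = []  # (Commit, content) pairs, oldest first

    def commit(self, content, message):
        sha = str(len(self._snapshots) + 1)
        self._snapshots.append((Commit(sha, message), content))
        return sha

    def log(self):
        return [c for c, _ in reversed(self._snapshots)]  # newest first

    def show(self, sha):
        return self._snapshots[int(sha) - 1][1]


def migrate_history(source, target):
    """Replay every snapshot from `source` into `target`, oldest first."""
    migrated = 0
    for info in reversed(source.log()):
        target.commit(source.show(info.sha), info.message)
        migrated += 1
    return migrated
```

Replaying oldest first keeps the commit order in the target identical to the source, so `log()` on the new store reads the same as before the migration.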

4. Duplicate `CommitInfo` definition [MINOR]

CommitInfo is defined identically in both gitstore.py:14 and sqlitestore.py:17. Consider extracting it to a shared location (e.g. nanobot/utils/version_store.py) to avoid drift.

5. Property name `git` returns `SQLiteStore` [NIT]

# memory.py:58
def git(self) -> GitStore | SQLiteStore:
    """Version store for dream history (GitStore or SQLiteStore)."""
    return self._version_store

The property is called git but may return a SQLiteStore. Consider renaming to version_store or adding an alias.
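
One way to rename without breaking callers is to keep the old name as a thin alias. This is a sketch with a hypothetical class name, not the actual MemoryStore:

```python
class MemoryStoreSketch:
    """Illustrative holder for a version store (hypothetical class)."""

    def __init__(self, version_store):
        self._version_store = version_store

    @property
    def version_store(self):
        """Backend-neutral accessor for the dream history store."""
        return self._version_store

    @property
    def git(self):
        """Backward-compatible alias for `version_store`."""
        return self.version_store
```

Existing code that reads `.git` keeps working, while new code can use the name that does not imply a particular backend.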

Summary: The SQLite implementation itself looks good. I'd suggest flipping the default to "git", syncing helpers.py, and adding a migration path.

- Change version_backend default from 'sqlite' to 'git' for backward compatibility
- Fix helpers.py to respect version_backend config instead of hardcoding GitStore
- Add migrate-dream CLI command for migrating git history to SQLite
- Extract CommitInfo to shared version_store.py to avoid duplicate definitions
- Add version_store property to MemoryStore (keep 'git' as backward compat alias)
- Add tests for default version_backend value

chengyongru and others added 27 commits April 5, 2026 22:09

Add Microsoft Teams channel on current nightly base

5857f7f

Fix MSTeams PR review follow-ups

8f0b653

fix(msteams): remove optional deps from dev extras and gate tests

3723cd7

Merge remote-tracking branch 'origin/main' into nightly

ba38d41

fix(exec): add Windows support for shell command execution

ae27d69

feat(feishu): add done emoji support for reaction lifecycle (HKUDS#2899)

473637c

feat(channels): add WebSocket server channel and tests

e00dca2

fix(websocket): handle ConnectionClosed gracefully in send and send_delta

d327c19

docs(websocket): add WebSocket channel documentation

3bece17

fix: strip <thought> blocks from Gemma 4 and similar models

7b1ce24

JiajunBernoulli mentioned this pull request Apr 10, 2026

[Bug]: Dream git store initializes a nested repo in workspace/ and overwrites workspace/.gitignore #2980

Open

2 tasks

github-actions bot mentioned this pull request Apr 11, 2026

🦞 OpenClaw Ecosystem Daily Report 2026-04-11 gsscsd/big_model_radar#168

Open

chengyongru added the enhancement New feature or request label Apr 11, 2026

chengyongru added the invalid This doesn't seem right label Apr 11, 2026

JiajunBernoulli added 3 commits April 12, 2026 08:03

fix(tests): update sync_workspace_templates mocks to accept version_backend param

a17d295

fix(tests): explicitly declare version_backend param in sync_workspace_templates mocks

cc67180

chengyongru force-pushed the nightly branch from c3b55ba to 217e1fc Compare April 12, 2026 14:55

JiajunBernoulli wants to merge 30 commits into HKUDS:nightly from JiajunBernoulli:dream-by-sqllite

8 participants