
fix: scope database health check to current repo to enable concurrent sessions #535

Closed
lchachurski wants to merge 2 commits into shotgun-sh:main from lchachurski:fix/scope-database-health-check-to-current-repo

lchachurski (Contributor) commented Apr 10, 2026

Summary

Eliminates the cross-repo database locking that prevents concurrent Shotgun sessions on different repositories. The fix touches every code path that previously opened all .kuzu databases in ~/.shotgun-sh/codebases/; the one remaining full scan is the unscoped list_graphs() call described under Known Limitation.

Changes

manager.py — core engine:

  • Adds .meta.json sidecar files written alongside each .kuzu database, containing repo_path, name, and indexed_from_cwds. These enable directory-based filtering without acquiring any Kuzu file locks.
  • Sidecar lifecycle: created on graph init, updated on CWD access changes, removed on graph/database deletion.
  • list_graphs(graph_ids=None) — new optional parameter to open only specified databases instead of globbing all.
  • cleanup_corrupted_databases(graph_ids=None) — same scoping pattern.
  • detect_database_issues(graph_ids=None) — same scoping pattern (from first commit).
  • Consolidated duplicated loop bodies into single candidate-list + iteration pattern.
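
The sidecar lifecycle described above could look roughly like the following. This is a minimal sketch, not the code from this PR: the helper names (`sidecar_path`, `write_sidecar`, `read_sidecar`, `remove_sidecar`), the `<graph_id>.meta.json` naming, and the write-then-rename step are assumptions made for illustration. Only the three field names come from the description.

```python
import json
from pathlib import Path
from typing import Optional


def sidecar_path(db_path: Path) -> Path:
    # Assumed naming: "<graph_id>.kuzu" sits next to "<graph_id>.meta.json".
    return db_path.with_suffix(".meta.json")


def write_sidecar(db_path: Path, repo_path: str, name: str,
                  indexed_from_cwds: list) -> Path:
    """Create or update the sidecar without touching the database itself."""
    meta = {
        "repo_path": repo_path,
        "name": name,
        "indexed_from_cwds": sorted(set(indexed_from_cwds)),
    }
    target = sidecar_path(db_path)
    # Write to a temporary file and rename it, so a reader never sees a
    # half-written JSON document (an assumption, not confirmed by the PR).
    tmp = target.with_suffix(".tmp")
    tmp.write_text(json.dumps(meta, indent=2), encoding="utf-8")
    tmp.replace(target)
    return target


def read_sidecar(db_path: Path) -> Optional[dict]:
    """Return the sidecar contents, or None if it is missing or unreadable."""
    try:
        return json.loads(sidecar_path(db_path).read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return None


def remove_sidecar(db_path: Path) -> None:
    """Delete the sidecar when its graph or database is deleted."""
    sidecar_path(db_path).unlink(missing_ok=True)
```

Because these helpers only read and write plain JSON, they can run while another process holds the Kuzu lock on the neighbouring database.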

service.py — service layer:

  • list_graphs_for_directory() now reads .meta.json sidecars first to identify candidates, then opens only matching databases. Falls back to full scan when no sidecars exist (backward compat with pre-existing databases).
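
A rough sketch of the sidecar-first lookup, under stated assumptions: `candidate_graph_ids` is a hypothetical helper, and the matching rule (the CWD equals `repo_path` or appears in `indexed_from_cwds`) is a guess at what "identify candidates" means here. Returning `None` stands in for the "no sidecars exist, fall back to a full scan" case.

```python
import json
from pathlib import Path
from typing import Optional

SIDECAR_SUFFIX = ".meta.json"


def candidate_graph_ids(storage_dir: Path, cwd: str) -> Optional[list]:
    """Return graph_ids whose sidecar mentions cwd.

    Returns None when no sidecars exist at all, which the caller can treat
    as the signal to fall back to a full scan of pre-existing databases.
    """
    sidecars = sorted(storage_dir.glob("*" + SIDECAR_SUFFIX))
    if not sidecars:
        return None

    matches = []
    for sidecar in sidecars:
        try:
            meta = json.loads(sidecar.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # unreadable sidecar: skip it rather than fail the lookup
        if not isinstance(meta, dict):
            continue
        # Assumed matching rule; the real service may compare paths differently.
        if cwd == meta.get("repo_path") or cwd in meta.get("indexed_from_cwds", []):
            matches.append(sidecar.name[: -len(SIDECAR_SUFFIX)])
    return matches
```

Only the graph_ids returned here would then be opened, so databases locked by sessions in other repositories are never touched.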

Agent tools (file_read.py, directory_lister.py, codebase_shell.py):

  • Replaced list_graphs() + linear search with direct get_graph(graph_id) when the caller already knows the graph_id.
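
To illustrate why the direct lookup matters, here is a toy comparison. `FakeManager` is a stand-in written for this example (the real manager's methods and return types are not shown in this PR and may well be async); it only records which databases each call would open.

```python
class FakeManager:
    """Stand-in manager that records every database it 'opens'."""

    def __init__(self, graphs):
        self._graphs = dict(graphs)  # graph_id -> name
        self.opened = []

    def _open(self, graph_id):
        self.opened.append(graph_id)
        return {"graph_id": graph_id, "name": self._graphs[graph_id]}

    def list_graphs(self):
        # Old pattern: every database is opened just to build the list.
        return [self._open(graph_id) for graph_id in self._graphs]

    def get_graph(self, graph_id):
        # New pattern: only the requested database is opened.
        if graph_id not in self._graphs:
            return None
        return self._open(graph_id)


def find_by_listing(manager, graph_id):
    """list_graphs() plus linear search, as the agent tools did before."""
    return next((g for g in manager.list_graphs() if g["graph_id"] == graph_id), None)


def find_directly(manager, graph_id):
    """Direct lookup when the caller already knows the graph_id."""
    return manager.get_graph(graph_id)
```

With three indexed repositories, `find_by_listing` opens all three databases while `find_directly` opens one, which is the difference between colliding with a peer session's lock and never seeing it.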

chat_screen.py:

  • Scoped cleanup_corrupted_databases() call to the graph being indexed.
  • Scoped detect_database_issues() retry flows to relevant graph_ids.

app.py:

  • Scoped startup detect_database_issues() to current repo's graph_id only.

Root Cause

On startup, detect_database_issues() globbed every .kuzu file in ~/.shotgun-sh/codebases/ and attempted to open each one. When another Shotgun instance was already running on a different repository, its database file was exclusively locked by Kuzu. The health check reported this as a "locked database" issue, and the TUI showed a blocking dialog — even though the current repo's database was perfectly accessible.

The same pattern existed in list_graphs(), cleanup_corrupted_databases(), and list_graphs_for_directory() — all opening every database for every operation.
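
The scoping pattern shared by these methods can be sketched as one candidate list followed by a single loop. This is an illustration under assumptions, not the PR's code: `try_open` is an injected stand-in for opening a Kuzu database, `candidate_databases` is a hypothetical helper name, and the `(graph_id, message)` issue shape is invented for the example.

```python
from pathlib import Path
from typing import Callable, Iterable, Optional


def candidate_databases(storage_dir: Path,
                        graph_ids: Optional[Iterable[str]] = None) -> list:
    """Build the list of database paths to inspect."""
    if graph_ids is None:
        # Unscoped: the original behaviour, every .kuzu entry is a candidate.
        return sorted(storage_dir.glob("*.kuzu"))
    # Scoped: only the named graphs, and only if they exist on disk.
    paths = (storage_dir / f"{graph_id}.kuzu" for graph_id in graph_ids)
    return [p for p in paths if p.exists()]


def detect_database_issues(storage_dir: Path,
                           try_open: Callable[[Path], None],
                           graph_ids: Optional[Iterable[str]] = None) -> list:
    """Return (graph_id, error message) pairs for databases that fail to open."""
    issues = []
    for db_path in candidate_databases(storage_dir, graph_ids):
        try:
            try_open(db_path)  # stand-in for opening the Kuzu database
        except Exception as exc:  # a locked or corrupted database lands here
            issues.append((db_path.stem, str(exc)))
    return issues
```

Called without `graph_ids`, the sketch reproduces the bug (peer databases show up as issues); called with the current repo's graph_id, locked peers are never opened.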

Test Results

Tested with real_ladybug==0.15.1 against 3 simulated repos (atoma, auto-research, voice-feedback-analytics — mirroring the actual layout) using multiprocessing to hold database locks:

T1 OLD blocks with peer locks:      PASS (bug reproduced — 2 locked, TUI would block)
T2 NEW scoped detect:               PASS (only checks own database)
T3 list_graphs scoped:              PASS (opens 1/3, not 3/3)
T4 sidecar filtering 3 repos:       PASS (each CWD sees only its own graph)
T5 unrelated CWD empty:             PASS (no false positives)
T6 scoped cleanup:                  PASS (corrupted DB found, locked peers untouched)
T7 backward compat:                 PASS (full scan works when no sidecars/locks)
T8 uniqueness:                      PASS (3 distinct graph_ids)

Known Limitation

The list_graphs() method (used by codebase_shell when no graph_id is specified) still falls back to a full scan. This is the "show all indexed codebases" use case — rare during normal operation but could be improved with a registry approach in the future.

Test Plan

  • Automated 8-test suite with real_ladybug confirming bug reproduction and fix across 3 repos
  • Manual test: start Shotgun in 3 different repos simultaneously — all should run concurrently
  • Verify single-instance startup is unaffected (backward compat)
  • Verify .meta.json files are created during indexing and cleaned up on deletion

Closes #534

… sessions

On startup, detect_database_issues() scanned every .kuzu file in
~/.shotgun-sh/codebases/. When another Shotgun instance was running on
a different repo, its database appeared locked, triggering a blocking
"database locked" dialog — even though the current repo's database was
perfectly accessible.

Changes:
- Add optional graph_ids parameter to detect_database_issues() so
  callers can limit the check to specific databases.
- Consolidate the two scan branches into a single loop over candidates.
- In app.py run()/serve(), compute the current repo's graph_id and
  pass it so only that database is health-checked at startup.
- In chat_screen.py retry logic, scope re-checks to the specific
  graph_ids that had issues.

When graph_ids is None (e.g. list_graphs), the full glob scan is
preserved for backward compatibility.

Closes shotgun-sh#534

Made-with: Cursor
lchachurski force-pushed the fix/scope-database-health-check-to-current-repo branch from 5e9374a to e6a16cf on April 10, 2026 13:51

The initial fix scoped only detect_database_issues(). This commit
extends isolation to every code path that previously opened all .kuzu
databases in ~/.shotgun-sh/codebases/:

manager.py:
- Add .meta.json sidecar files written alongside each .kuzu database.
  These tiny JSON files contain repo_path, name, and indexed_from_cwds,
  enabling directory-based filtering without acquiring any Kuzu locks.
- Sidecar lifecycle: written on graph creation (_initialize_graph_metadata),
  updated on add/remove_cwd_access, deleted on delete_graph/delete_database.
- list_graphs(graph_ids=None): new optional parameter to open only
  specified databases instead of globbing all .kuzu files.
- cleanup_corrupted_databases(graph_ids=None): same scoping pattern.

service.py:
- list_graphs_for_directory() now reads .meta.json sidecars first to
  identify candidate graph_ids, then opens only matching databases.
  Falls back to full scan when no sidecars exist (backward compat).

Agent tools (file_read, directory_lister, codebase_shell):
- Replace list_graphs() + linear search with direct get_graph(graph_id)
  when the caller already knows the graph_id.

chat_screen.py:
- Scope cleanup_corrupted_databases() to the graph being indexed.

Tested with 3 simulated repos (atoma, auto-research, voice-feedback)
under real Kuzu file locking via multiprocessing. All 8 tests pass.

Closes shotgun-sh#534

Made-with: Cursor
lchachurski added a commit to lchachurski/shotgun that referenced this pull request Apr 13, 2026
Adds a process-wide class variable _scope_graph_ids to
CodebaseGraphManager, set once at startup from the CWD. A single
_iter_databases() helper replaces every .glob("*.kuzu") call, so all
existing and future enumeration methods automatically respect the scope.

Root cause: detect_database_issues(), list_graphs(), and
cleanup_corrupted_databases() all globbed every .kuzu file in
~/.shotgun-sh/codebases/. When another Shotgun session was running on
a different repo, its database was exclusively locked by Kuzu. The
health check reported this as an issue, showing a blocking "database
locked" dialog — even though the current repo's database was fine.

Changes:
- manager.py: add _scope_graph_ids ClassVar (default None) and
  _iter_databases() method. Replace 3 glob sites with _iter_databases().
- main.py: set _scope_graph_ids = [generate_graph_id(cwd)] once in
  main() before any subcommand runs. Covers TUI, shotgun run, and all
  other CLI commands in a single place.
- cli/codebase/commands.py: unset scope in list_codebases() — the one
  command that legitimately needs to enumerate all indexed databases.

This approach is ~15 lines of meaningful change across 3 files with
zero parameter passing. Any future method that enumerates databases
uses _iter_databases() and automatically inherits the scope.
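
A minimal sketch of the class-level scope described in this commit message. The names `_scope_graph_ids`, `_iter_databases`, `CodebaseGraphManager`, and `generate_graph_id` come from the message; everything else is assumed. In particular, the hashing inside `generate_graph_id` and the constructor signature are placeholders, since neither is shown here.

```python
import hashlib
from pathlib import Path
from typing import ClassVar, Iterator, Optional


def generate_graph_id(repo_path: str) -> str:
    # Placeholder: the real function's scheme is not shown in this thread.
    resolved = str(Path(repo_path).resolve())
    return hashlib.sha256(resolved.encode("utf-8")).hexdigest()[:12]


class CodebaseGraphManager:
    # Process-wide scope; None means "enumerate everything".
    _scope_graph_ids: ClassVar[Optional[list]] = None

    def __init__(self, storage_dir: Path) -> None:
        self.storage_dir = storage_dir

    def _iter_databases(self) -> Iterator[Path]:
        """Single enumeration point that replaces each .glob("*.kuzu") call."""
        scope = type(self)._scope_graph_ids
        if scope is None:
            yield from sorted(self.storage_dir.glob("*.kuzu"))
            return
        for graph_id in scope:
            db_path = self.storage_dir / f"{graph_id}.kuzu"
            if db_path.exists():
                yield db_path
```

At startup the entry point would set `CodebaseGraphManager._scope_graph_ids = [generate_graph_id(cwd)]`, and a command that must list every codebase would reset it to `None` first. The trade-off is the usual one for process-wide state: no parameter threading, at the cost of a setting that every caller in the process shares.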

Alternative approach: PR shotgun-sh#535 patches each method individually using
optional graph_ids parameters and .meta.json sidecar files.

Tested with real_ladybug==0.15.1 against 3 simulated repos under real
Kuzu file locks (multiprocessing). All 8 tests pass:
  T1: bug reproduced (OLD globs all, 2 locked, blocks 3rd session)
  T2: fix validates (NEW scope, alpha/beta never opened)
  T3: list_graphs scoped (opens 1/3 not 3/3)
  T4: each CWD yields correct scope isolation
  T5: unrelated CWD returns empty
  T6: scoped cleanup avoids locked peers
  T7: backward compat (scope=None full scan still works)
  T8: graph_id uniqueness across 3 repos

Closes shotgun-sh#534

Made-with: Cursor
scottfrasso pushed a commit that referenced this pull request Apr 14, 2026
…ns (#538)

lchachurski (Contributor, Author) commented

better solution selected

Development

Successfully merging this pull request may close these issues.

[BUG] Cannot run concurrent sessions on different repositories - startup locks ALL codebase databases
