@vishesh92 vishesh92 commented Oct 13, 2025

This pull request introduces a comprehensive set of configuration options for performance tuning and search quality in SeaGOAT. It adds new server-side configuration sections for vector search, text search, engine processing, and query defaults, allowing users to fine-tune behavior for different repository sizes and hardware constraints. The codebase is updated to use these new config values throughout, replacing previous hardcoded limits and improving flexibility.

Configuration System Enhancements

  • Added new configuration sections (chroma, ripgrep, engine, query) to docs/configuration.md and seagoat/utils/config.py, with schema validation and sensible defaults for vector search, text search, engine processing, and query parameters. [1] [2] [3] [4]
  • Updated documentation to include detailed descriptions of new settings and practical performance tuning examples for various scenarios (large repos, memory-constrained systems, faster analysis, better search quality).
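
To make the idea of sectioned settings with defaults and validation concrete, here is a minimal sketch of overlaying user values on defaults. The section names come from this PR, but every key name and value below (`max_vector_distance`, `max_file_size_mb`, `worker_count`, `default_limit`) is a hypothetical placeholder, and the plain-Python validation is a simplified stand-in for whatever schema validation seagoat/utils/config.py actually performs.

```python
from copy import deepcopy

# Hypothetical defaults; the real keys are documented in docs/configuration.md.
DEFAULTS = {
    "chroma": {"max_vector_distance": 1.5},
    "ripgrep": {"max_file_size_mb": 1},
    "engine": {"worker_count": 2},
    "query": {"default_limit": 500},
}


def merge_config(user_config):
    """Overlay user-supplied values on the defaults, rejecting unknown keys."""
    merged = deepcopy(DEFAULTS)
    for section, values in user_config.items():
        if section not in merged:
            raise ValueError(f"unknown section: {section}")
        for key, value in values.items():
            if key not in merged[section]:
                raise ValueError(f"unknown key: {section}.{key}")
            # Only numeric settings appear in this sketch; bool is excluded
            # explicitly because it is a subclass of int in Python.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise ValueError(f"{section}.{key} must be a number")
            merged[section][key] = value
    return merged
```

Calling `merge_config({"engine": {"worker_count": 8}})` would change only that one value and leave every other section at its default, which is the behavior a user tuning a single setting would expect.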

Engine and Query Processing

  • The engine now dynamically determines the minimum number of chunks to analyze and the number of worker threads based on configuration, replacing fixed values. [1] [2]
  • Query endpoints in seagoat/server.py use configurable defaults for result limits and context lines, improving usability and consistency. [1] [2] [3]
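
The fallback from request parameters to configured defaults can be sketched roughly as follows. The parameter names (`limit`, `context_above`, `context_below`) and default values are assumptions made for illustration and may not match the names used in seagoat/server.py.

```python
# Hypothetical defaults standing in for the new `query` config section.
QUERY_DEFAULTS = {"limit": 500, "context_above": 3, "context_below": 3}


def resolve_query_params(request_args, query_config=QUERY_DEFAULTS):
    """Prefer values sent with the request; otherwise use configured defaults."""
    resolved = {}
    for name, default in query_config.items():
        raw = request_args.get(name)
        # Request values typically arrive as strings, so convert them to int.
        resolved[name] = default if raw is None else int(raw)
    return resolved
```

With this shape, a request that only sets `limit` still gets the configured context line counts, so every endpoint behaves the same way when a parameter is omitted.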

Vector Search Improvements

  • All ChromaDB vector search logic now uses configurable values for maximum vector distance, chunk fetch limits, and result over-fetching, replacing previous constants. [1] [2] [3]
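
The interaction between over-fetching and a maximum vector distance is easier to see in a small sketch. This is not the PR's code: `query_fn` is a hypothetical stand-in for the vector store query, and `overfetch_factor` and `max_distance` are illustrative names for the configurable values described above.

```python
def select_results(query_fn, limit, overfetch_factor, max_distance):
    """Fetch extra candidates, drop distant ones, and keep the closest `limit`."""
    # Ask for more candidates than needed so that filtering still leaves enough.
    candidates = query_fn(limit * overfetch_factor)
    # Each candidate is an (item, distance) pair; smaller distance means closer.
    kept = [(item, distance) for item, distance in candidates if distance <= max_distance]
    kept.sort(key=lambda pair: pair[1])
    return kept[:limit]
```

Without over-fetching, a strict distance cutoff can leave fewer results than the caller asked for; fetching a multiple of the limit first trades some extra query work for a fuller result list.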

Ripgrep Text Search Improvements

  • Ripgrep caching and search logic now use configurable file size and memory-mapped cache limits, improving scalability and resource control. [1] [2] [3]
  • Ripgrep search results are scored using the configured vector distance for consistency with vector search. [1] [2]
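
One plausible way a per-file size limit and a total cache budget could be applied is sketched below. The function name, its parameters, and the skip-and-continue policy are all assumptions made for illustration; the actual caching logic in this PR may select files differently.

```python
def files_to_cache(file_sizes, max_file_size, max_total_size):
    """Pick files for the cache while respecting per-file and total size limits."""
    selected = []
    total = 0
    for path, size in file_sizes:
        # Skip any single file that exceeds the per-file limit.
        if size > max_file_size:
            continue
        # Skip files that would push the cache past its overall budget.
        if total + size > max_total_size:
            continue
        selected.append(path)
        total += size
    return selected
```

Raising the two limits lets more of a large repository be searched from the cache, while lowering them bounds memory use on constrained machines.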

Together, these changes replace SeaGOAT's hardcoded limits with settings that users can tune to optimize performance and search quality for their specific use case.
