An AI-powered semantic and keyword code search engine that helps developers quickly find relevant functions, classes, and snippets across large codebases.
AI Code Search is designed to solve the problem of finding specific code without knowing the exact keywords.
By combining the power of semantic search with traditional lexical methods, it provides a highly accurate and flexible way to navigate complex projects and monorepos.
Due to hardware compatibility there are small batches and usage of gemini and other small models + torch-cpu.
- π Semantic Search: Uses FAISS and embeddings to understand the intent behind queries, retrieving code snippets even if wording doesnβt match exactly.
- π Query Rewriting (Gemini API): Automatically expands and clarifies natural language queries using the Gemini API for improved results.
- π Multi-Language Support: Parses and indexes code from Python, C, C++, Go, and Java with the fast Tree-sitter parsing library.
- π· Function & Class Extraction: Indexes functions and classes as discrete units for more precise results compared to file-level searches.
- π Lexical Search: Falls back to keyword-based matching when semantic results are weak.
- π Basic Bug Search (Python): Includes regex-based detection for common Python issues and anti-patterns.
- Incremental Indexing: Index once and query as long as it's not modified. Improving speed.
-
Clone the repository and install dependencies.
Make sure Python 3.9 is installed. -
Create a
.envfile with:GEMINI_API=YOUR_GEMINI_API_KEY -
Run the following:
git clone https://github.com/kaifkh20/ai-codesearch.git cd ai-codesearch pip install -r req.txt
To index a codebase and begin searching, run:
# Example: Search for a function to "calculate the final price with tax"
python main.py -r "folder rep path" -q "function that calculate final price with tax"