The most advanced TypeScript MCP server for Qdrant with multi-client isolation, LM Studio integration, and enterprise-grade document processing
This is the ultimate evolution of RAG (Retrieval-Augmented Generation) systems, combining the best practices from:
- lance-mcp architecture & document processing
- sqlite-vss-mcp performance optimizations & concurrency
- delorenj/mcp-qdrant-memory TypeScript foundation & MCP integration
Result: A production-ready, multi-tenant RAG system with client isolation, advanced seeding, and LM Studio integration.
- Complete isolation between clients - perfect for agencies, consultants, or organizations managing multiple projects
- Separate collections for each client: `{client}_catalog` + `{client}_chunks`
- Privacy-first design for sensitive documents
- BGE-M3 embeddings (1024 dimensions) for semantic search
- Qwen3-8B summaries for document overviews
- Zero cloud dependency - everything runs locally for maximum privacy
- SHA256 deduplication - never process the same document twice (90%+ time savings on updates)
- Multi-format support - PDF, Markdown, TXT, DOCX
- Incremental updates - only process changed files
- Batch processing - efficient API usage with p-limit concurrency control
- Semantic catalog search - find documents by meaning, not just keywords
- Granular chunk search - search within specific documents
- Cross-client search - find information across all clients
- Rich metadata - source tracking, chunk indexing, similarity scores
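To illustrate how SHA256 deduplication like this typically works: hash each file's content and skip files whose hash has not changed since the last run. A minimal sketch using Node's built-in crypto module (function names here are illustrative, not the project's actual API):

```typescript
import { createHash } from "node:crypto";

// Fingerprint a document's content; identical content always yields
// the same hash, so unchanged files can be skipped on re-seeding.
export function fingerprint(content: string | Buffer): string {
  return createHash("sha256").update(content).digest("hex");
}

// Illustrative skip check against a map of previously seen hashes.
export function shouldProcess(
  content: string,
  seen: Map<string, string>, // path -> last known content hash
  path: string
): boolean {
  const hash = fingerprint(content);
  if (seen.get(path) === hash) return false; // unchanged, skip
  seen.set(path, hash);
  return true;
}
```

On an incremental re-seed, only files whose hash changed reach the expensive summary and embedding steps, which is where the quoted 90%+ time savings on updates come from.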
```shell
# Install globally for easy project setup
npm install -g claude-qdrant-mcp

# Create new project
mkdir my-rag-project
cd my-rag-project
qdrant-setup

# Or use the interactive setup
npm run setup
```

Install in an existing project:

```shell
npm install claude-qdrant-mcp

# Run interactive setup
npx qdrant-setup
```
The interactive setup handles:

- ✅ Dependency Check - Verifies Node.js, Qdrant, and LM Studio
- ✅ Environment Config - Interactive `.env` file creation
- ✅ Claude Desktop Integration - Automatic MCP server configuration
- ✅ Sample Documents - Creates test files for immediate use
- ✅ Connection Testing - Validates all services are working
```shell
# Complete setup and test in one go
npm install -g claude-qdrant-mcp && \
  mkdir my-rag && cd my-rag && \
  qdrant-setup && \
  npm run test-connection
```

After installation, you have access to:
```shell
# Interactive setup wizard
qdrant-setup

# Test all connections
npm run test-connection

# Seed documents
npm run seed -- --client work --filesdir ./documents

# Start MCP server
npm start

# Development mode
npm run watch
```

- What is This?
- Key Features
- Quick Install via NPM
- Manual Installation & Setup
- LM Studio Setup
- Usage Examples
- Architecture Deep Dive
- Performance & Scalability
- Troubleshooting
- Development
- Migration from Other Systems
- Privacy & Security
- Roadmap
- Documentation
- Contributing
- License
- Acknowledgments
- Support
- Node.js 18+
- LM Studio running locally with BGE-M3 + Qwen3 models
- Qdrant server (local Docker or Qdrant Cloud)
```shell
# Clone the repository
git clone https://github.com/marlian/claude-qdrant-mcp.git
cd claude-qdrant-mcp

# Install dependencies
npm install

# Setup environment
cp .env.example .env
# Edit .env with your configuration

# Build the project
npm run build

# Test with help
npm run seed -- --help
```

Create a `.env` file with your settings:
```shell
# Qdrant Configuration
QDRANT_URL=http://localhost:6333
QDRANT_API_KEY=your-api-key-if-using-cloud

# LM Studio Configuration
LM_STUDIO_URL=http://127.0.0.1:1235
EMBEDDING_MODEL=text-embedding-finetuned-bge-m3
EMBEDDING_DIM=1024
LLM_MODEL=qwen/qwen3-8b

# Multi-Client Setup (customize with your client names)
CLIENT_COLLECTIONS=client_a,client_b,personal,work,research

# Performance Tuning
CONCURRENCY=5
BATCH_SIZE=10
CHUNK_SIZE=500
CHUNK_OVERLAP=10
DEBUG=false
```
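The `CLIENT_COLLECTIONS` value is a comma-separated list of client names. A hedged sketch of how such a setting might be parsed into per-client names (the helper name is illustrative, not the project's actual API):

```typescript
// Parse a comma-separated CLIENT_COLLECTIONS value into clean client
// names, dropping empty entries and surrounding whitespace.
export function parseClients(raw: string | undefined): string[] {
  return (raw ?? "")
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}
```

Each resulting name then maps to its own `{client}_catalog` and `{client}_chunks` collections.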
- BGE-M3 Embedding Model
  - Download from LM Studio model library
  - Model name: `text-embedding-finetuned-bge-m3`
  - Purpose: Generate 1024-dim embeddings for semantic search
- Qwen3-8B Chat Model
  - Download from LM Studio model library
  - Model name: `qwen/qwen3-8b`
  - Purpose: Generate document summaries

Then start the services:

- Start LM Studio
- Load both models
- Start the server (default port 1235)
- Verify the connection:

```shell
curl http://127.0.0.1:1235/v1/models
```
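LM Studio serves an OpenAI-compatible API, so embedding requests go to `/v1/embeddings`. A minimal sketch of building that request body for a batch of texts (the helper name is illustrative; the model name comes from the `.env` above):

```typescript
// Shape of an OpenAI-style embeddings request, which LM Studio accepts.
interface EmbeddingsRequest {
  model: string;
  input: string[];
}

export function buildEmbeddingsRequest(texts: string[]): EmbeddingsRequest {
  return {
    model: "text-embedding-finetuned-bge-m3", // from EMBEDDING_MODEL
    input: texts,
  };
}

// Usage (not executed here): POST this body to
// `${LM_STUDIO_URL}/v1/embeddings` and read `data[i].embedding`
// (1024 floats per text for BGE-M3).
```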
```shell
# Seed documents for a specific client
npm run seed -- --client work --filesdir /path/to/work/documents

# Force overwrite existing data (full reprocessing)
npm run seed -- --client personal --filesdir /path/to/personal/docs --overwrite

# Validate documents without seeding
npm run seed -- --client research --filesdir /path/to/research/docs --validate-only

# Debug mode for troubleshooting
npm run seed -- --client client_a --filesdir /path/to/docs --debug
```

Run the MCP server:

```shell
npm start

# Or in development mode with watch
npm run watch
```

Add to your `claude_desktop_config.json`:
```json
{
  "mcpServers": {
    "qdrant-rag": {
      "command": "node",
      "args": ["/absolute/path/to/claude-qdrant-mcp/dist/index.js"],
      "env": {
        "QDRANT_URL": "http://localhost:6333",
        "QDRANT_API_KEY": "your-api-key-if-needed",
        "CLIENT_COLLECTIONS": "work,personal,research"
      }
    }
  }
}
```

Get the status of all collections and clients:
```javascript
// No parameters needed
collection_info()
// Returns: collection stats, client list, system status
```

Search document summaries for a specific client:
```json
{
  "query": "quarterly business strategy",
  "client": "work",
  "limit": 10
}
```

Search document chunks with optional source filtering:
```json
{
  "query": "machine learning implementation",
  "client": "research",
  "source": "/path/to/specific/document.md", // optional
  "limit": 5
}
```

Search across all clients and collections:
```json
{
  "query": "project management best practices",
  "limit": 20
}
```

```
Qdrant Collections:
├── work_catalog       # Document summaries for work
├── work_chunks        # Document chunks for work
├── personal_catalog   # Document summaries for personal
├── personal_chunks    # Document chunks for personal
├── research_catalog   # Document summaries for research
├── research_chunks    # Document chunks for research
└── ... (per client)
```
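The `{client}_catalog` / `{client}_chunks` naming convention above can be captured in a small helper; a sketch (the function name is illustrative, not the project's actual API):

```typescript
// Derive the two collection names that isolate one client's data:
// a catalog of document summaries and a store of document chunks.
export function collectionsFor(client: string): {
  catalog: string;
  chunks: string;
} {
  return { catalog: `${client}_catalog`, chunks: `${client}_chunks` };
}
```

Because every Qdrant query is scoped to one of these collections, one client's documents can never surface in another client's search results.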
```
Documents → Hash Check → Content Extract → LM Summary →
Chunk Split → BGE-M3 Embed → Batch Process → Qdrant Store → MCP Search
```
- Directory Scan - Find all supported documents (.pdf, .md, .txt, .docx)
- Hash Validation - SHA256 deduplication (skip unchanged files)
- Content Processing - Extract text using appropriate parsers
- Summary Generation - LM Studio Qwen3 creates document overviews
- Chunk Creation - Split documents with configurable overlap
- Batch Embedding - BGE-M3 vectorization in efficient batches
- Qdrant Storage - Dual collection storage (catalog + chunks)
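The chunk-creation step splits text into fixed-size windows with configurable overlap (`CHUNK_SIZE` / `CHUNK_OVERLAP` in `.env`). A simplified character-based sketch; the actual implementation may split on tokens or sentence boundaries instead:

```typescript
// Split text into chunks of `size` characters, where each chunk
// overlaps the previous one by `overlap` characters so that context
// spanning a boundary is not lost.
export function chunkText(
  text: string,
  size: number,
  overlap: number
): string[] {
  if (size <= overlap) throw new Error("size must exceed overlap");
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += size - overlap) {
    chunks.push(text.slice(i, i + size));
    if (i + size >= text.length) break; // last window reached the end
  }
  return chunks;
}
```

For example, a 12-character string with `size = 5` and `overlap = 2` yields four chunks, each sharing its first two characters with the tail of the previous one.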
- Concurrency Control - p-limit prevents API overload
- Batch Processing - Multiple embeddings per API call
- Smart Caching - SHA256 prevents duplicate processing
- Memory Efficient - Streaming document processing
- Error Recovery - Graceful handling of failures
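Batch processing here means sending several chunks per embeddings call rather than one request per chunk. A small sketch of the kind of batching helper such a pipeline might use (`BATCH_SIZE` from `.env`; the helper name is illustrative):

```typescript
// Group items into batches of at most `size`, trading many small API
// calls for fewer, larger ones.
export function toBatches<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

Each batch then becomes one `input` array in an embeddings request, and a concurrency limiter caps how many batches are in flight at once.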
| Metric | Performance | Notes |
|---|---|---|
| Documents/minute | 50-100 | Depends on document size and LM Studio performance |
| Memory usage | 100-500MB | During processing, minimal at rest |
| Search latency | <200ms | Average semantic search response time |
| Concurrency | 5 parallel | Configurable based on system resources |
| Hash optimization | 90%+ savings | On incremental updates |
- Multi-client isolation - No data leakage between clients
- Horizontal scaling - Add more Qdrant nodes as needed
- Local-first - No external API dependencies or costs
- Incremental processing - Only process changed documents
"LM Studio connection failed"

```shell
# Check LM Studio is running
curl http://127.0.0.1:1235/v1/models

# Verify models are loaded:
# BGE-M3 for embeddings, Qwen3 for summaries
```

"Qdrant connection failed"
```shell
# Check Qdrant server (local)
curl http://localhost:6333/collections

# Check Qdrant Cloud with API key
curl -H "api-key: YOUR_KEY" https://your-cluster.qdrant.io/collections
```

"No documents found"
```shell
# Check the file path exists and contains supported formats
ls -la /path/to/documents

# Verify supported file types (.pdf, .md, .txt, .docx)
find /path/to/documents -name "*.md" -o -name "*.pdf" -o -name "*.txt" -o -name "*.docx"
```

Enable comprehensive logging:

```shell
export DEBUG=true
npm run seed -- --client test --filesdir ./sample-docs --debug
```

```
src/
├── config.ts          # Enhanced configuration system
├── types.ts           # RAG document types & interfaces
├── index.ts           # MCP server & tool handlers
├── seed.ts            # Ultimate document processing engine
├── persistence/
│   └── qdrant.ts      # Multi-collection Qdrant client
└── validation.ts      # Input validation & safety
```
```shell
# Development build
npm run build

# Watch mode for development
npm run watch

# Test processing without modifying database
npm run seed -- --validate-only --client test --filesdir ./test-docs
```

To add a new client:

- Update `CLIENT_COLLECTIONS` in `.env`
- Run the seed command with the new client name
- Collections are created automatically
- Collections replace single database files
- Enhanced config replaces hardcoded settings
- Multi-client replaces single-tenant approach
- Cloud sync replaces local-only storage
- Qdrant replaces SQLite + VSS for better performance
- TypeScript replaces Python implementation
- MCP integration replaces custom API
- RAG document model replaces knowledge graph entities
- LM Studio replaces OpenAI for cost-free local processing
- Multi-collection replaces single collection architecture
- Local-first processing - Documents never leave your machine
- Client isolation - Complete data separation between clients
- No external APIs - LM Studio runs entirely offline
- Hash-based deduplication - Secure content fingerprinting
- Configurable storage - Use local Qdrant or secure cloud instances
- Web UI for collection management and search
- Additional embedding models (support for other local models)
- Advanced chunking strategies (semantic splitting)
- Hybrid search (combine vector + keyword search)
- Export/import collections for backup and sharing
- Obsidian plugin for direct vault integration
- API server mode for external applications
- Batch processing for large document sets
- Real-time file watching for automatic updates
Looking for deeper details, integrations or low-level references?
Check out the full documentation under /docs:
- Claude Project Instructions - AI agent behavior and search workflows
- Claude Desktop Integration - Setup guide for local LM Studio
- Advanced Configuration - Power user setup and tuning
- MCP Tools Reference - Tool descriptions, parameters, and examples
- Setup guides for LM Studio, Qdrant, and Claude Desktop integration
- Performance benchmarks and optimization tips
- Troubleshooting guides for common issues
- API reference for all MCP tools
- Best practices for multi-client setups
This project combines the best ideas from multiple RAG implementations. Contributions welcome for:
- Performance optimizations
- Additional document formats
- Enhanced search capabilities
- New embedding models support
- UI/dashboard development
- Documentation improvements
- Fork the repository
- Create a feature branch
- Make your changes with tests
- Submit a pull request with detailed description
MIT License - Use freely for personal and commercial projects.
Built upon the excellent work of:
- lance-mcp - Document processing architecture inspiration
- sqlite-vss-mcp - Performance optimization patterns
- delorenj/mcp-qdrant-memory - TypeScript MCP foundation
- Qdrant - Vector search engine
- LM Studio - Local LLM hosting platform
- BGE-M3 - Multilingual embedding model
- Qwen3 - Document summarization model
- GitHub Issues - Bug reports and feature requests
- GitHub Discussions - Questions and community support
- Documentation - Comprehensive guides and references
For detailed API documentation, see MCP Tools Reference. For advanced setup, see Advanced Configuration.
The most advanced TypeScript RAG system with enterprise-grade features, multi-client isolation, and local-first privacy.