Features • Installation • Quick Start • Documentation • Architecture
A Python-based Model Context Protocol (MCP) server that gives LLMs persistent, hippocampus-inspired memory across sessions. Store, retrieve, consolidate, and forget memories using semantic similarity search powered by vector embeddings.
Why Hippocampus?

Just as the human brain's hippocampus consolidates short-term memories into long-term storage, this server manages LLM memory through biology-inspired patterns:
- Consolidation - Merge similar memories to reduce redundancy
- Forgetting - Remove outdated information based on age/importance
- Semantic Retrieval - Find relevant memories through meaning, not keywords
| Feature | Description |
|---|---|
| Vector Storage | FAISS-powered semantic similarity search |
| MCP Compliant | Full MCP 1.2.0 spec compliance via FastMCP |
| Bio-Inspired | Hippocampus-style consolidation and forgetting |
| Security | Input validation, rate limiting, injection prevention |
| Semantic Search | Sentence transformer embeddings (CPU-optimized) |
| Unlimited Storage | No memory count limits, only per-item size limits |
| 100% Free | Local embedding model - no API costs |
memory_read          # Retrieve memories by semantic similarity
memory_write         # Store new memories with tags & metadata
memory_consolidate   # Merge similar memories
memory_forget        # Remove memories by age/importance/tags
memory_stats         # Get system statistics

pip install hippocampus-memory-mcp

Prerequisites: Python 3.9+ • ~200MB disk space (for embedding model)
Add to your Claude Desktop config (claude_desktop_config.json):
{
"mcpServers": {
"memory": {
"command": "python",
"args": ["-m", "memory_mcp_server.server"]
}
}
}

That's it! Claude will now have persistent memory across conversations.
# Clone the repository
git clone https://github.com/jameslovespancakes/Memory-MCP.git
cd Memory-MCP
# Install dependencies
pip install -r requirements.txt
# Run the server
python -m memory_mcp_server.server

Once connected to Claude, use natural language:
"Remember that I prefer Python for backend development"
β Claude calls memory_write()
"What do you know about my programming preferences?"
β Claude calls memory_read()
"Consolidate similar memories to clean up storage"
β Claude calls memory_consolidate()
from memory_mcp_server.storage import MemoryStorage
from memory_mcp_server.tools import MemoryTools
storage = MemoryStorage(storage_path="my_memory")
await storage._ensure_initialized()
tools = MemoryTools(storage)
# Store with tags and importance
await tools.memory_write(
text="User prefers dark mode UI",
tags=["preference", "ui"],
importance_score=3.0,
metadata={"category": "settings"}
)

# Semantic search
result = await tools.memory_read(
query_text="What are my UI preferences?",
top_k=5,
min_similarity=0.3
)
# Filter by tags and date
result = await tools.memory_read(
query_text="Python learning",
tags=["learning", "python"],
date_range_start="2024-01-01"
)

# Merge similar memories (threshold: 0.85)
result = await tools.memory_consolidate(similarity_threshold=0.85)
print(f"Merged {result['consolidated_groups']} groups")

# Remove by age
await tools.memory_forget(max_age_days=30)
# Remove by importance
await tools.memory_forget(min_importance_score=2.0)
# Remove by tags
await tools.memory_forget(tags_to_forget=["temporary"])

Run the included test suite:

python test_memory.py

This tests all 5 operations with sample data.
┌─────────────────────────────────────────────────────┐
│          MCP Client (Claude Desktop, etc.)          │
└────────────────────┬────────────────────────────────┘
                     │ JSON-RPC over stdio
┌────────────────────▼────────────────────────────────┐
│             FastMCP Server (server.py)              │
│  ├── memory_read                                    │
│  ├── memory_write                                   │
│  ├── memory_consolidate                             │
│  ├── memory_forget                                  │
│  └── memory_stats                                   │
└────────────────────┬────────────────────────────────┘
                     │
┌────────────────────▼────────────────────────────────┐
│              Memory Tools (tools.py)                │
│  ├── Input validation & sanitization                │
│  └── Rate limiting (100 req/min)                    │
└────────────────────┬────────────────────────────────┘
                     │
┌────────────────────▼────────────────────────────────┐
│             Storage Layer (storage.py)              │
│  ├── Sentence Transformers (all-MiniLM-L6-v2)       │
│  ├── FAISS Vector Index (cosine similarity)         │
│  └── JSON persistence (memories.json)               │
└─────────────────────────────────────────────────────┘
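Messages between the client and the FastMCP server travel as JSON-RPC 2.0 over stdio, one JSON object per line. As a rough sketch (the `tools/call` method and the `name`/`arguments` shape follow the MCP specification; the `id` and argument values here are made up for illustration), a client invoking `memory_write` might send:

```python
import json

# Hypothetical JSON-RPC 2.0 request an MCP client could write to the
# server's stdin to invoke the memory_write tool. Field values are
# illustrative, not captured traffic.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "memory_write",
        "arguments": {"text": "User prefers Python", "tags": ["preference"]},
    },
}
line = json.dumps(request)  # one JSON object per message on the stream
print(line)
```

This line-delimited framing is why the server logs to stderr only: a stray print to stdout would corrupt the JSON-RPC stream.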
| Step | Process | Technology |
|---|---|---|
| Write | Text → 384-dim vector embedding | Sentence Transformers (CPU) |
| Store | Normalized vector → FAISS index | FAISS IndexFlatIP |
| Search | Query → embedding → top-k similar | Cosine similarity |
| Consolidate | Group similar (>0.85) → merge | Vector clustering |
| Forget | Filter by age/importance/tags → delete | Metadata filtering |
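The Store and Search rows hinge on one identity: once vectors are normalized to unit length, an inner product equals cosine similarity, which is why an inner-product index (IndexFlatIP) can serve cosine search. A minimal pure-Python sketch of that equivalence, with toy 3-dimensional vectors standing in for the real 384-dim MiniLM embeddings:

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot product == cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Toy "embeddings"; the real server stores 384-dim MiniLM vectors.
stored = [normalize(v) for v in ([1.0, 0.0, 1.0], [0.0, 1.0, 0.0])]
query = normalize([1.0, 0.0, 0.9])

# Ranking stored vectors by inner product against the normalized query
# ranks them by cosine similarity, i.e. by semantic closeness.
scores = [dot(query, m) for m in stored]
best = max(range(len(scores)), key=lambda i: scores[i])
print(best, round(scores[best], 3))  # → 0 0.999
```

The same threshold logic drives consolidation: memories whose pairwise score exceeds 0.85 are grouped and merged.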
| Protection | Implementation |
|---|---|
| Injection Prevention | Regex filtering of script tags, eval(), path traversal |
| Rate Limiting | 100 requests per 60-second window per client |
| Size Limits | 50KB text, 5KB metadata, 20 tags per memory |
| Input Validation | Pydantic models + custom sanitization |
| Safe Logging | stderr only (prevents JSON-RPC corruption) |
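The injection-prevention and rate-limiting rows can be sketched as follows. This is illustrative only: the real rules live in tools.py and may differ, and the regex patterns and sliding-window limiter below are assumptions, not the actual implementation.

```python
import re
import time
from collections import deque
from typing import Optional

# Assumed patterns matching the table's examples: script tags, eval(),
# and path traversal. The server's real filter set may be broader.
BLOCKED_PATTERNS = [
    re.compile(r"<\s*script", re.IGNORECASE),  # script-tag injection
    re.compile(r"\beval\s*\("),                # eval() calls
    re.compile(r"\.\./"),                      # path traversal
]

def sanitize(text: str, max_bytes: int = 50_000) -> str:
    """Reject oversized or suspicious input before it reaches storage."""
    if len(text.encode("utf-8")) > max_bytes:
        raise ValueError("text exceeds the 50KB limit")
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(text):
            raise ValueError(f"blocked pattern: {pattern.pattern}")
    return text

class RateLimiter:
    """Sliding-window limiter: at most `limit` calls per `window` seconds."""

    def __init__(self, limit: int = 100, window: float = 60.0):
        self.limit, self.window = limit, window
        self.calls: deque = deque()

    def allow(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()  # drop timestamps outside the window
        if len(self.calls) >= self.limit:
            return False
        self.calls.append(now)
        return True
```

A sliding window (rather than fixed buckets) avoids the burst at bucket boundaries: a client can never exceed `limit` calls in any 60-second span.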
MEMORY_STORAGE_PATH="memory_data" # Storage directory
EMBEDDING_MODEL="all-MiniLM-L6-v2" # Model name
RATE_LIMIT_REQUESTS=100 # Max requests
RATE_LIMIT_WINDOW=60               # Time window (seconds)

- Unlimited total memories (no count limit)
- Per-memory limits: 50KB text, 5KB metadata, 20 tags
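These settings would typically be read once at startup with environment-variable fallbacks. A minimal sketch (the variable names match the list above; the `load_config` helper and its defaults are assumptions, not the server's actual code):

```python
import os

def load_config(env=os.environ):
    """Read server settings from the environment, falling back to defaults."""
    return {
        "storage_path": env.get("MEMORY_STORAGE_PATH", "memory_data"),
        "embedding_model": env.get("EMBEDDING_MODEL", "all-MiniLM-L6-v2"),
        "rate_limit_requests": int(env.get("RATE_LIMIT_REQUESTS", "100")),
        "rate_limit_window": int(env.get("RATE_LIMIT_WINDOW", "60")),
    }

# Passing a plain dict makes the loader easy to test without touching
# the process environment.
config = load_config({"RATE_LIMIT_REQUESTS": "50"})
print(config["rate_limit_requests"], config["storage_path"])  # → 50 memory_data
```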
Model won't download
First run downloads all-MiniLM-L6-v2 (~90MB). Ensure internet connection and ~/.cache/ write permissions.
PyTorch compatibility errors
pip uninstall torch transformers sentence-transformers -y
pip install torch==2.1.0 transformers==4.35.2 sentence-transformers==2.2.2

Memory errors on large operations
The model runs on CPU. Ensure 2GB+ free RAM. Reduce top_k in read operations if needed.
MIT License - feel free to use in your projects!
PRs welcome! Please:
- Follow MCP security guidelines
- Add tests for new features
- Update documentation
Built for persistent LLM memory