Best Memory and Knowledge Graph MCP Servers for 2026: 6 Tools Ranked
Six memory and vector MCP servers ranked: Knowledge Graph, Mem0, Chroma, Qdrant, Pinecone, Weaviate. Pick the right one for episodic memory or RAG.

Knowledge Graph MCP is the free default for personal agents. Mem0 wins for production user memory with built-in conflict resolution. Chroma is the simplest local vector store for RAG. Qdrant scales further when you need it. Pinecone is zero-ops serverless. Weaviate adds built-in embeddings plus hybrid search. Real products run a memory MCP and a vector MCP side by side.
- What Is a Memory MCP Server?
- How We Ranked Them
- The 6 Memory and Knowledge MCP Servers Worth Installing
- 1. Knowledge Graph MCP — Built-In, Local, Zero Cost
- 2. Mem0 MCP — Hosted Agent Memory With Conflict Resolution
- 3. Chroma MCP — Embedded Vector Store for RAG
- 4. Qdrant MCP — Production-Grade Vector Search
- 5. Pinecone MCP — Managed Vector DB at Scale
- 6. Weaviate MCP — Hybrid Search With Built-In Embeddings
- Quick Comparison
- Memory vs Knowledge Graph vs Vector: Which Pattern Do You Actually Need?
- Memory MCP Pitfalls That Will Bite You
- Frequently Asked Questions
What Is a Memory MCP Server?
A memory MCP server is a Model Context Protocol bridge that gives an AI agent persistent memory across conversations. Without it, every Claude or Cursor session starts blank — the agent has no idea who you are, what project you are working on, what you decided yesterday, or what files matter. With a memory MCP wired in, the agent can write durable facts to a backing store, retrieve them at the start of new sessions, and update them as the world changes.
The category splits into two design centres. Memory-first servers (Knowledge Graph MCP, Mem0) are purpose-built for episodic agent state — user preferences, project context, conversation history — and they handle dedup, recency, and conflict resolution for you. Vector-database MCPs (Chroma, Qdrant, Pinecone, Weaviate) are general-purpose similarity-search engines that store anything you can embed. You retrieve documents by semantic similarity rather than by exact match.
Real production agents typically use both: a memory MCP for who-you-are-and-what-you-want state, and a vector MCP for the company knowledge base or document corpus. The six servers below cover both camps, and the section on choosing between them sits at the bottom of this guide.
How We Ranked Them
We installed each MCP server, wired it to Claude Desktop and Claude Code, and ran the same five-task evaluation: persist a user preference across three sessions, store and retrieve project decisions, ingest a 100-document knowledge base for RAG, retrieve relevant chunks for a complex question, and reconcile two conflicting facts written six months apart. The rankings reflect:
- Setup time — minutes from install to first successful write/read round trip
- Recall quality — how often the right fact came back for an ambiguous query
- Cost at production scale — recurring spend for a 10k-document, 1k-user workload
- Embedding ergonomics — whether the server handles embeddings or makes you wire them yourself
- Conflict handling — does the server help with stale facts, or does it leave that to you
- Ops burden — how much engineering time it takes to keep running
The 6 Memory and Knowledge MCP Servers Worth Installing
1. Knowledge Graph MCP — Built-In, Local, Zero Cost
Best for: personal agents, prototypes, anyone who wants memory without ops.
The Knowledge Graph MCP server is the simplest possible memory layer: a local JSON file with entities, observations about entities, and relations between them. The agent can call create_entities, add_observations, create_relations, and search_nodes. The whole thing runs locally with no external service required — your memory is just a file at a known path.
What makes it the default first install is that it ships in the official Anthropic MCP server collection, has zero ops, and the JSON format is human-readable. You can inspect and edit your agent's memory in any text editor, back it up with the rest of your dotfiles, and version it with git if you want. For a single-user personal agent setup, this is genuinely all the memory layer you need.
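To make the entity, observation, and relation model concrete, here is a minimal in-memory sketch. It is not the server's implementation: the class names, method names, and JSON layout are assumptions chosen to mirror the tool names mentioned above, and the search is a plain substring match.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Entity:
    name: str
    entity_type: str
    observations: list = field(default_factory=list)


@dataclass
class Relation:
    source: str
    target: str
    relation_type: str


class TinyGraph:
    """Illustrative stand-in for an entity/observation/relation store."""

    def __init__(self) -> None:
        self.entities = {}   # entity name -> Entity
        self.relations = []  # list of Relation

    def create_entity(self, name: str, entity_type: str) -> Entity:
        # Creating an entity that already exists returns the existing one.
        if name not in self.entities:
            self.entities[name] = Entity(name, entity_type)
        return self.entities[name]

    def add_observation(self, name: str, text: str) -> None:
        entity = self.entities[name]
        if text not in entity.observations:  # skip exact duplicates
            entity.observations.append(text)

    def create_relation(self, source: str, target: str, relation_type: str) -> None:
        self.relations.append(Relation(source, target, relation_type))

    def search_nodes(self, query: str) -> list:
        # Case-insensitive substring match over names, types, and observations.
        q = query.lower()
        return [
            e for e in self.entities.values()
            if q in e.name.lower()
            or q in e.entity_type.lower()
            or any(q in o.lower() for o in e.observations)
        ]

    def to_json(self) -> str:
        # Human-readable dump, the property that makes a file-backed graph easy to edit.
        return json.dumps(
            {
                "entities": [asdict(e) for e in self.entities.values()],
                "relations": [asdict(r) for r in self.relations],
            },
            indent=2,
        )


graph = TinyGraph()
graph.create_entity("alice", "person")
graph.create_entity("billing-service", "project")
graph.add_observation("alice", "prefers Postgres")
graph.create_relation("alice", "billing-service", "works_on")
matches = graph.search_nodes("postgres")
```

Because the whole graph serialises to one readable document, inspecting or hand-editing the agent's memory is a text-editor job rather than a database query.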
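Install: add npx -y @modelcontextprotocol/server-memory as the server command in your Claude Desktop config. Two minutes. The storage location is controlled by the MEMORY_FILE_PATH environment variable; set it explicitly so you know exactly which file to back up.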
The catch: no automatic conflict resolution, no built-in embeddings, no multi-user story. When you have two users on the same agent, they will share one knowledge graph unless you partition the file paths manually. You will outgrow it the moment you ship to real users.
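Manual partitioning can be as simple as one server entry per user, each pointing at its own file. The sketch below generates a Claude Desktop style `mcpServers` block. The `memory-<user>` naming scheme and the `/srv/agent-memory` directory are hypothetical, and `MEMORY_FILE_PATH` is the variable documented in the memory server's README at the time of writing, so confirm it against the current README before relying on it.

```python
import json
from pathlib import Path


def memory_server_entry(memory_file: Path) -> dict:
    """Build one mcpServers entry that launches the memory server via npx."""
    return {
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-memory"],
        # Assumption: MEMORY_FILE_PATH selects where the graph file is stored.
        "env": {"MEMORY_FILE_PATH": str(memory_file)},
    }


def build_config(base_dir: Path, user_ids: list) -> dict:
    # One entry per user so no two users share a graph file (hypothetical layout).
    return {
        "mcpServers": {
            f"memory-{uid}": memory_server_entry(base_dir / f"{uid}.json")
            for uid in user_ids
        }
    }


config = build_config(Path("/srv/agent-memory"), ["alice", "bob"])
print(json.dumps(config, indent=2))
```

This keeps the zero-ops appeal for a handful of users, but the config grows linearly with the user count, which is the point at which a per-user-scoped service becomes the better fit.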
2. Mem0 MCP — Hosted Agent Memory With Conflict Resolution
Best for: production agents serving multiple users, products that need durable state.
Mem0 MCP is purpose-built for the "agent that needs to remember stuff about its users" problem. The hosted service handles embeddings, recency-weighted retrieval, conflict resolution between contradicting facts, and per-user scoping out of the box. The MCP layer exposes simple tools: add a memory, search by query, get_all for a user, delete a specific memory.
What earns Mem0 the second slot is the conflict-resolution story. When the agent writes "user prefers Postgres" today and "user prefers Supabase" next month, Mem0 keeps the recent one as authoritative and flags the old one as superseded. The agent retrieves the current truth without you having to write that logic. For products where users change their minds, this is the difference between an agent that learns and one that drowns in stale data.
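For a sense of what you would otherwise write yourself, here is a minimal recency-wins sketch. It is not Mem0's algorithm or API: the `Memory` and `MemoryStore` names, the `key` field, and the "newest timestamp is authoritative" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Memory:
    user_id: str
    key: str              # what the fact is about, e.g. "preferred_database"
    value: str
    written_at: datetime
    superseded: bool = False


class MemoryStore:
    def __init__(self) -> None:
        self._memories = []

    def add(self, user_id: str, key: str, value: str, written_at: datetime) -> None:
        self._memories.append(Memory(user_id, key, value, written_at))
        # Re-evaluate which fact for this (user, key) pair is the most recent,
        # so out-of-order writes are handled as well.
        same = [m for m in self._memories if m.user_id == user_id and m.key == key]
        newest = max(same, key=lambda m: m.written_at)
        for m in same:
            m.superseded = m is not newest

    def current(self, user_id: str, key: str) -> Optional[str]:
        # Return the one fact that has not been superseded, if any.
        for m in self._memories:
            if m.user_id == user_id and m.key == key and not m.superseded:
                return m.value
        return None

    def history(self, user_id: str, key: str) -> list:
        # Old facts are flagged, not deleted, so the trail stays inspectable.
        same = [m for m in self._memories if m.user_id == user_id and m.key == key]
        return sorted(same, key=lambda m: m.written_at)


store = MemoryStore()
store.add("u1", "preferred_database", "Postgres", datetime(2026, 1, 10))
store.add("u1", "preferred_database", "Supabase", datetime(2026, 2, 14))
current = store.current("u1", "preferred_database")
```

Even this toy version needs a stable notion of what a fact is "about" (the `key` here). Extracting that from free-form conversation is the hard part a hosted memory service takes off your hands.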
Install: sign up at mem0.ai, generate an API key, plug it into the MCP config. Five minutes including signup. Self-hosted Mem0 also runs in Docker if you do not want hosted dependencies.
The catch: Mem0 is opinionated about what memory looks like. If you want a vector store for arbitrary documents, this is not it — pair Mem0 with Chroma or Qdrant for the RAG layer. Mem0 handles the "remember user facts" job; vector stores handle the "search the knowledge base" job.
3. Chroma MCP — Embedded Vector Store for RAG
Best for: RAG over local document corpora, dev-laptop knowledge bases.
Chroma MCP exposes Chroma, the embedded vector database, as MCP tools. The agent can create collections, upsert documents (with or without your own embeddings — Chroma defaults to sentence-transformers locally), and query by semantic similarity. It runs in-process by default, which means zero external dependencies and zero hosting cost.
The killer use case is local RAG. Drop a few hundred markdown files into a collection, and Claude can semantic-search across them as easily as it can grep a directory. The agent gets back the most relevant chunks for any question, with similarity scores, and you stay in control of the source material. For internal documentation, personal knowledge bases, and prototype RAG systems, Chroma plus its MCP is the cleanest path.
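The upsert-then-query loop behind that workflow can be sketched without any dependencies. This is not Chroma's API: `TinyCollection` is a hypothetical stand-in, and the "embedding" is a plain word-count vector, which only matches shared words and is no substitute for a real embedding model.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    # Toy embedding: lowercase word counts. A real setup calls an embedding model here.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TinyCollection:
    def __init__(self) -> None:
        self._docs = {}  # doc_id -> (text, vector)

    def upsert(self, doc_id: str, text: str) -> None:
        # Re-upserting an existing id replaces both the text and its vector.
        self._docs[doc_id] = (text, embed(text))

    def query(self, question: str, n_results: int = 2) -> list:
        # Score every document against the question and return the top matches.
        q = embed(question)
        scored = [(doc_id, cosine(q, vec)) for doc_id, (_, vec) in self._docs.items()]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:n_results]


docs = TinyCollection()
docs.upsert("deploy.md", "How to deploy the billing service to staging and production")
docs.upsert("oncall.md", "On-call rotation, paging policy, and escalation contacts")
docs.upsert("style.md", "Code style guide for Python and TypeScript")
results = docs.query("how do I deploy to production?")
```

The agent sees the same shape from a real vector store: a ranked list of document ids with similarity scores, which it then reads and cites.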
Install: pip install chromadb plus the Chroma MCP package, point at a local persistence directory. Three minutes.
The catch: Chroma scales gracefully up to about a million documents and then starts to need ops attention. For larger corpora, move to Qdrant or Pinecone. For multi-machine teams, run Chroma in client/server mode rather than embedded.
4. Qdrant MCP — Production-Grade Vector Search
Best for: production RAG, multi-tenant vector workloads.
Qdrant MCP connects to a Qdrant cluster (managed Qdrant Cloud or self-hosted) and exposes collection management, upserts, hybrid keyword-plus-vector search, and payload filtering. The Rust-based engine handles billions of points and gives you payload-based filters — "find chunks similar to X where source_type = handbook" — that pure-vector stores cannot match cleanly.
Where Qdrant earns its slot above Chroma is production readiness. It has snapshots, replication, point-in-time recovery, and a clear scaling story from one node to a multi-region cluster. The MCP layer is a thin wrapper that does not get in your way — you keep your existing Qdrant collections and just expose them to the agent.
Install: Qdrant Cloud cluster URL plus API key, drop into MCP config. Self-host with one Docker command if you prefer.
The catch: you still need to handle embeddings yourself. Wire up an OpenAI, Voyage, or Cohere embedding call before upsert. Qdrant does not embed for you.
5. Pinecone MCP — Managed Vector DB at Scale
Best for: serverless RAG, teams that want zero ops on the vector layer.
Pinecone MCP sits in front of Pinecone's managed serverless vector database. The agent gets the same primitives — create index, upsert vectors, query by similarity, filter by metadata — without you running any infrastructure. Pinecone bills per read/write rather than per node, which makes it cost-effective for spiky workloads.
Pinecone's strength is operational simplicity. You write a vector and metadata, you query, you get results. No node sizing, no replica configuration, no shard count to tune. For teams that want to focus on the agent and not the database, Pinecone is the cleanest choice.
Install: Pinecone API key, index name, MCP server URL. Three minutes.
The catch: cost at scale. Pinecone's serverless pricing is great for low-to-moderate volume but gets expensive once you sustain very high query volumes or store hundreds of millions of vectors. At that scale Qdrant or Weaviate self-hosted wins on raw economics.
6. Weaviate MCP — Hybrid Search With Built-In Embeddings
Best for: teams that want vector plus keyword search and zero embedding plumbing.
Weaviate ships with pluggable embedding modules, which sets it apart from stores such as Qdrant that expect you to bring your own vectors. Hand it a document, and it calls OpenAI, Cohere, Hugging Face, or a local model to generate the embedding for you. The MCP server exposes class management, object upserts, vector queries, and Weaviate's signature hybrid search (BM25 keyword plus vector, fused by reciprocal rank).
Hybrid search is the underrated capability. Pure vector search misses exact-match queries (product SKUs, error codes, function names); pure keyword misses semantic intent. Weaviate's hybrid pipeline handles both, and the MCP layer makes it trivial for the agent to use. For knowledge bases with a mix of structured and unstructured content, hybrid search beats pure vector.
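Reciprocal rank fusion is easy to sketch: each document earns 1 / (k + rank) from every result list it appears in, and the totals decide the final order. The code below is the textbook formula, not Weaviate's exact implementation. The constant k = 60 is a commonly used default assumed here, and the document ids are invented.

```python
def reciprocal_rank_fusion(rankings: list, k: int = 60) -> list:
    """Fuse several ranked id lists into one list of (doc_id, score) pairs."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # A document gains more from a high position, and from appearing in several lists.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical result lists for the same query, best match first.
keyword_hits = ["sku-4411", "faq-returns", "guide-shipping"]
vector_hits = ["guide-shipping", "faq-returns", "blog-launch"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
fused_ids = [doc_id for doc_id, _ in fused]
```

Documents that both retrievers agree on rise to the top, while the exact-match-only hit (the SKU) still survives in the list instead of being lost to the vector ranking.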
Install: Weaviate Cloud Services cluster URL plus API key, or self-host with a single Docker Compose file. Configure the embedding module of your choice.
The catch: Weaviate has more knobs than Chroma or Pinecone. The schema model is more rigid (you predefine classes and properties), and the configuration surface is larger. Worth it for hybrid search, overkill if you only want vector similarity.
Quick Comparison
Memory vs Knowledge Graph vs Vector: Which Pattern Do You Actually Need?
The hardest decision in this category is not "which MCP server" but "what shape of memory do I need?" Three patterns cover most cases:
- Episodic user memory. You want the agent to remember user names, preferences, project context, and decisions. Use Mem0 (production) or Knowledge Graph MCP (personal). Conflict resolution and recency matter more than raw recall volume.
- Knowledge-base RAG. You want the agent to answer questions over a fixed corpus — internal docs, product manuals, support tickets. Use Chroma (laptop), Qdrant or Weaviate (production), or Pinecone (zero ops). Embedding quality and chunking strategy matter more than the database itself.
- Hybrid. You need both — durable user state plus a knowledge base. Run two MCP servers: Mem0 for the user state, one of the vector stores for the corpus. The agent will pick the right one for each query if the tool descriptions are clear.
If you are building a customer-facing agent product, plan to ship hybrid from day one. MCP makes it cheap to register multiple servers, and the LLM is good at choosing between them once you describe each tool's purpose in one sentence. For the configuration patterns that scale to many MCP servers without overwhelming the agent, see Skiln's best MCP clients guide.
Memory MCP Pitfalls That Will Bite You
Writing too much. The most common mistake is instructing the agent to "remember everything important". It will. Within a week the memory store is bloated with low-value facts that drown out the high-value ones at recall time. Be explicit: "remember user preferences and major project decisions, nothing else".
No periodic compaction. Knowledge Graph MCP and raw vector stores never forget. After six months, the agent is wading through stale facts. Schedule a quarterly "memory compaction" prompt where the agent reviews and prunes its own memory. Mem0 handles this automatically; the others do not.
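A compaction pass does not have to be prompt-driven. If each stored fact carries a timestamp, a scheduled job can do the first cut. This is a sketch under assumptions: the `Observation` record, the `pinned` flag, and the 90-day window are hypothetical and not part of any server named here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Observation:
    text: str
    last_confirmed: datetime
    pinned: bool = False  # hypothetical flag for facts that must never expire


def compact(observations: list, now: datetime, max_age: timedelta = timedelta(days=90)):
    """Split observations into (kept, pruned) by age, always keeping pinned ones."""
    kept, pruned = [], []
    for obs in observations:
        if obs.pinned or now - obs.last_confirmed <= max_age:
            kept.append(obs)
        else:
            pruned.append(obs)
    return kept, pruned


now = datetime(2026, 6, 1)
memory = [
    Observation("user name is Alice", datetime(2025, 1, 5), pinned=True),
    Observation("working on billing migration", datetime(2026, 5, 20)),
    Observation("evaluating three CI vendors", datetime(2025, 11, 2)),
]
kept, pruned = compact(memory, now)
```

Returning the pruned list rather than deleting in place lets you archive it or hand it to the agent for a final review before anything is dropped.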
Shared indexes across tenants. If you serve multiple users from one Pinecone index, you must filter by user_id on every query. Forget once and one user sees another user's data. Use Mem0's per-user scoping, Qdrant's multitenancy support, or partition by collection name on the simpler stores.
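One way to make the filter impossible to forget is to hide the raw index behind a wrapper that requires a user id and re-checks the results. The sketch below assumes a hypothetical backend callable taking `(query, filter_dict)` and returning records as dicts; it is not any vendor's client API.

```python
class TenantScopedStore:
    """Wraps a search backend so every query is pinned to one user."""

    def __init__(self, search_fn) -> None:
        # Hypothetical backend signature: search_fn(query, filter_dict) -> list of dicts.
        self._search = search_fn

    def query(self, user_id: str, query: str) -> list:
        if not user_id:
            raise ValueError("user_id is required for every query")
        results = self._search(query, {"user_id": user_id})
        # Defence in depth: drop anything the backend returned for another tenant.
        return [r for r in results if r.get("user_id") == user_id]


RECORDS = [
    {"id": "m1", "user_id": "alice", "text": "prefers dark mode"},
    {"id": "m2", "user_id": "bob", "text": "prefers light mode"},
]


def leaky_backend(query: str, filter_dict: dict) -> list:
    # Simulates the bug described above: the filter is silently ignored.
    return [r for r in RECORDS if query in r["text"]]


store = TenantScopedStore(leaky_backend)
alice_results = store.query("alice", "prefers")
```

The second check costs almost nothing and turns a silent cross-tenant leak into a harmless empty result.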
Embedding model drift. Switching from OpenAI's text-embedding-3-small to text-embedding-3-large changes the vector space. Old vectors are no longer compatible with new queries. Either reindex everything when you change models, or version your collections by embedding model.
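Versioning collections by embedding model can be a naming convention plus a guard. The sketch below is an assumption-laden example: the `base__model` naming scheme and the helper names are hypothetical, and the dimension check only catches models with different vector sizes, not two different models that happen to share a size.

```python
import re


def collection_name(base: str, embedding_model: str) -> str:
    """Derive a collection name that encodes which model produced its vectors."""
    slug = re.sub(r"[^a-z0-9]+", "-", embedding_model.lower()).strip("-")
    return f"{base}__{slug}"


def check_dimensions(expected_dim: int, vector: list) -> None:
    # Fail loudly instead of writing a vector from the wrong model into a collection.
    if len(vector) != expected_dim:
        raise ValueError(
            f"vector has {len(vector)} dimensions, collection expects {expected_dim}"
        )


old_name = collection_name("handbook", "text-embedding-3-small")
new_name = collection_name("handbook", "text-embedding-3-large")
```

With distinct names, the old collection keeps serving queries while the new one is reindexed in the background, and the switch becomes a one-line config change.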
Stale RAG corpora. A vector store with a static dump of internal docs gets less useful every week. Wire up an ingestion pipeline that re-syncs from the source of truth nightly. Skiln's documentation MCP guide covers the keep-it-fresh patterns.
Treating memory as logs. A memory MCP is not an audit log. If you need to know when a decision was made and by whom, store that in a real database, not the agent's working memory. The memory MCP holds working facts; provenance lives elsewhere.
Frequently Asked Questions
See the FAQ section below for answers to the most common questions about memory and knowledge graph MCP servers in 2026.