Choosing a Vector Database in 2026 (Without the Hype)
pgvector, Qdrant, Pinecone, Weaviate: a decision framework
Every RAG tutorial reaches for a different vector database, and the marketing makes them all sound essential. They aren't. For most projects the choice matters far less than the people selling them suggest. Here's a no-hype framework for deciding.
Start with the boring option. If you already run Postgres, use pgvector until you have a measured reason not to. Most apps never outgrow it.
What You're Actually Choosing Between
A vector database does three things: store embeddings, find nearest neighbours fast (approximate nearest neighbour, or ANN), and filter results by metadata. Where they differ is scale, filtering power, and operational cost, not whether they "do vectors."
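To make those three jobs concrete, here is a minimal brute-force sketch in plain Python. It is not how any of the engines below work internally: it scans every row, which is exactly the step an ANN index approximates. The `store` layout, the `exact_search` name, and the toy data are assumptions made up for illustration.

```python
import math

def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity); assumes non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def exact_search(store, query, k, where=None):
    """Exact k-nearest-neighbour search with an optional metadata filter.

    store: list of (id, embedding, metadata) tuples   (job 1: store embeddings)
    where: optional predicate on the metadata dict    (job 3: filter by metadata)
    """
    candidates = [row for row in store if where is None or where(row[2])]
    # Job 2: rank by distance. An ANN index approximates this without a full scan.
    candidates.sort(key=lambda row: cosine_distance(row[1], query))
    return [row[0] for row in candidates[:k]]

# Hypothetical toy data: ids, 2-d embeddings, tenant metadata.
store = [
    ("a", [1.0, 0.0], {"tenant": "x"}),
    ("b", [0.9, 0.1], {"tenant": "y"}),
    ("c", [0.0, 1.0], {"tenant": "x"}),
    ("d", [0.7, 0.7], {"tenant": "x"}),
]
print(exact_search(store, [1.0, 0.0], k=2, where=lambda m: m["tenant"] == "x"))
# → ['a', 'd']
```

A scan like this is also a handy source of ground truth when you later measure the recall of an approximate index.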
The Contenders
| Option | Model | Best when |
|---|---|---|
| pgvector | Postgres extension | You want one database; up to a few million vectors; rich SQL filtering |
| Qdrant | Dedicated, open-source | You need strong filtered search and self-hosting |
| Pinecone | Fully managed SaaS | You want zero ops and elastic scale, and will pay for it |
| Weaviate | Open-source + managed | You want hybrid (keyword + vector) search built in |
The Decision, Made Simple
Ask three questions, in order:
- Do I already run Postgres? If yes, try pgvector first. You get one system to operate, transactional consistency with your app data, and SQL `WHERE` clauses for filtering. The HNSW index (available since pgvector 0.5.0) handles millions of vectors comfortably.
- Is filtered search my bottleneck? If most queries are "find similar chunks where tenant = X and date > Y," a dedicated engine like Qdrant pays off: its payload indexing keeps filtered ANN fast where naive approaches degrade.
- Do I refuse to run infrastructure? Then managed Pinecone removes ops entirely. You trade money and lock-in for never thinking about index maintenance.
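Read as code, the three questions amount to a short chain of checks. The sketch below assumes "in order" means the first "yes" wins; the function name, argument names, and fallback string are hypothetical and come from no library.

```python
def suggest_vector_store(runs_postgres, filtered_search_is_bottleneck, wants_zero_ops):
    """Walk the three questions in order and return the first match.

    Assumption: the first "yes" decides. The fallback message is illustrative only.
    """
    if runs_postgres:
        return "pgvector"   # one system to operate, SQL filtering
    if filtered_search_is_bottleneck:
        return "Qdrant"     # dedicated engine with payload indexing
    if wants_zero_ops:
        return "Pinecone"   # fully managed, paid for in money and lock-in
    return "no strong signal: benchmark a shortlist on your own data"

print(suggest_vector_store(runs_postgres=True,
                           filtered_search_is_bottleneck=True,
                           wants_zero_ops=False))
# → pgvector
```

Note that a Postgres shop with a suspected filtering problem still lands on pgvector here: the bottleneck only counts once it has been measured.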
```sql
-- pgvector: similarity + metadata filter in one query
SELECT id, text
FROM chunks
WHERE tenant_id = $1
ORDER BY embedding <=> $2  -- cosine distance
LIMIT 5;
```
The Metrics That Actually Matter
Ignore benchmark leaderboards run on datasets unlike yours. Measure on your data:
- Recall@k: of the truly nearest neighbours, how many does the ANN index return? Tune index parameters (HNSW `ef_search`, `m`) until recall is acceptable, then optimise speed.
- p95 latency under filters: speed with no filter is irrelevant if your real queries are filtered. Test the filtered path.
- Cost at your volume: managed pricing scales with vectors and queries. Model it at 10× your current size before committing.
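All three measurements fit in a few lines of plain Python. The sketch below makes several assumptions: you already have exact (brute-force) results to use as ground truth, p95 uses the nearest-rank method, and cost is linear in vectors and queries. Real managed pricing is usually tiered, so treat the cost function and its placeholder prices as a starting point rather than any vendor's actual formula.

```python
import math

def recall_at_k(exact_ids, ann_ids, k):
    """Fraction of the true top-k neighbours that the ANN result also returned."""
    truth = set(exact_ids[:k])
    if not truth:
        return 0.0
    return len(truth & set(ann_ids[:k])) / len(truth)

def p95(latencies_ms):
    """95th-percentile latency using the nearest-rank method."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    rank = math.ceil(0.95 * len(ordered))  # 1-based rank into the sorted samples
    return ordered[rank - 1]

def projected_monthly_cost(vectors, queries, price_per_million_vectors,
                           price_per_million_queries, growth=10):
    """Linear projection at `growth` times current volume (prices are placeholders)."""
    return growth * (
        vectors / 1e6 * price_per_million_vectors
        + queries / 1e6 * price_per_million_queries
    )

# Hypothetical numbers, purely to show the calls.
print(recall_at_k(["a", "b", "c", "d", "e"], ["a", "c", "e", "x", "y"], k=5))  # → 0.6
print(p95(list(range(1, 101))))                                               # → 95
print(projected_monthly_cost(2_000_000, 5_000_000, 1.0, 0.5))                 # → 45.0
```

Collect the latency samples from the filtered query path, with the same filters your production traffic uses, so that the p95 reflects the queries you actually run.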
Don't migrate databases to chase a 5% recall gain. The embedding model and your chunking (see the Production RAG series) move retrieval quality far more than the vector store does.
When to Switch
Outgrow pgvector when you hit a real wall: index build times that hurt, filtered-query latency you can't tune away, or vector counts in the tens of millions. At that point you have production data to benchmark against, and the migration becomes an evidence-based decision instead of a guess.
The honest summary: the vector database is rarely the hard part of RAG. Pick the one that adds the least operational surface area, measure recall and latency on your own data, and spend your real energy on embeddings, chunking, and evaluation: the things that actually move the needle.
Folarin Akinloye is an AI Engineer based in London, UK. He builds production-ready agentic AI systems, multi-agent architectures, and sophisticated RAG implementations, and writes about the engineering decisions behind them.