Choosing a Vector Database in 2026 (Without the Hype), Folarin Akinloye

Every RAG tutorial reaches for a different vector database, and the marketing makes them all sound essential. They aren't. For most projects the choice matters far less than the people selling them suggest. Here's a no-hype framework for deciding.

Important

Start with the boring option. If you already run Postgres, use pgvector until you have a measured reason not to. Most apps never outgrow it.

What You're Actually Choosing Between

A vector database does three things: store embeddings, find nearest neighbours fast (approximate nearest neighbour, or ANN), and filter results by metadata. Where they differ is scale, filtering power, and operational cost, not whether they "do vectors."
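To make those three jobs concrete, here is a minimal in-memory sketch. It is not how any of the products below are implemented: `TinyVectorStore` and its method names are hypothetical, and the search is exact brute force rather than ANN. A real engine's job is to return nearly the same answer without scanning every row.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; smaller means more similar. Assumes non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


class TinyVectorStore:
    """Hypothetical toy store: keeps embeddings, searches them, filters by metadata."""

    def __init__(self):
        self._rows = []  # each row is (item_id, embedding, metadata)

    def add(self, item_id, embedding, metadata):
        # Job 1: store the embedding alongside its metadata.
        self._rows.append((item_id, list(embedding), dict(metadata)))

    def search(self, query, k, where=None):
        where = where or {}
        # Job 3: keep only rows whose metadata matches every filter key.
        candidates = [
            (cosine_distance(query, embedding), item_id)
            for item_id, embedding, metadata in self._rows
            if all(metadata.get(key) == value for key, value in where.items())
        ]
        # Job 2: nearest neighbours. Exact here; an ANN index approximates this step.
        candidates.sort()
        return [item_id for _, item_id in candidates[:k]]


store = TinyVectorStore()
store.add("a", [1.0, 0.0], {"tenant": "acme"})
store.add("b", [0.9, 0.1], {"tenant": "acme"})
store.add("c", [1.0, 0.0], {"tenant": "globex"})
store.add("d", [0.0, 1.0], {"tenant": "acme"})

nearest_for_acme = store.search([1.0, 0.0], k=2, where={"tenant": "acme"})
print(nearest_for_acme)
```

The brute-force scan is also a useful baseline: its results are the ground truth you compare an ANN index against when measuring recall.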

The Contenders

Option	Model	Best when
pgvector	Postgres extension	You want one database; < ~1M vectors; rich SQL filtering
Qdrant	Dedicated, open-source	You need strong filtered search and self-hosting
Pinecone	Fully managed SaaS	You want zero ops and elastic scale, and will pay for it
Weaviate	Open-source + managed	You want hybrid (keyword + vector) search built in

The Decision, Made Simple

Ask three questions, in order:

1. Do I already run Postgres? If yes, try pgvector first. You get one system to operate, transactional consistency with your app data, and SQL WHERE clauses for filtering. The HNSW index (available since pgvector 0.5.0) handles millions of vectors comfortably.
2. Is filtered search my bottleneck? If most queries are "find similar chunks where tenant = X and date > Y," a dedicated engine like Qdrant pays off: its payload indexing keeps filtered ANN fast where naive approaches degrade.
3. Do I refuse to run infrastructure? Then managed Pinecone removes ops entirely. You trade money and lock-in for never thinking about index maintenance.

-- pgvector: similarity + metadata filter in one query
SELECT id, text
FROM chunks
WHERE tenant_id = $1
ORDER BY embedding <=> $2   -- cosine distance
LIMIT 5;
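The second question says naive filtering degrades. One common naive approach, assumed here for illustration, is post-filtering: ask the index for the global top k and discard non-matching rows afterwards. The sketch below uses exact search as a stand-in for an ANN index, and every name and data point in it is hypothetical. It shows only the shape of the problem, not how Qdrant or pgvector work internally.

```python
def squared_distance(a, b):
    """Squared Euclidean distance; enough for ranking neighbours."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top_k(rows, query, k):
    """Exact nearest neighbours; stands in for an ANN index in this sketch."""
    return sorted(rows, key=lambda row: squared_distance(row["embedding"], query))[:k]


def post_filter_search(rows, query, k, tenant):
    # Naive: take the global top-k first, then throw away other tenants' rows.
    return [row["id"] for row in top_k(rows, query, k) if row["tenant"] == tenant]


def pre_filter_search(rows, query, k, tenant):
    # Filter-aware: restrict to the tenant first, then take the top-k.
    matching = [row for row in rows if row["tenant"] == tenant]
    return [row["id"] for row in top_k(matching, query, k)]


rows = [
    {"id": 1, "tenant": "other", "embedding": [0.0, 0.0]},
    {"id": 2, "tenant": "other", "embedding": [0.1, 0.0]},
    {"id": 3, "tenant": "acme", "embedding": [0.2, 0.0]},
    {"id": 4, "tenant": "other", "embedding": [0.3, 0.0]},
    {"id": 5, "tenant": "acme", "embedding": [0.5, 0.0]},
    {"id": 6, "tenant": "acme", "embedding": [0.9, 0.0]},
]
query = [0.0, 0.0]

post = post_filter_search(rows, query, k=3, tenant="acme")
pre = pre_filter_search(rows, query, k=3, tenant="acme")
print(post, pre)
```

Post-filtering returns a single result when three were requested, because the other tenant's rows crowd out the global top 3. Asking for more candidates up front helps, but the overfetch grows as the filter gets more selective, which is why the filtered path deserves its own measurement.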

The Metrics That Actually Matter

Ignore benchmark leaderboards run on datasets unlike yours. Measure on your data:

Recall@k: of the k truly nearest neighbours, what fraction does the ANN index return? Tune index parameters (for HNSW, m at build time and ef_search at query time) until recall is acceptable, then optimise speed.
p95 latency under filters: speed with no filter is irrelevant if your real queries are filtered. Test the filtered path.
Cost at your volume: managed pricing scales with vectors and queries. Model it at 10× your current size before committing.
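The three measurements above can be sketched in a few lines. The function names are hypothetical, the percentile uses the nearest-rank method (one of several reasonable definitions), and the cost model assumes pricing that is linear in vectors and queries, which real price sheets may not be. Substitute your provider's actual pricing and your own measured data.

```python
def recall_at_k(exact_ids, ann_ids, k):
    """Fraction of the true top-k neighbours that the ANN result contains."""
    truth = set(exact_ids[:k])
    if not truth:
        raise ValueError("exact_ids must contain at least one id")
    return len(truth & set(ann_ids[:k])) / len(truth)


def p95(latencies_ms):
    """95th percentile latency by the nearest-rank method."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def projected_monthly_cost(vectors, queries, price_per_million_vectors,
                           price_per_million_queries, growth=10):
    """Assumed linear pricing model, evaluated at `growth` times current volume."""
    vector_cost = vectors * growth / 1_000_000 * price_per_million_vectors
    query_cost = queries * growth / 1_000_000 * price_per_million_queries
    return vector_cost + query_cost


# Illustrative inputs only: ids from an exact scan vs. ids from the ANN index.
recall = recall_at_k(exact_ids=[1, 2, 3, 4, 5], ann_ids=[1, 2, 9, 4, 8], k=5)

# Latencies should come from the filtered queries your app really runs.
filtered_latency_p95 = p95(list(range(1, 101)))

# Made-up prices, purely to exercise the function.
cost_at_10x = projected_monthly_cost(
    vectors=2_000_000, queries=3_000_000,
    price_per_million_vectors=5.0, price_per_million_queries=2.0,
)
print(recall, filtered_latency_p95, cost_at_10x)
```

Average recall over a few hundred representative queries rather than a single one, and collect the latency samples under the same concurrency you expect in production.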

Warning

Don't migrate databases to chase a 5% recall gain. The embedding model and your chunking (see the Production RAG series) move retrieval quality far more than the vector store does.

When to Switch

Move off pgvector only when you hit a real wall: index build times that hurt, filtered-query latency you can't tune away, or vector counts in the tens of millions. At that point you have production data to benchmark against, and the migration becomes an evidence-based decision instead of a guess.

The honest summary: the vector database is rarely the hard part of RAG. Pick the one that adds the least operational surface area, measure recall and latency on your own data, and spend your real energy on embeddings, chunking, and evaluation: the things that actually move the needle.
