Postgres as a Vector Database: How to Actually Choose Between HNSW, IVFFlat, and DiskANN
#Backend

Postgres as a Vector Database: How to Actually Choose Between HNSW, IVFFlat, and DiskANN

Startups Reporter
6 min read

Tiger Data, the team behind TimescaleDB, has published a practical guide to the index choices that decide whether vector search inside Postgres is fast, cheap, or accurate. You rarely get all three, and the right pick depends on how your data grows and how often it changes.

Storing embeddings in Postgres has gone from a clever hack to a default architecture. Instead of running a separate vector database next to their primary store, teams are increasingly keeping embeddings in the same Postgres instance that already holds their users, documents, and transactions. The appeal is obvious: one system to operate, one backup strategy, one place to run a join between a semantic search result and the structured metadata that gives it meaning.

The new guide from Tiger Data, the company formerly known as Timescale and the maintainers of TimescaleDB, walks through the part of this story that vendors usually skip. Putting vectors in Postgres is easy. Choosing the right index, and understanding what you give up with each one, is where most projects either succeed quietly or degrade into slow, expensive queries that nobody can explain.

Featured image

The problem every embedding workload runs into

A vector index exists to answer one question fast: given a query vector, which stored vectors are closest to it? Computing exact nearest neighbors means comparing the query against every row, which is fine for ten thousand embeddings and unusable for ten million. So nearly everyone reaches for approximate nearest neighbor (ANN) search, which trades a small amount of accuracy for an enormous gain in speed.

That trade is the whole game. An ANN index has a recall number, the fraction of true nearest neighbors it actually returns, and that number moves up or down depending on how the index is built and how aggressively you query it. The art is deciding how much recall you can sacrifice before your search results start feeling wrong to a user, and how much memory and build time you are willing to spend to avoid that.

pgvector, and the two indexes everyone starts with

Most Postgres vector work runs on pgvector, the extension that adds a vector column type and the index methods to search it. It ships with two index types, and the choice between them is the first real decision a developer makes.

IVFFlat partitions your vectors into a fixed number of lists using k-means clustering, then searches only the lists nearest to your query. It builds quickly and uses relatively little memory, which makes it attractive for large datasets. The catch is that it needs representative data to cluster well, so you should build it after loading a meaningful sample, not on an empty table. Its recall also depends on the probes setting at query time, and pushing recall higher means probing more lists, which costs latency. IVFFlat also handles updates poorly. As you insert vectors that do not match the original cluster structure, recall drifts downward until you rebuild.

HNSW, a hierarchical navigable small world graph, takes the opposite posture. It builds a multi-layer graph of connections between vectors and walks that graph at query time. Recall and latency are generally better than IVFFlat at the same dataset size, and it tolerates incremental inserts without the same decay. You pay for this in two ways: index build time is significantly longer, and memory consumption is high because the graph wants to live in RAM. For a few million vectors the index can run into many gigabytes, and once it spills past available memory, performance falls off a cliff.

The honest summary is that HNSW is the better default for most production read-heavy workloads, and IVFFlat earns its place when build speed and memory budget matter more than peak recall, or when the dataset is so large that the HNSW graph simply will not fit.

Tiger Data (creators of TimescaleDB)

Where pgvectorscale and DiskANN change the math

The memory ceiling on HNSW is exactly the wall that pgvectorscale, Tiger Data's open-source companion extension, is built to climb over. It adds a DiskANN-based index to Postgres, an approach pioneered by Microsoft Research that is designed to keep most of the index on SSD rather than demanding everything stay in RAM.

The practical consequence is cost. A DiskANN-style index lets a single Postgres node serve far more vectors before you are forced to scale up to a larger, more expensive instance just to fit the graph in memory. Tiger Data pairs this with a technique it calls streaming filtering, which addresses a genuinely annoying failure mode in vector search: filtered queries.

Consider a query that wants the nearest vectors that also satisfy a WHERE clause, say documents from a specific tenant or within a date range. A naive ANN search finds the nearest neighbors first and then filters, which can return far fewer rows than you asked for if the filter is selective, forcing the planner into awkward over-fetching. Streaming retrieval pulls candidates from the index incrementally and applies the filter as it goes, continuing until it has enough matching results. For applications with heavy metadata filtering, which is most real applications, this is the difference between a search feature that works and one that quietly returns half-empty result sets.

Hybrid search, because pure vectors are not enough

The guide spends real time on a point that experienced teams learn the hard way: semantic search alone is often worse than the keyword search it was supposed to replace. Vector similarity is excellent at meaning and terrible at exact matches. Search for a specific error code, a product SKU, or a person's name, and an embedding model will happily return things that are conceptually related but literally wrong.

The fix is hybrid search, combining vector similarity with traditional full-text ranking. Postgres has had full-text search for years, and newer extensions push it toward the BM25 ranking that information-retrieval people actually want, the same algorithm that powers Elasticsearch relevance. Blending the two scores, usually with a method like reciprocal rank fusion, gives you results that respect both meaning and exact terms. Keeping both retrieval methods inside one Postgres query, rather than stitching together a vector database and a separate search engine in application code, is the strongest argument for the whole Postgres-as-vector-store approach.

How to actually decide

The guide resists handing out a single answer, which is the right instinct, but the decision tree underneath it is reasonably clear.

If your dataset is small to medium and read-heavy, start with HNSW in pgvector and tune the ef_search parameter to find your recall and latency balance. If build time or memory pressure becomes the bottleneck, or your vector count climbs into the tens of millions, look at pgvectorscale's DiskANN index to keep a single node viable. If you do heavy filtered search, the streaming approach matters more than raw index speed. And if exact-match queries are part of your workload, plan for hybrid search from the start rather than bolting it on after users complain.

What makes this guide useful is that it treats vector indexing as an engineering tradeoff rather than a feature checkbox. Recall, latency, memory, build time, and update tolerance pull against each other, and no index wins on all five. The teams that get good results are the ones who measure their own data instead of trusting a benchmark run on someone else's. For developers already running Postgres, the message is encouraging: you probably do not need a second database, but you do need to choose your index deliberately. The full write-up and the relevant extensions are available through Tiger Data's documentation and the pgvector project on GitHub.

Comments

Loading comments...