Introducing a new vector storage format: DiskBBQ

Introducing DiskBBQ, an alternative to HNSW, and exploring when and why to use it.


DiskBBQ is an evolution of the Inverted File (IVF) index. It is an alternative to Hierarchical Navigable Small Worlds (HNSW) that partitions vectors into smaller clusters, handling low-memory scenarios more efficiently while still providing strong query performance.

What’s wrong with HNSW?

We absolutely love HNSW. It's a fast, compute-efficient algorithm that scales logarithmically with your vector data. However, that speed comes at a cost. For HNSW to work well, all the vectors need to reside in RAM. And while we have made this cheaper by adding increasingly better levels of quantization, performance still plummets once the vectors fall out of memory.

Additionally, indexing requires searching the existing HNSW graph, so all the RAM costs incurred at search time are also present during indexing. This can significantly slow down indexing on hardware that cannot hold all the vectors in RAM.

How is DiskBBQ different?

DiskBBQ uses Hierarchical K-means to partition vectors into small clusters. It then selects representative centroids to query before querying the individual vectors themselves. It takes advantage of the multi-layer nature of this hierarchy to query at most 2 layers of the centroids, limiting the total explored space. Finally, it explores the vectors contained in each cluster by bulk scoring the distance between the cluster’s vectors and the query vector.
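The two-layer centroid search described above can be sketched in Python. This is a toy illustration only, using plain k-means and unquantized vectors rather than the actual DiskBBQ implementation; all function names and parameters here are invented for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: 1,000 vectors in 8 dimensions.
vectors = rng.normal(size=(1000, 8)).astype(np.float32)

def kmeans(data, k, iters=10):
    """Naive k-means: returns centroids and the cluster assignment of each row."""
    centroids = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(iters):
        # Assign each vector to its nearest centroid, then recompute centroids.
        d = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        assign = d.argmin(axis=1)
        for j in range(k):
            members = data[assign == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return centroids, assign

# "Indexing": a two-layer hierarchy of centroids.
fine_centroids, fine_assign = kmeans(vectors, k=32)          # cluster leaders
coarse_centroids, coarse_of_fine = kmeans(fine_centroids, k=4)  # layer above them

def search(query, n_probe_coarse=2, n_probe_fine=4, top_k=3):
    # Layer 1: pick the closest coarse centroids.
    dc = np.linalg.norm(coarse_centroids - query, axis=1)
    best_coarse = np.argsort(dc)[:n_probe_coarse]
    # Layer 2: among fine centroids under those, pick the closest.
    cand = np.flatnonzero(np.isin(coarse_of_fine, best_coarse))
    df = np.linalg.norm(fine_centroids[cand] - query, axis=1)
    probe = cand[np.argsort(df)[:n_probe_fine]]
    # Finally, bulk-score only the vectors inside the probed clusters.
    ids = np.flatnonzero(np.isin(fine_assign, probe))
    dv = np.linalg.norm(vectors[ids] - query, axis=1)
    return ids[np.argsort(dv)[:top_k]]

print(search(vectors[42]))
```

The probe parameters control the recall/latency trade-off: probing more centroids scores more vectors, which raises recall at the cost of more work.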

As the name implies, DiskBBQ also uses our BBQ (Better Binary Quantization) to reduce the size of the vectors and centroids. This allows loading many blocks of vectors into memory at a time for fast scoring, with very low memory requirements and low disk overhead.
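The size win from 1-bit quantization can be sketched as follows. This shows only the sign-bit idea; real BBQ also stores per-vector correction terms to sharpen distance estimates, which this sketch omits.

```python
import numpy as np

rng = np.random.default_rng(0)
vectors = rng.normal(size=(4, 64)).astype(np.float32)

# 1-bit ("binary") quantization: keep only the sign of each component,
# then pack 8 bits per byte.
bits = (vectors > 0).astype(np.uint8)   # shape (4, 64), one bit per dimension
packed = np.packbits(bits, axis=1)      # shape (4, 8): 64 dims -> 8 bytes

# float32 storage vs packed bits: a 32x reduction.
print(vectors.nbytes, "->", packed.nbytes)  # 1024 -> 32

# Distances can then be approximated cheaply via Hamming distance on the bits.
hamming = np.unpackbits(packed ^ packed[0], axis=1).sum(axis=1)
```

Going from 4 bytes per dimension to 1 bit per dimension is what lets many blocks of quantized vectors sit in memory at once.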

Another interesting aspect of DiskBBQ is that it will allow vectors to be assigned to more than one centroid. It utilizes a version of Google’s Spilling with Orthogonality-Amplified Residuals (SOAR) to assign vectors to multiple clusters, which is particularly beneficial when a vector is near a border between two clusters. Because vectors are quantized heavily, this adds minimal disk overhead and requires fewer centroids to be explored during search.

Talk is cheap, show me some numbers

HNSW scales logarithmically. So, it must be orders of magnitude faster than something that must search centroids and then score all the vectors in each centroid, right?

We have been working hard to make DiskBBQ fully competitive with HNSW. While we are not yet 100% there in all situations, we are happy with the results.

DiskBBQ takes advantage of bulk scoring vectors and performs as many operations as it can off-heap. This means we can read vectors from files directly into memory for optimized vector operations, resulting in some pretty nice performance.

Two main scenarios for DiskBBQ are particularly interesting. The first is one where the entire index can fit in RAM. Since this is key for HNSW performance, it's only fair to see how DiskBBQ compares in this case.

Type       Index Time     Latency   Recall
HNSW BBQ   1,054,319ms    ~3.4ms    92%
DiskBBQ    94,075ms       ~4.0ms    91%

Table 0. How DiskBBQ compares over 1M vectors, all sitting comfortably in memory. DiskBBQ indexes roughly 10x faster while remaining almost as fast at query time as HNSW over BBQ-quantized vectors.

Things get more interesting as memory is reduced to the point where the entire index no longer fits. We will have a separate blog digging into why DiskBBQ is better in low-memory environments and our methodology for testing. However, here are some high-level numbers:

As overall memory decreases, DiskBBQ indexing degrades gracefully. HNSW, however, simply falls apart once memory becomes very restricted.

Again, we see DiskBBQ degrading gracefully in search latency. It does slow down, but HNSW latency climbs steeply as memory becomes increasingly restricted.

When should I use DiskBBQ?

DiskBBQ and HNSW will both continue to improve in Elasticsearch. For now, DiskBBQ's performance at very high recall (99+%) and very low latency is not as good as HNSW's. We hope to improve this while continuing to invest in HNSW.

If you need very, very high recall, have lots of off-heap memory (or are willing to pay for it), and have few index updates (so indexing costs are low), using HNSW with some form of quantization is likely still the best option.

However, if you are okay with 95% or less recall, are cost-sensitive, but still want fast search, DiskBBQ might be the solution for you.

How can I use DiskBBQ?

Here's how to enable DiskBBQ and start querying:

Set up a mapping like this:
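A sketch of such a mapping. The index and field names are placeholders, `dims` is kept small for readability (real embedding fields are typically much higher-dimensional), and the `bbq_disk` index option is the one assumed here to enable DiskBBQ:

```
PUT my-vector-index
{
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "dense_vector",
        "dims": 8,
        "index_options": {
          "type": "bbq_disk"
        }
      }
    }
  }
}
```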

Insert a vector:
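A minimal sketch, assuming the `my-vector-index` mapping above (the vector values are arbitrary):

```
POST my-vector-index/_doc
{
  "my_vector": [0.12, -0.45, 0.91, 0.33, -0.12, 0.77, -0.56, 0.08]
}
```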

Here’s an example of querying for that vector:
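A minimal kNN search sketch, using the same placeholder index and field names:

```
POST my-vector-index/_search
{
  "knn": {
    "field": "my_vector",
    "query_vector": [0.12, -0.45, 0.91, 0.33, -0.12, 0.77, -0.56, 0.08],
    "k": 3
  }
}
```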

You can still use num_candidates to control approximation.

Here’s an example using num_candidates:
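For instance, adding `num_candidates` to the same placeholder query:

```
POST my-vector-index/_search
{
  "knn": {
    "field": "my_vector",
    "query_vector": [0.12, -0.45, 0.91, 0.33, -0.12, 0.77, -0.56, 0.08],
    "k": 3,
    "num_candidates": 50
  }
}
```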

Or, if you want more granular control over how many vectors the search will consider, you can set the visit_percentage directly.

Here’s an example setting visit_percentage:
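A sketch, assuming `visit_percentage` is accepted alongside the other knn parameters (its exact placement and scale may differ; check the current Elasticsearch docs):

```
POST my-vector-index/_search
{
  "knn": {
    "field": "my_vector",
    "query_vector": [0.12, -0.45, 0.91, 0.33, -0.12, 0.77, -0.56, 0.08],
    "k": 3,
    "visit_percentage": 1.0
  }
}
```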

DiskBBQ will soon be available in Elasticsearch Serverless.
