Speed up OptimizedScalarQuantizer #131599

iverase · 2025-07-21T06:48:57Z

While reviewing the code in OptimizedScalarQuantizer, I noticed that we quantize the same vector several times: once when computing the loss and then again when computing the next grid points. I wondered if we could reuse the values between those two calls and avoid the repeated computation.

This PR does that: it uses the destination array to hold the quantized values during the loss computation and passes them to the method that computes the grid points. In addition, we can skip the final quantization of the vector if the method that optimizes the intervals finishes without computing a worse loss.
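To make the idea concrete, here is a minimal sketch of the reuse pattern. It is not the Elasticsearch code: the class `QuantizeReuseSketch`, its method names, the plain squared-error loss, and the least-squares refit are all simplifying assumptions chosen for illustration. It only shows how the codes written into `dest` during the loss computation can feed the next interval update, and why a final quantization pass is needed only after a candidate interval is rejected.

```java
/** Hypothetical, simplified sketch of the reuse idea; not the actual OptimizedScalarQuantizer. */
final class QuantizeReuseSketch {

    /** Quantizes v onto the grid [a, b] with `points` levels, stores the codes in dest, returns the squared error. */
    static double quantizeAndLoss(float[] v, float a, float b, int points, int[] dest) {
        float step = (b - a) / (points - 1);
        double loss = 0;
        for (int i = 0; i < v.length; i++) {
            float clamped = Math.min(Math.max(v[i], a), b);
            dest[i] = Math.round((clamped - a) / step); // rounded once and kept in dest
            double diff = v[i] - (a + step * dest[i]);
            loss += diff * diff;
        }
        return loss;
    }

    /** Least-squares refit of [a, b] that reads the codes already in dest instead of rounding again. */
    static float[] refitInterval(float[] v, int[] dest, int points, float a, float b) {
        double daa = 0, dab = 0, dbb = 0, dax = 0, dbx = 0;
        for (int i = 0; i < v.length; i++) {
            double s = dest[i] / (double) (points - 1); // grid position in [0, 1]
            daa += (1 - s) * (1 - s);
            dab += (1 - s) * s;
            dbb += s * s;
            dax += v[i] * (1 - s);
            dbx += v[i] * s;
        }
        double det = daa * dbb - dab * dab;
        if (det == 0) {
            return new float[] { a, b }; // singular system: keep the current interval
        }
        return new float[] { (float) ((dbb * dax - dab * dbx) / det), (float) ((daa * dbx - dab * dax) / det) };
    }

    /** Refines [a, b]; on return, dest always holds the codes for the returned interval. */
    static float[] optimize(float[] v, float a, float b, int points, int iters, int[] dest) {
        double loss = quantizeAndLoss(v, a, b, points, dest);
        for (int i = 0; i < iters; i++) {
            float[] next = refitInterval(v, dest, points, a, b);
            if (!(next[0] < next[1]) || (next[0] == a && next[1] == b)) {
                break; // degenerate or converged: dest still matches [a, b]
            }
            double nextLoss = quantizeAndLoss(v, next[0], next[1], points, dest);
            if (nextLoss > loss) {
                // Candidate rejected: dest holds codes for the rejected interval,
                // so this is the only case that needs a final quantization pass.
                quantizeAndLoss(v, a, b, points, dest);
                break;
            }
            a = next[0];
            b = next[1];
            loss = nextLoss;
        }
        return new float[] { a, b };
    }
}
```

When every iteration is accepted, the codes left in `dest` by the last loss computation are already the final quantized vector, which is the case where the extra pass can be skipped.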

The only side effect is that we need to remove the legacy method on osq. That's ok as it was only used for benchmark comparison.

The results show a clear speedup in both the scalar and vector variants.

Current values with 128 bits preferred size:

Benchmark                                 (bits)  (dims)   Mode  Cnt    Score     Error   Units
OptimizedScalarQuantizerBenchmark.scalar       1     384  thrpt   15  139.486 ±  23.817  ops/ms
OptimizedScalarQuantizerBenchmark.scalar       1     702  thrpt   15   79.059 ±  14.286  ops/ms
OptimizedScalarQuantizerBenchmark.scalar       1    1024  thrpt   15   50.415 ±   7.558  ops/ms
OptimizedScalarQuantizerBenchmark.scalar       4     384  thrpt   15  136.449 ±  21.873  ops/ms
OptimizedScalarQuantizerBenchmark.scalar       4     702  thrpt   15   69.242 ±  15.013  ops/ms
OptimizedScalarQuantizerBenchmark.scalar       4    1024  thrpt   15   43.425 ±   1.643  ops/ms
OptimizedScalarQuantizerBenchmark.scalar       7     384  thrpt   15  149.420 ±  16.853  ops/ms
OptimizedScalarQuantizerBenchmark.scalar       7     702  thrpt   15   77.437 ±   6.671  ops/ms
OptimizedScalarQuantizerBenchmark.scalar       7    1024  thrpt   15   53.494 ±   7.536  ops/ms
OptimizedScalarQuantizerBenchmark.vector       1     384  thrpt   15  562.416 ±  46.832  ops/ms
OptimizedScalarQuantizerBenchmark.vector       1     702  thrpt   15  306.875 ±  47.434  ops/ms
OptimizedScalarQuantizerBenchmark.vector       1    1024  thrpt   15  216.386 ±  26.207  ops/ms
OptimizedScalarQuantizerBenchmark.vector       4     384  thrpt   15  509.608 ±  85.495  ops/ms
OptimizedScalarQuantizerBenchmark.vector       4     702  thrpt   15  292.796 ±  55.263  ops/ms
OptimizedScalarQuantizerBenchmark.vector       4    1024  thrpt   15  187.569 ±  15.714  ops/ms
OptimizedScalarQuantizerBenchmark.vector       7     384  thrpt   15  539.447 ±  42.931  ops/ms
OptimizedScalarQuantizerBenchmark.vector       7     702  thrpt   15  309.357 ±  27.685  ops/ms
OptimizedScalarQuantizerBenchmark.vector       7    1024  thrpt   15  114.017 ±  71.001  ops/ms

With this PR:

Benchmark                                 (bits)  (dims)   Mode  Cnt    Score     Error   Units
OptimizedScalarQuantizerBenchmark.scalar       1     384  thrpt   15  169.414 ±  23.188  ops/ms
OptimizedScalarQuantizerBenchmark.scalar       1     702  thrpt   15   87.899 ±   9.614  ops/ms
OptimizedScalarQuantizerBenchmark.scalar       1    1024  thrpt   15   62.872 ±  10.971  ops/ms
OptimizedScalarQuantizerBenchmark.scalar       4     384  thrpt   15  161.959 ±  31.947  ops/ms
OptimizedScalarQuantizerBenchmark.scalar       4     702  thrpt   15   81.247 ±   6.511  ops/ms
OptimizedScalarQuantizerBenchmark.scalar       4    1024  thrpt   15   58.583 ±  17.166  ops/ms
OptimizedScalarQuantizerBenchmark.scalar       7     384  thrpt   15  181.835 ±  21.244  ops/ms
OptimizedScalarQuantizerBenchmark.scalar       7     702  thrpt   15   97.614 ±  15.205  ops/ms
OptimizedScalarQuantizerBenchmark.scalar       7    1024  thrpt   15   65.772 ±   9.829  ops/ms
OptimizedScalarQuantizerBenchmark.vector       1     384  thrpt   15  638.882 ±  80.574  ops/ms
OptimizedScalarQuantizerBenchmark.vector       1     702  thrpt   15  369.157 ±  44.456  ops/ms
OptimizedScalarQuantizerBenchmark.vector       1    1024  thrpt   15  245.174 ±  31.757  ops/ms
OptimizedScalarQuantizerBenchmark.vector       4     384  thrpt   15  615.784 ± 110.064  ops/ms
OptimizedScalarQuantizerBenchmark.vector       4     702  thrpt   15  363.637 ±  82.684  ops/ms
OptimizedScalarQuantizerBenchmark.vector       4    1024  thrpt   15  211.976 ±  12.900  ops/ms
OptimizedScalarQuantizerBenchmark.vector       7     384  thrpt   15  686.756 ±  64.638  ops/ms
OptimizedScalarQuantizerBenchmark.vector       7     702  thrpt   15  356.240 ±  37.930  ops/ms
OptimizedScalarQuantizerBenchmark.vector       7    1024  thrpt   15  245.471 ±   6.831  ops/ms
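For a rough sense of scale, the throughput ratio between the two runs can be computed directly from the scores above. The helper below is only an illustrative sketch (the class and method names are made up for this comment), and since several rows have wide error margins, the ratios should be read as approximate.

```java
/** Hypothetical helper for comparing two JMH throughput scores; not part of the benchmark code. */
final class SpeedupSketch {

    /** Ratio of the throughput with this PR to the baseline throughput (both in ops/ms). */
    static double speedup(double baselineOpsPerMs, double withPrOpsPerMs) {
        return withPrOpsPerMs / baselineOpsPerMs;
    }

    /** Formats one comparison line, e.g. for printing next to a benchmark row. */
    static String describe(String label, double baselineOpsPerMs, double withPrOpsPerMs) {
        return String.format("%s: %.2fx", label, speedup(baselineOpsPerMs, withPrOpsPerMs));
    }
}
```

For example, `SpeedupSketch.speedup(139.486, 169.414)` compares the scalar, 1 bit, 384 dims rows, and `SpeedupSketch.speedup(562.416, 638.882)` compares the vector, 1 bit, 384 dims rows.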

elasticsearchmachine · 2025-07-21T06:49:22Z

Pinging @elastic/es-search-relevance (Team:Search Relevance)

elasticsearchmachine · 2025-07-21T06:49:22Z

Hi @iverase, I've created a changelog YAML for you.

benwtrent

This optimization makes sense to me.

We don't need to keep the legacy interface.

My only concern is making sure recall is unchanged. Looking at the code, all the paths already did a "Math.round"; the difference is that some of the paths now use int values instead of rounded floats, which is fine.

The speed ups are hilarious!

iverase · 2025-07-22T12:52:16Z

> My only concern is making sure recall is unchanged.

I am pretty sure the new code is equivalent to the old one; we are just caching the results of Math.round between function calls.
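As a small sanity check of that reasoning, the sketch below compares keeping the int result of `Math.round` with keeping the same result as a float. The class name and the sample inputs are made up for illustration, and it assumes the codes stay far below 2^24, where floats represent integers exactly (with at most 7 bits the largest code is 127).

```java
/** Hypothetical check that int-held and float-held rounded codes agree; not code from this PR. */
final class RoundEquivalenceSketch {

    /** Returns true if every value yields the same code on the int path and the float path. */
    static boolean sameCodes(float[] scaled) {
        for (float x : scaled) {
            int asInt = Math.round(x);             // int path: keep the int returned by Math.round
            float asFloat = (float) Math.round(x); // float path: same rounding, stored as a float
            if (asInt != (int) asFloat) {
                return false;
            }
        }
        return true;
    }
}
```

Calling `RoundEquivalenceSketch.sameCodes(new float[] { 0f, 0.49f, 0.5f, 1.5f, 126.5f, 127f })` exercises the boundary cases for a 7 bit grid.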

iverase added 2 commits July 21, 2025 07:34

Speed up OptimizedScalarQuantizer

dd69f08

iter

ea2c8d0

iverase requested review from benwtrent and john-wagster July 21, 2025 06:48

iverase added >enhancement :Search Relevance/Vectors Vector search v9.2.0 labels Jul 21, 2025

elasticsearchmachine added the Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch label Jul 21, 2025

iverase and others added 5 commits July 21, 2025 07:49

Update docs/changelog/131599.yaml

1bfecf8

iter

15e873d

Merge branch 'main' into speed_osq

3699c46

iter

cf38523

iter

0e5c5cb

benwtrent approved these changes Jul 22, 2025


iverase merged commit 4468239 into elastic:main Jul 22, 2025
33 checks passed

iverase deleted the speed_osq branch July 22, 2025 13:25
