Description
The upcoming Lucene 9.7.0 release has support for SIMD vectorized implementations of the low-level primitives used by Vector Search. The vectorized implementations use the Panama Vector API, which is currently incubating; see apache/lucene#12311 and apache/lucene#12327. The Lucene changelog notes say it all [1].
We should evaluate the impact of enabling this in Elasticsearch. Specifically,
- Refactor similar usages in Elasticsearch to use the Lucene VectorUtil functions, since they are much faster, e.g. DenseVectorFieldMapper.java can use VectorUtil::dotProduct (rather than its own slower scalar implementation). #96617
- Merge Lucene 9.7.0 without enabling the new Panama vectorized implementations. We want to validate and baseline the upgrade to 9.7.0 independently of this change. Allow, say, 24+ hours to get at least one nightly benchmark run.
  - Upgrade to 9.7.0 snapshot
  - Evaluate nightly benchmarks
- Add --add-modules jdk.incubator.vector to the Elasticsearch startup; this will enable the faster Lucene VectorUtil implementation. https://github.com/elastic/elasticsearch/blob/main/distribution/tools/server-cli/src/main/java/org/elasticsearch/server/cli/ServerProcess.java#L222. This will raise a warning at startup; document that this is OK (similar to the security manager warning - yes, we know, it is OK!).
  - conditionally add the module, only for JDK 20+ (Enable the Panama Vector module #96453)
  - check JVM flags, see later comment, test environments, etc
  - check log output contains the expected Vector bit width (Enable the Panama Vector module #96453 (comment))
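The "conditionally add the module" and "check JVM flags" steps could look roughly like the sketch below. This is an illustration only, not the actual ServerProcess code: the class and method names (VectorModuleFlag, vectorModuleOptions, vectorModuleResolved) are hypothetical, and the ">= 20" threshold follows the "JDK 20+" wording above, whereas the Lucene 9.7.0 changelog [1] says the vectorized code is used on exactly Java 20.

```java
import java.util.ArrayList;
import java.util.List;

public class VectorModuleFlag {
    // Module name taken from the issue text.
    public static final String VECTOR_MODULE = "jdk.incubator.vector";

    /**
     * Hypothetical helper: returns the extra JVM options to pass at startup
     * for a given Java feature version (e.g. 17, 20, 21).
     * Assumption: "JDK 20+" means feature version >= 20.
     */
    public static List<String> vectorModuleOptions(int javaFeatureVersion) {
        List<String> options = new ArrayList<>();
        if (javaFeatureVersion >= 20) {
            options.add("--add-modules");
            options.add(VECTOR_MODULE);
        }
        return options;
    }

    /**
     * Returns true if the incubator module was resolved into the boot layer
     * of the JVM running this code, i.e. the flag actually took effect.
     */
    public static boolean vectorModuleResolved() {
        return ModuleLayer.boot().findModule(VECTOR_MODULE).isPresent();
    }

    public static void main(String[] args) {
        int feature = Runtime.version().feature();
        System.out.println("Java " + feature + " extra options: " + vectorModuleOptions(feature));
        System.out.println(VECTOR_MODULE + " resolved: " + vectorModuleResolved());
    }
}
```

Checking the boot layer only confirms that the module is available to the JVM; whether Lucene selected its vectorized implementation still has to be confirmed from the notice it logs.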
[1] GITHUB#12302, GITHUB#12311: Add vectorized implementations of VectorUtil.dotProduct(), squareDistance(), cosine() with Java 20 jdk.incubator.vector APIs. Applications started with command line parameter "java --add-modules jdk.incubator.vector" on exactly Java 20 will automatically use the new vectorized implementations if running on a supported platform (x86 AVX2 or later, ARM SVE or later). This is an opt-in feature and requires explicit Java command line flag! When enabled, Lucene logs a notice using java.util.logging. Please test thoroughly and report bugs/slowness to Lucene's mailing list.
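For reference, the three primitives named in the changelog entry compute the following. The sketch below is a plain scalar baseline written for illustration; it is not Lucene's code, and the class name ScalarVectorOps is hypothetical. With Lucene on the classpath, a hand-written loop like dotProduct here is the kind of code that a call to org.apache.lucene.util.VectorUtil.dotProduct(a, b) would replace.

```java
public class ScalarVectorOps {

    private static void checkSameLength(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                "vector dimensions differ: " + a.length + " != " + b.length);
        }
    }

    /** Sum of element-wise products. */
    public static float dotProduct(float[] a, float[] b) {
        checkSameLength(a, b);
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /** Squared Euclidean distance: sum of squared element-wise differences. */
    public static float squareDistance(float[] a, float[] b) {
        checkSameLength(a, b);
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            float diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    /** Cosine similarity: dot product divided by the product of the two norms. */
    public static float cosine(float[] a, float[] b) {
        checkSameLength(a, b);
        float dot = 0f;
        float normA = 0f;
        float normB = 0f;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return (float) (dot / Math.sqrt((double) normA * normB));
    }

    public static void main(String[] args) {
        float[] a = {1f, 2f, 3f};
        float[] b = {4f, 5f, 6f};
        System.out.println("dotProduct     = " + dotProduct(a, b));
        System.out.println("squareDistance = " + squareDistance(a, b));
        System.out.println("cosine         = " + cosine(a, b));
    }
}
```

A scalar baseline like this is also handy when benchmarking, since it gives a fixed point of comparison for runs with and without the incubator module enabled.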