Add mapper for exponential histograms #132493
Pinging @elastic/es-storage-engine (Team:StorageEngine)
I looked at the encoding and how it gets stored and left a question around that.
CompressedExponentialHistogram.writeHistogramBytes(histogramBytesOutput, scale, negativeBuckets, positiveBuckets);
BytesRef histoBytes = histogramBytesOutput.bytes().toBytesRef();

Field histoField = new BinaryDocValuesField(fullPath(), histoBytes);
Using binary doc values today means this would never be compressed (across multiple documents). I suspect the storage cost would be high for exponential histograms? Did we consider alternatives?
It looks like the encoding logic essentially stores a number of IndexWithCount tuples. Did we look into encoding each index and count pair using sorted numeric doc values? We could have two SortedNumericDocValues fields, one for indices and one for counts. Then we could have another doc values field that keeps track of which count belongs to which index. Maybe this compresses better? We would need a similar structure for the negative buckets.
I took inspiration from HistogramFieldMapper, which also stores tuples of (value, count) as encoded binary doc values, with the difference that value is a double. For that reason I assumed that this is the best option for storing tuples.
I don't quite understand how your suggested approach would work. Let's say we have the following (index, count) tuples to store: [(1, 10), (2, 30), (3, 20)].
The SortedNumericDocValuesField for the indices would then store 1, 2, 3 and for the counts it would be 10, 20, 30.
How would we store the relation between those two? We'd need a "position" array storing, for each index, the position of its count within the sorted counts, which in this example would be 0, 2, 1. I assumed there is no way of storing this, because SortedNumericDocValues would reorder here?
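The pairing problem can be made concrete with a small sketch (purely hypothetical, not Elasticsearch code): for each index, the position of its count within the per-document sorted counts is exactly the extra permutation that would have to be stored.

```java
import java.util.Arrays;

// Hypothetical sketch illustrating the pairing problem described above:
// SortedNumericDocValues stores a document's values sorted low-to-high,
// so indices [1, 2, 3] and counts [10, 30, 20] lose their pairing, and an
// extra permutation ("position" array) would be needed to restore it.
class SortedPairingSketch {

    // For each original count, return its position within the sorted counts.
    // counts [10, 30, 20] -> sorted [10, 20, 30] -> positions [0, 2, 1]
    static int[] positionsInSortedCounts(long[] counts) {
        long[] sorted = counts.clone();
        Arrays.sort(sorted); // mimics the low-to-high ordering of doc values
        int[] positions = new int[counts.length];
        boolean[] taken = new boolean[counts.length]; // handles duplicate counts
        for (int i = 0; i < counts.length; i++) {
            for (int j = 0; j < sorted.length; j++) {
                if (!taken[j] && sorted[j] == counts[i]) {
                    positions[i] = j;
                    taken[j] = true;
                    break;
                }
            }
        }
        return positions;
    }
}
```

For the example above this yields 0, 2, 1, matching the position array in the comment.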
Yes, sorted numeric doc values re-order numbers at the document level (lowest to highest). We would need an extra doc values field to keep track of the original order. We could store it as bytes, maybe in a SortedDocValues field.
Something similar was done to keep track of the original array order for synthetic source when we synthesize from doc values. See FieldArrayContext and how it gets used.
I think this could reduce storage required for exponential histograms, but we would need to test it. The current format is much simpler.
Maybe in order to test it we could do a small experiment: generate some semi-realistic exponential histograms, store them directly into a Lucene index with both formats, and then compare the index sizes?
Given that this adds the new field mapper behind a feature flag, we can also explore this in a follow-up. But I'd like to look into this before the feature flag gets removed.
Here are the experiment results:
- The custom encoding stored in a single BinaryDocValues field yields 1.59 bytes per populated bucket.
- Storing indices and counts each in SortedNumericDocValues with FieldArrayContext gives 3.34 bytes per bucket.
- The custom encoding stored in a single SortedDocValues field yields 1.2 bytes, presumably due to LZ4 compression of the terms dictionary.
So I suggest we go with the custom encoding, but stored in a SortedDocValuesField for now?
I tried doing this (branch), but it causes a weird test failure, which seems to be fixed by this Lucene change.
So I guess we should do the switch to SortedDocValuesField later, after upgrading to Lucene 10.3?
Using a terms dictionary for this kinda feels wrong. It adds more ingest overhead to create and de-duplicate the dictionary and adds another layer of indirection. But the compression does look compelling. As commented above, I think it would be interesting to try compressing values with (double) delta encoding to see if it brings benefits. Also, maybe there's a way to compress binary doc values with lz4 or zstd somehow, without having to create a terms dictionary?
private int negativeBucketsLength;
private int positiveBucketsLength;

void load(BytesRef data) throws IOException {
Nit: decode?
Should this be a factory method, to avoid mutations?
This was done intentionally to avoid allocation when iterating over stored histograms. The same pattern is used in HistogramFieldMapper.
If this was a factory method, it would also give a wrong sense of lifetime/safety IINM: the byte array backing binaryValue() returned from BinaryDocValues is reused internally when advancing to the next document. So if we want to avoid mutation side effects here, we'd need to copy the bytes value, which we want to avoid for performance reasons.
}
long minIndex = buckets.getFirst().index();
out.writeZLong(minIndex);
long prevIndex = minIndex - 1;
Why do we need the - 1 here and below?
Here it is to avoid the special case of writing out the first bucket. But I can see how this can be confusing. I'll instead write out the count of the first bucket explicitly and start the for loop below at the second list element. This should make it more clear.
Below, I thought it should be self-explanatory: if you have a bucket at index 42 and the next bucket is at 44, the indexDelta is 44 - 42 = 2. The number of empty buckets in between is 2 - 1 = 1.
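The scheme can be sketched as follows (a hypothetical illustration, not the actual EncodedHistogramData code): the first bucket index is written as-is, every following bucket as the delta to its predecessor; since indices are sorted and distinct, each delta is at least 1, and delta - 1 is the number of empty buckets in between.

```java
// Hypothetical sketch of the index delta encoding discussed above.
class IndexDeltaSketch {

    // Encode sorted, distinct bucket indices as [first, delta, delta, ...].
    static long[] encodeIndices(long[] indices) {
        long[] out = new long[indices.length];
        out[0] = indices[0]; // the first index is written explicitly
        for (int i = 1; i < indices.length; i++) {
            out[i] = indices[i] - indices[i - 1]; // always >= 1
        }
        return out;
    }

    // A delta of 2 (e.g. buckets 42 and 44) means one empty bucket (43).
    static long emptyBucketsBetween(long indexDelta) {
        return indexDelta - 1;
    }
}
```

For the example in the comment, indices 42 and 44 encode to [42, 2], and the delta of 2 implies one empty bucket.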
Nice.
Adds a new field type and mapper for OpenTelemetry exponential histograms.
The implementation was heavily inspired by (aka partly copy-pasted from) the existing HistogramFieldMapper.
Aggregations are not (yet) supported, but documents with this field type can be stored and loaded. I also did some manual testing with Discover in Kibana and things look fine there.
The new type looks like this:
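The original mapping/document example did not survive in this capture. As a purely hypothetical sketch (the field and sub-field names here are assumptions, not the final syntax), a document for an `exponential_histogram` field with the sparse representation might look roughly like:

```json
{
  "latency": {
    "scale": 3,
    "zero": { "count": 5 },
    "positive": { "indices": [4, 7, 9], "counts": [10, 30, 20] }
  }
}
```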
Every sub-field except for scale is optional.
This is not the most readable with the indices array, but this sparse representation is required, because a dense representation might not be possible for e.g. aggregation results. However, if desired, we could support a dense representation as an additional input format:
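The dense example is also missing from this capture; a hedged sketch of what an OTel-style dense input could look like (names borrowed from the OpenTelemetry data model, not confirmed by this PR) is:

```json
{
  "latency": {
    "scale": 3,
    "zero_count": 5,
    "positive": { "offset": 4, "bucket_counts": [10, 0, 0, 30, 0, 20] }
  }
}
```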
This would be exactly the opentelemetry representation. However, this adds some complexity:
I've not implemented this in this PR, as it is not strictly required. However, it would be great if you could give some feedback on whether we want to support this notation or not.