Add _metric_names_hash field to OTel metric mappings #120952
If metrics that have the same timestamp and dimensions aren't grouped into the same document, ES will consider them to be duplicates. The _metric_names_hash field will be set by the OTel ES exporter. As it's mapped as a time_series_dimension, it creates a different _tsid for documents with different sets of metrics. The tradeoff is that if the composition of the metrics grouping changes over time, a different _tsid will be created. That affects the rate aggregation for counters, since a counter's samples are then split across separate time series.
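As a rough illustration of the idea described above, the following sketch derives an 8-digit hex hash from the sorted metric names of a document and adds it to the series identity. The function names (`fnv1a_32`, `metric_names_hash`, `series_key`), the choice of 32-bit FNV-1a, and the NUL separator are assumptions made for this example; the actual hash used by the OTel ES exporter and the way Elasticsearch derives `_tsid` may differ.

```python
def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a hash (assumed here only to get an 8-digit hex value)."""
    h = 0x811C9DC5  # FNV offset basis
    for byte in data:
        h ^= byte
        h = (h * 0x01000193) & 0xFFFFFFFF  # FNV prime, truncated to 32 bits
    return h


def metric_names_hash(metric_names) -> str:
    """Hash the set of metric names in a document, independent of order."""
    # Sorting makes the hash stable regardless of the order metrics arrive in.
    joined = "\x00".join(sorted(set(metric_names)))
    return f"{fnv1a_32(joined.encode('utf-8')):08x}"


def series_key(dimensions: dict, metric_names) -> tuple:
    """Hypothetical stand-in for a _tsid: dimensions plus the names hash."""
    return (tuple(sorted(dimensions.items())), metric_names_hash(metric_names))


dims = {"host.name": "web-1"}
doc_a = series_key(dims, ["cpu.usage", "memory.usage"])
doc_b = series_key(dims, ["disk.io"])

# Same dimensions, different metric sets: the keys differ, so two documents
# with the same timestamp would no longer look like duplicates.
print(doc_a != doc_b)
```

The same sketch also shows the tradeoff: if `memory.usage` is later grouped into a different document, the hash for `cpu.usage` changes and its samples land under a new key.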
Pinging @elastic/es-data-management (Team:Data Management)
Hi @felixbarny, I've created a changelog YAML for you.
priority: 10
# workaround for https://github.com/elastic/elasticsearch/issues/99123
_metric_names_hash:
  type: keyword
Q: Would a number be more lightweight, since you're using an 8-digit hex value anyway?
At the moment, numbers can't leverage run-length encoding. So it's actually lighter to use a keyword here, as all dimensions are incorporated into the _tsid, which we sort by. Therefore, all values for the same _tsid are equal and can be compressed very efficiently.
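A toy illustration of the compression argument above: when documents are sorted by `_tsid`, every document in a series carries the same hash value, so the column collapses into one run per series. The `run_length_encode` helper and the sample values are made up for this example; this is not how Elasticsearch or Lucene actually encodes the field.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


# Hypothetical (tsid, _metric_names_hash) pairs in arrival order.
docs = [
    ("tsid-b", "9c01d2e4"),
    ("tsid-a", "1f3a9c02"),
    ("tsid-b", "9c01d2e4"),
    ("tsid-a", "1f3a9c02"),
    ("tsid-a", "1f3a9c02"),
]

unsorted_runs = run_length_encode([h for _, h in docs])
sorted_runs = run_length_encode([h for _, h in sorted(docs)])

print(len(unsorted_runs))  # one run per change of value in arrival order
print(sorted_runs)         # one run per series once sorted by tsid
```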
I had a discussion with @martijnvg about this last week. The conclusion was that this change makes the consequences of imperfect grouping much less severe, and we should therefore move forward with it. It's not a replacement for improving the grouping logic. But it's much better to end up with a different time series than to drop metrics. Having to debug why the rate aggregation isn't working properly in some cases is a much less stressful situation than debugging a data loss scenario. Longer-term, it seems like we'll go down the one-metric-per-doc route, where grouping of metrics isn't required anymore.
LGTM
…tions (#37511) If metrics that have the same timestamp and dimensions aren't grouped into the same document, ES will consider them to be a duplicate. This adds a hash of the metric names that will be mapped as a dimension in Elasticsearch. The tradeoff is that if the composition of the metrics grouping changes over time, a new time series will be created. That has an impact on the rate aggregation for counters. ES mapping changes: elastic/elasticsearch#120952 --------- Co-authored-by: Carson Ip <carsonip@users.noreply.github.com>
💚 All backports created successfully
If metrics that have the same timestamp and dimensions aren't grouped into the same document, ES will consider them to be a duplicate. The _metric_names_hash field will be set by the OTel ES exporter. As it's mapped as a time_series_dimensions, it creates a different _tsid for documents with different sets of metrics. The tradeoff is that if the composition of the metrics grouping changes over time, a different _tsid will be created. That has an impact on the rate aggregation for counters. (cherry picked from commit 5e8865d)
If metrics that have the same timestamp and dimensions aren't grouped into the same document, ES will consider them to be a duplicate. The _metric_names_hash field will be set by the OTel ES exporter. As it's mapped as a time_series_dimensions, it creates a different _tsid for documents with different sets of metrics. The tradeoff is that if the composition of the metrics grouping changes over time, a different _tsid will be created. That has an impact on the rate aggregation for counters. (cherry picked from commit 5e8865d) Co-authored-by: Felix Barnsteiner <felixbarny@users.noreply.github.com>
…elastic#126850) Bump otel-data plugin version as elastic#120952 missed the bump.
…elastic#126850) Bump otel-data plugin version as elastic#120952 missed the bump. (cherry picked from commit 5860ccb) # Conflicts: # x-pack/plugin/otel-data/src/main/resources/resources.yaml
A short-term workaround for #99123
If metrics that have the same timestamp and dimensions aren't grouped into the same document, ES will consider them to be duplicates. The _metric_names_hash field will be set by the OTel ES exporter (see open-telemetry/opentelemetry-collector-contrib#37511). As it's mapped as a time_series_dimension, it creates a different _tsid for documents with different sets of metrics. The tradeoff is that if the composition of the metrics grouping changes over time, a different _tsid will be created. That has an impact on the rate aggregation for counters.
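To make the counter tradeoff concrete, here is a deliberately simplified model in which a counter's increase is computed per series and then summed. The `increase_per_series` helper and the sample data are hypothetical, and this is not the implementation of the Elasticsearch rate aggregation; it only shows why a change in grouping can lose the delta that spans the switch from one series to the next.

```python
from collections import defaultdict


def increase_per_series(samples):
    """Sum counter increases within each series.

    samples: iterable of (series_key, timestamp, counter_value).
    Deltas are only taken between consecutive points of the same series;
    negative deltas are clamped to zero as a crude stand-in for resets.
    """
    by_series = defaultdict(list)
    for key, ts, value in samples:
        by_series[key].append((ts, value))

    total = 0
    for points in by_series.values():
        points.sort()
        total += sum(max(b[1] - a[1], 0) for a, b in zip(points, points[1:]))
    return total


# Grouping stays stable: all four samples share one series.
stable = [("A", 1, 100), ("A", 2, 110), ("A", 3, 120), ("A", 4, 130)]

# Grouping changes after the second sample: the hash, and so the series, changes.
split = [("A", 1, 100), ("A", 2, 110), ("B", 3, 120), ("B", 4, 130)]

print(increase_per_series(stable))  # 30
print(increase_per_series(split))   # 20, the 110 -> 120 step is not counted
```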