Description
Elasticsearch Version
8.6.2
Installed Plugins
No response
Java Version
bundled
OS Version
3.16.0-4-amd64
Problem Description
When an aggregate_metric_double field is created with only a subset of the metrics sub-fields, the mapping for the downsampled index is created with the full complement of sub-fields. As a result, the downsampling process loses all docs.
Steps to Reproduce
Create a TSDS index with a mapping template like so:
{ "template": { "mappings": { "properties": { "@timestamp": { "type": "date", "format": "strict_date_optional_time" }, "host": { "type": "keyword", "time_series_dimension": true }, "calls": { "type": "aggregate_metric_double", "metrics": [ "sum" ], "default_metric": "sum", "time_series_metric": "gauge" } } } } }
Now create the TSDS data stream. You should end up with a backing index whose mapping looks something like:
{ ".ds-tsds-test-2023.05.11-000002": { "mappings": { "_data_stream_timestamp": { "enabled": true }, "properties": { "@timestamp": { "type": "date", "format": "strict_date_optional_time" }, "host": { "type": "keyword", "time_series_dimension": true }, "calls": { "type": "aggregate_metric_double", "metrics": [ "sum" ], "default_metric": "sum", "time_series_metric": "gauge" } } } } }
If you then have an ILM policy that downsamples in one of its phases, e.g.:
"warm": { "min_age": "7d", "actions": { "set_priority": { "priority": 40 }, "downsample": { "fixed_interval": "5m" }, "shrink": { "number_of_shards": 1 }, "forcemerge": { "max_num_segments": 1 } } }
and you trigger it, either by waiting the given time or by adjusting the index age, you will end up with a downsampled index that has 0 docs and a mapping like the following:
{ "shrink-ujei-downsample-te_v-.ds-tsds-test-2023.05.11-000001": { "mappings": { "_data_stream_timestamp": { "enabled": true }, "dynamic_templates": [ { "strings": { "match_mapping_type": "string", "mapping": { "type": "keyword" } } } ], "properties": { "@timestamp": { "type": "date", "meta": { "fixed_interval": "5m", "time_zone": "UTC" } }, "host": { "type": "keyword", "time_series_dimension": true }, "calls": { "type": "aggregate_metric_double", "metrics": [ "min", "max", "sum", "value_count" ], "default_metric": "max", "time_series_metric": "gauge" } } } } }
Notice how the min, max, and value_count sub-fields exist in the downsampled mapping but not in the template or the original mapping. This in turn makes the downsampled docs unindexable, so all data is dropped.
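To make the mismatch easier to spot, here is a minimal sketch that diffs the metrics lists of the two mappings. It only uses trimmed-down copies of the mappings quoted above as plain Python dicts. The helper names metric_sets and extra_metrics are hypothetical and not part of any Elasticsearch client, and the sketch only looks at top-level properties.

```python
from typing import Dict, Set


def metric_sets(mapping: dict) -> Dict[str, Set[str]]:
    """Collect the metrics of each top-level aggregate_metric_double field."""
    props = mapping.get("mappings", {}).get("properties", {})
    return {
        name: set(spec.get("metrics", []))
        for name, spec in props.items()
        if spec.get("type") == "aggregate_metric_double"
    }


def extra_metrics(source: dict, downsampled: dict) -> Dict[str, Set[str]]:
    """Return the metrics present in the downsampled mapping but not in the source."""
    src = metric_sets(source)
    dst = metric_sets(downsampled)
    return {
        name: dst[name] - src[name]
        for name in src.keys() & dst.keys()
        if dst[name] - src[name]
    }


# Trimmed copies of the mappings from this report (index name keys removed).
source_mapping = {
    "mappings": {
        "properties": {
            "host": {"type": "keyword", "time_series_dimension": True},
            "calls": {
                "type": "aggregate_metric_double",
                "metrics": ["sum"],
                "default_metric": "sum",
                "time_series_metric": "gauge",
            },
        }
    }
}

downsampled_mapping = {
    "mappings": {
        "properties": {
            "host": {"type": "keyword", "time_series_dimension": True},
            "calls": {
                "type": "aggregate_metric_double",
                "metrics": ["min", "max", "sum", "value_count"],
                "default_metric": "max",
                "time_series_metric": "gauge",
            },
        }
    }
}

diff = extra_metrics(source_mapping, downsampled_mapping)
print(sorted(diff["calls"]))  # ['max', 'min', 'value_count']
```

The same check could be run against the output of the get mapping API for the source and downsampled indices, after stripping the outer index-name key.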
Logs (if relevant)
[2023-05-11T20:11:25,576][ERROR][o.e.x.d.RollupShardIndexer] [es1.test.example.com] Shard [[.ds-tsds-test-2023.05.11-000001][4]] failed to populate rollup index. Failures: [{null=org.elasticsearch.index.mapper.MapperParsingException: failed to parse field [calls] of type [aggregate_metric_double] in a time series document at [2023-05-10T07:35:00.000Z]. Preview of field's value: 'null',...