martijnvg
Member

This change adds an index.look_back_time index setting that sets the index.time_series.start_time setting for the first backing index when a data stream is created.

This allows accepting older data during initial indexing without changing the index.look_ahead_time setting. Changing index.look_ahead_time is undesirable for this purpose, because it also controls the index.time_series.end_time setting and would affect rollovers as well.

The default for index.look_back_time is 2h, which means documents with an @timestamp up to 2 hours before the creation of the data stream are allowed to be indexed. This matches the behavior without this change, where index.look_ahead_time is used to set index.time_series.start_time of the first backing index.
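As a rough illustration of the window described above, here is a minimal Python sketch. It is not the Elasticsearch implementation: the function names are hypothetical, the 2h look-ahead default is taken from the "$now + 2h" wording later in this thread, and treating the start as inclusive and the end as exclusive is an assumption.

```python
from datetime import datetime, timedelta, timezone


def first_backing_index_window(
    created_at,
    look_back=timedelta(hours=2),   # models index.look_back_time (default 2h per this PR)
    look_ahead=timedelta(hours=2),  # models index.look_ahead_time (assumed 2h here)
):
    """Return (start_time, end_time) for the first backing index.

    Hypothetical helper: start_time comes from the look-back setting,
    end_time from the look-ahead setting.
    """
    start_time = created_at - look_back
    end_time = created_at + look_ahead
    return start_time, end_time


def accepts(timestamp, window):
    """Check whether a document @timestamp falls inside the window.

    Assumes an inclusive start and an exclusive end.
    """
    start_time, end_time = window
    return start_time <= timestamp < end_time


created = datetime(2023, 8, 16, 12, 0, tzinfo=timezone.utc)

# With the defaults, a document that is 3 hours old is outside the window.
default_window = first_backing_index_window(created)
print(accepts(created - timedelta(hours=3), default_window))

# With a 6h look-back, the same document falls inside the window,
# while the end of the window (and so rollover timing) is unchanged.
wide_window = first_backing_index_window(created, look_back=timedelta(hours=6))
print(accepts(created - timedelta(hours=3), wide_window))
```

The point of the sketch is that widening the look-back only moves the start of the window; the end stays tied to the look-ahead setting.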

Closes #98463

@elasticsearchmachine
Collaborator

Hi @martijnvg, I've created a changelog YAML for you.

@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Aug 16, 2023
@lalit-satapathy

The default for the index.look_back_time is 2h, which means documents with @timestamp up to 2 hours before creation of the data stream are allowed to be indexed.

I'm wondering what an ideal default look back time would be, and whether we should make it higher than 2h. A higher default would give a bigger TSDB time window and so reduce the probability of documents with older timestamps being dropped. Are there any risks in making it higher?

@martijnvg
Member Author

The look back setting is useful for metric integrations that have some data from before $now - 2h. The expectation is that the volume of this older data would be lower than the volume of data received between $now - 2h and $now + 2h.

The default for the look back setting will remain 2h (a start time of $now - 2h), and integrations that need to accept initial data older than $now - 2h can override this setting in the template.
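To make the template override concrete, here is a hedged Python sketch that only builds a request body for an index template. The helper name, the index pattern and the 6h value are made up for illustration. A real TSDB template would also need mappings with dimension fields, which are left out here.

```python
import json


def tsdb_template_body(index_pattern, look_back_time):
    """Build an index template body that overrides index.look_back_time.

    Hypothetical helper for illustration only; mappings are omitted.
    """
    return {
        "index_patterns": [index_pattern],
        "data_stream": {},
        "template": {
            "settings": {
                "index.mode": "time_series",
                # Accept initial data older than the 2h default.
                "index.look_back_time": look_back_time,
            }
        },
    }


body = tsdb_template_body("metrics-example-*", "6h")
print(json.dumps(body, indent=2))
```

Because the setting only affects the first backing index, an integration would set it in the template before the data stream is created.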


Labels

>enhancement :StorageEngine/TSDB You know, for Metrics Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.11.0

4 participants