- Notifications
You must be signed in to change notification settings - Fork 25.7k
Configure index sorting through index settings for logsdb #118968
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
# Conflicts: # server/src/main/java/org/elasticsearch/index/IndexSettings.java # server/src/main/java/org/elasticsearch/index/IndexSortConfig.java # x-pack/plugin/logsdb/src/main/java/org/elasticsearch/xpack/logsdb/LogsdbIndexModeSettingsProvider.java
martijnvg left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think introducing a setting that is set by index provider when we can index sort on hostname is more robust.
I left a few comments, otherwise looks good!
| } | ||
| } | ||
| | ||
| // TODO: remove this when _source.mode attribute has been removed: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's keep this TODO here. After #119072 has been merged, let's try to optimize this in order to address the elastic/elasticsearch-benchmarks#2232 regression.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we optimize it, though? With this change, we still need to check dynamic templates etc for the presence of host.name.
| "index.logsdb.use_default_sort_config", | ||
| false, | ||
| Property.IndexScope, | ||
| Property.Final |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should add Property.ServerlessPublic also here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Iiuc Property.ServerlessPublic is used to allow users to set this. This setting should never be set by users, it's only introduced to have the logsdb settings provider control the sort config. Do I understand this correctly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe in that case we should add Property.PrivateIndex? I think this would ensure that only we can use it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call, done.
...gin/logsdb/src/main/java/org/elasticsearch/xpack/logsdb/LogsdbIndexModeSettingsProvider.java Outdated Show resolved Hide resolved
...gin/logsdb/src/main/java/org/elasticsearch/xpack/logsdb/LogsdbIndexModeSettingsProvider.java Outdated Show resolved Hide resolved
# Conflicts: # server/src/main/java/org/elasticsearch/index/IndexVersions.java
| I didn't realize the fact that more integrations are relying on dynamic mappings for host.name field. I think in that case we do want to inject the host.name field, so that we can configure host.name as primary index sort field. Then I think for this PR, we want to just check mappings for cases when injecting host.name field leads to mapping conflicts? I think just check that |
Makes sense to me.
Do we also need to check if |
Let me try to incorporate all this in the PR and add tests. |
# Conflicts: # rest-api-spec/build.gradle
| I think we can try to merge in the static host.name mapping using mapper service and check if it succeeds. If it does succeed, then we should sort on host.name and timestamp otherwise just on timestamp. |
| Will the merge succeed if the users' mapping also maps |
Good point. We should check whether it has doc values and otherwise fallback to index sorting only by timestamp. |
martijnvg left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
| "index.logsdb.use_default_sort_config", | ||
| false, | ||
| Property.IndexScope, | ||
| Property.Final |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe in that case we should add Property.PrivateIndex? I think this would ensure that only we can use it here.
💔 Backport failed
You can use sqren/backport to manually backport by running |
💚 All backports created successfully
Questions ?Please refer to the Backport tool documentation |
…8968) * Skip injecting `host.name` for incompatible mappings in logsdb mode * spotless * Update docs/changelog/118856.yaml * fix tsdb * Configure index sorting through index settings for logsdb * fix synthetic source usage * skip injecting host.name * fix test * fix compat * more tests * add index versioning * add index versioning * add index versioning * minor refactoring * Update docs/changelog/118968.yaml * address comments * inject host.name when possible * check subobjects * private settings (cherry picked from commit 5baf5af) # Conflicts: # rest-api-spec/build.gradle # server/src/main/java/org/elasticsearch/index/IndexVersions.java # server/src/main/resources/META-INF/services/org.elasticsearch.features.FeatureSpecification # x-pack/plugin/logsdb/src/main/java/org/elasticsearch/xpack/logsdb/LogsDBPlugin.java
) (#119936) * Configure index sorting through index settings for logsdb (#118968) * Skip injecting `host.name` for incompatible mappings in logsdb mode * spotless * Update docs/changelog/118856.yaml * fix tsdb * Configure index sorting through index settings for logsdb * fix synthetic source usage * skip injecting host.name * fix test * fix compat * more tests * add index versioning * add index versioning * add index versioning * minor refactoring * Update docs/changelog/118968.yaml * address comments * inject host.name when possible * check subobjects * private settings (cherry picked from commit 5baf5af) # Conflicts: # rest-api-spec/build.gradle # server/src/main/java/org/elasticsearch/index/IndexVersions.java # server/src/main/resources/META-INF/services/org.elasticsearch.features.FeatureSpecification # x-pack/plugin/logsdb/src/main/java/org/elasticsearch/xpack/logsdb/LogsDBPlugin.java * update index versions * [CI] Auto commit changes from spotless * update index versions --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
…astic#118968)" This reverts commit 5baf5af.
Indexes configured with logsdb mode currently use sorting on
[host.name, @timestamp]by default, with keywordhost.namegetting injected if not present. This can lead to problems, if e.g.hostis a leaf field, while sorting on an emptyhost.nameis wasteful.To fix this, the logsdb index settings provider is extended to detect when field
host.nameis present and trigger sorting on it (in addition to@timestamp) by injecting an index setting that is later used inIndexSortConfigduringIndexSettingsinitialization. Otherwise, logsdb indexes are only sorted on@timestampby default.The downside is that the provider may need to inspect mappings through
MapperServicemore often, a costly operation. The impact will be monitored in nightlies, once this is submitted, and logic will be further optimized as needed.Fixes #118686