kkrik-es
Contributor

This is a storage optimization that improves codec efficiency. It requires configuring more than one sort field (excluding @timestamp) to reduce the likelihood of hotspots in shard routing.
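
The description does not show the routing logic, so the following is only a minimal sketch of the idea, not the Elasticsearch implementation. It assumes the shard is picked from a hash of the sort-field values other than @timestamp, so that documents sharing those values land on the same shard. The helper name `route_on_sort_fields`, the SHA-256 hash, and the `|` separator are all hypothetical choices made for illustration.

```python
import hashlib


def route_on_sort_fields(doc, sort_fields, num_shards):
    """Pick a shard from a hash of the document's sort-field values.

    Hypothetical helper for illustration only; it is not the actual
    Elasticsearch routing code. @timestamp is excluded from the key.
    """
    routing_fields = [f for f in sort_fields if f != "@timestamp"]
    # Mirror the stated requirement: more than one sort field besides
    # @timestamp, to lower the chance that one value overloads a shard.
    if len(routing_fields) < 2:
        raise ValueError("need more than one sort field besides @timestamp")
    # Join the field values into one routing key (missing values become "").
    key = "|".join(str(doc.get(f, "")) for f in routing_fields)
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_shards


sort_fields = ["host.name", "host.os.name", "@timestamp"]
doc_a = {"host.name": "web-1", "host.os.name": "linux", "@timestamp": "2024-11-14T06:14:00Z"}
doc_b = {"host.name": "web-1", "host.os.name": "linux", "@timestamp": "2024-11-14T07:00:00Z"}
# Both documents share host.name and host.os.name, so they get the same shard.
print(route_on_sort_fields(doc_a, sort_fields, 4) == route_on_sort_fields(doc_b, sort_fields, 4))
```

Hashing a combination of fields rather than a single one is presumably what spreads load more evenly: one very common value in a single field would otherwise send most documents to one shard.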

Related to #109334

@elasticsearchmachine
Collaborator

Hi @kkrik-es, I've created a changelog YAML for you.

@kkrik-es
Contributor Author

kkrik-es commented Nov 14, 2024

EDIT: ignore this, more results below.

results.txt
ESBench results look really promising:

| Metric | Baseline | Contender | Diff | Unit | Diff % |
| --- | --- | --- | --- | --- | --- |
| Cumulative indexing time of primary shards | 1263.28 | 591.821 | -671.462 | min | -53.15% |
| Dataset size | 35.842 | 14.9673 | -20.8747 | GB | -58.24% |
| Store size | 35.842 | 14.9673 | -20.8747 | GB | -58.24% |

@kkrik-es kkrik-es requested a review from martijnvg November 14, 2024 06:14
@kkrik-es kkrik-es marked this pull request as ready for review November 14, 2024 06:14
@kkrik-es kkrik-es requested a review from a team as a code owner November 14, 2024 06:14
@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

# Conflicts:
#	server/src/main/java/org/elasticsearch/common/TimeBasedKOrderedUUIDGenerator.java
#	server/src/main/java/org/elasticsearch/common/UUIDs.java
@kkrik-es
Contributor Author

> ESBench results look really promising:

There was an issue with the run above: many indexing errors occurred due to missing sort field values. Updated results (sorting on [host.name, host.os.name, @timestamp]):

| Metric | Baseline | Contender | Diff | Unit | Diff % |
| --- | --- | --- | --- | --- | --- |
| Cumulative indexing time of primary shards | 1263.28 | 1158.67 | -104.609 | min | -8.28% |
| Dataset size | 35.842 | 29.9704 | -5.87159 | GB | -16.38% |
| Store size | 35.842 | 29.9704 | -5.87159 | GB | -16.38% |
| Segment count | 1052 | 687 | -365 | | -34.70% |

Smaller but still sizable wins, esp on indexing time.
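
The percentages in the updated results can be re-derived from the first two numeric columns. The sketch below assumes the first column is the baseline run and the second is the run with this change; the names `relative_change`, `baseline`, and `contender` are illustrative and not taken from the benchmark output.

```python
def relative_change(baseline, contender):
    """Return (absolute diff, percentage diff) of contender vs. baseline."""
    diff = contender - baseline
    return diff, 100.0 * diff / baseline


# Values copied from the updated results; column roles are an assumption.
rows = {
    "Cumulative indexing time of primary shards (min)": (1263.28, 1158.67),
    "Store size (GB)": (35.842, 29.9704),
    "Segment count": (1052, 687),
}

for metric, (baseline, contender) in rows.items():
    diff, pct = relative_change(baseline, contender)
    print(f"{metric}: {diff:+.3f} ({pct:+.2f}%)")
```

Small differences in the last digit of the absolute diff are expected, since the reported inputs are already rounded.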

# Conflicts: #	server/src/main/java/org/elasticsearch/cluster/routing/IndexRouting.java
Member

@martijnvg martijnvg left a comment

LGTM 🎉

@kkrik-es kkrik-es added the auto-backport (Automatically create backport pull requests when merged) and v8.18.0 labels Dec 19, 2024
@kkrik-es kkrik-es merged commit d80cbdd into elastic:main Dec 19, 2024
16 checks passed
@elasticsearchmachine
Collaborator

💔 Backport failed

| Status | Branch | Result |
| --- | --- | --- |
| | 8.x | Commit could not be cherrypicked due to conflicts |

You can use sqren/backport to manually backport by running `backport --upstream elastic/elasticsearch --pr 116687`

@kkrik-es
Contributor Author

💚 All backports created successfully

Status Branch Result
8.x

Questions ?

Please refer to the Backport tool documentation

kkrik-es added a commit to kkrik-es/elasticsearch that referenced this pull request Dec 19, 2024
* Add LogsDB option to route on sort fields
* fix encoding
* Update docs/changelog/116687.yaml
* tests
* tests
* tests
* fix mode
* tests
* tests
* tests
* add test
* fix test
* sync
* updates from review
* test fixes
* test fixes
* test fixes
* Move logic to SyntheticSourceIndexSettingsProvider
* fix test
* sync
* merge, no fallback
* comments
* fix test
* address comments
* address comments
* address comments
* Update x-pack/plugin/logsdb/src/main/java/org/elasticsearch/xpack/logsdb/LogsdbIndexModeSettingsProvider.java

  Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>
* [CI] Auto commit changes from spotless
* update tests
* [CI] Auto commit changes from spotless
* update tests
* fix rest compat tests

---------

Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>

(cherry picked from commit d80cbdd)

# Conflicts:
#	rest-api-spec/build.gradle
#	server/src/main/java/org/elasticsearch/common/TimeBasedKOrderedUUIDGenerator.java
elasticsearchmachine pushed a commit that referenced this pull request Dec 19, 2024
* Add LogsDB option to route on sort fields (#116687)
* Add LogsDB option to route on sort fields
* fix encoding
* Update docs/changelog/116687.yaml
* tests
* tests
* tests
* fix mode
* tests
* tests
* tests
* add test
* fix test
* sync
* updates from review
* test fixes
* test fixes
* test fixes
* Move logic to SyntheticSourceIndexSettingsProvider
* fix test
* sync
* merge, no fallback
* comments
* fix test
* address comments
* address comments
* address comments
* Update x-pack/plugin/logsdb/src/main/java/org/elasticsearch/xpack/logsdb/LogsdbIndexModeSettingsProvider.java

  Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>
* [CI] Auto commit changes from spotless
* update tests
* [CI] Auto commit changes from spotless
* update tests
* fix rest compat tests

---------

Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>

(cherry picked from commit d80cbdd)

# Conflicts:
#	rest-api-spec/build.gradle
#	server/src/main/java/org/elasticsearch/common/TimeBasedKOrderedUUIDGenerator.java

* Update LogsIndexingIT.java
Labels

auto-backport (Automatically create backport pull requests when merged), backport pending, >enhancement, :StorageEngine/Logs (You know, for Logs), Team:StorageEngine, test-full-bwc (Trigger full BWC version matrix tests), v8.18.0, v9.0.0

3 participants