Skip to content

Conversation

@idegtiarenko
Copy link
Contributor

This change takes into account expectedShardSize when simulating new shard initialization during desired balance computation.
This should reduce balance movements when restoring a shard from snapshot as it has known size beforehand.

@idegtiarenko idegtiarenko added >enhancement :Distributed Indexing/Distributed A catch all label for anything in the Distributed Indexing Area. Please avoid if you can. :Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) v8.9.0 labels May 2, 2023
@idegtiarenko idegtiarenko requested a review from DaveCTurner May 2, 2023 12:50
@idegtiarenko idegtiarenko added Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. and removed :Distributed Indexing/Distributed A catch all label for anything in the Distributed Indexing Area. Please avoid if you can. labels May 2, 2023
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I guess there's no need for a max here since the caller ignores negative values.

Suggested change
return Math.max(0L, shard.getExpectedShardSize());
return shard.getExpectedShardSize();
…sage This change asserts that produced balance does not utilize all disk space.
@idegtiarenko idegtiarenko force-pushed the assert_balanced_disk_usage branch from eaca950 to 5aeb4d9 Compare May 2, 2023 14:01
if (replicaNodeId != null) {
dataPath.put(new NodeAndShard(replicaNodeId, shardId), "/data");
usedDiskSpace.compute(primaryNodeId, (k, v) -> v + thisShardSize);
usedDiskSpace.compute(replicaNodeId, (k, v) -> v + thisShardSize);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was causing some of the assertions failing

@elasticsearchmachine
Copy link
Collaborator

Hi @idegtiarenko, I've created a changelog YAML for you.

var deviation = randomIntBetween(0, 50) - 100L;
return originalSize * (1000 + deviation) / 1000;
var deviation = randomIntBetween(0, 10) - 5;
return originalSize * (100 + deviation) / 100;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously this was only producing shards that are 5-10% smaller then original size, while this should generate -5..+5%

@idegtiarenko idegtiarenko requested a review from DaveCTurner May 3, 2023 07:49
Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM still. I think there were only really test changes here, but you force-pushed so I can't see the diffs.

private static long smallShardSizeDeviation(long originalSize) {
var deviation = randomIntBetween(0, 50) - 100L;
return originalSize * (1000 + deviation) / 1000;
var deviation = randomIntBetween(0, 10) - 5;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit/suggestion:

Suggested change
var deviation = randomIntBetween(0, 10) - 5;
var deviation = randomIntBetween(-5, 5);

or maybe even drop the 100 + and do this:

Suggested change
var deviation = randomIntBetween(0, 10) - 5;
var deviation = randomIntBetween(95, 105);
@idegtiarenko idegtiarenko merged commit 251f923 into elastic:main May 4, 2023
@idegtiarenko idegtiarenko deleted the assert_balanced_disk_usage branch May 4, 2023 08:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) >enhancement Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. v8.9.0

3 participants