Data distribution improvements #2526

etschannen · 2020-01-10T02:30:45Z

These changes improve how data distribution keeps load balanced across servers, how shard merges are handled, and how data distribution functions when recovering from a storage server failure.

xumengpanda · 2020-01-10T04:39:36Z

fdbserver/DataDistribution.actor.cpp

+bestSize = teamList[j]->size();
 }
 }
+break;


Suppose completeSources has servers: s1, s2, s3, s4, s5; none of s1's teams is a subset of {s1, s2, s3, s4, s5}, which means found = false and bestOption is not found.
However, is it possible that s2 has a team {s2, s3, s4}? In this case, the bestOption exists but the for-loop at Line 765 fails to find it.

xumengpanda · 2020-01-10T04:41:53Z

fdbserver/DataDistribution.actor.cpp

-auto& teamList = self->server_info[ req.sources[i] ]->teams;
+int bestSize = 0;
+for( int i = 0; i < req.completeSources.size(); i++ ) {
+if( self->server_info.count( req.completeSources[i] ) ) {


I think changing the if to
if( !self->server_info.count( req.completeSources[i] ) ) {continue;} is more intuitive.

The purpose of the for-loop is only to find the first server of req.completeSources that is in server_info. It does not loop back once it finds the first such server.

xumengpanda · 2020-01-10T04:56:18Z

fdbserver/DataDistribution.actor.cpp

-}
-
 int bestSize = 0;
 for( int i = 0; i < req.completeSources.size(); i++ ) {


Not sure how often duplicate servers exist in req.completeSources.
If duplicates are not rare, using the completeSources set calculated at the beginning of this function has a performance benefit, since the getTeam() function is on the hot path.

Complete sources should never have duplicates; it was only turned into a set so that count can be used to look up whether an id exists. getTeam is only called roughly once per shard moved, so it is not hot enough to cause CPU problems for the DD algorithm.

xumengpanda · 2020-01-10T05:15:35Z

fdbserver/DataDistributionTracker.actor.cpp

 auto shardBounds = getShardSizeBounds( merged, maxShardSize );
 if( endingStats.bytes >= shardBounds.min.bytes ||
 getBandwidthStatus( endingStats ) != BandwidthStatusLow ||
+now() - lastLowBandwidthTime < SERVER_KNOBS->DD_LOW_BANDWIDTH_DELAY ||


I understand and agree we should not merge a shard unless it has been in lowBandwidth status for a while.

This change will increase the number of shards. Do we have a rough estimate of how many extra shards it may create? (The DD_LOW_BANDWIDTH_DELAY value may affect the number.)

Is the current number of shards close to the maximum number of shards a cluster can tolerate without experiencing performance issues?

The effect this will have on the number of shards is heavily dependent on the client workload. There is very little risk that this will increase the shard count enough to impact the database performance, but we should monitor the number of shards after we roll out this change.

ajbeamon · 2020-01-10T17:31:41Z

@fdb-build test this please

ajbeamon · 2020-01-10T17:44:35Z

fdbserver/DataDistribution.actor.cpp

+std::set<UID> completeSources;
+for( int i = 0; i < req.completeSources.size(); i++ ) {
+completeSources.insert( req.completeSources[i] );
+}


This can also be written as:

std::set<UID> completeSources(req.completeSources.begin(), req.completeSources.end());

If you use this constructor, you should also be able to mark the set as const, which would clear up any ambiguity about whether you are duplicating the original data in order to modify it or just to use it as a lookup.
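
A minimal standalone sketch of the suggestion above. It assumes `int` as a stand-in for `UID`, and the helper names `makeLookup` and `contains` are hypothetical, not taken from the PR:

```cpp
#include <set>
#include <vector>

// Hypothetical stand-in for UID; the real type lives in the FoundationDB code base.
using Id = int;

// Builds a lookup set from a source list in one step via the range constructor.
std::set<Id> makeLookup(const std::vector<Id>& sources) {
    return std::set<Id>(sources.begin(), sources.end());
}

// The set is const here, which signals it is only queried and never modified.
bool contains(const std::vector<Id>& sources, Id id) {
    const std::set<Id> lookup = makeLookup(sources);
    return lookup.count(id) > 0;
}
```

Marking the set const makes the intent (lookup only) visible at the declaration, which is the ambiguity the comment is concerned with.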

ajbeamon · 2020-01-10T17:51:31Z

fdbserver/DataDistribution.actor.cpp

-auto& teamList = self->server_info[ req.sources[i] ]->teams;
+int bestSize = 0;
+for( int i = 0; i < req.completeSources.size(); i++ ) {
+if( self->server_info.count( req.completeSources[i] ) ) {


Since we are looking for the first value to satisfy some criteria, we could also write this using std::find_if. I'm ok with the current code or Meng's suggestion, too, though.
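
For illustration, a rough sketch of what the std::find_if form could look like. It assumes `int` in place of `UID` and a plain `std::map` in place of `server_info`; the function name `firstKnownSource` is hypothetical:

```cpp
#include <algorithm>
#include <map>
#include <vector>

using Id = int; // hypothetical stand-in for UID

// Returns an iterator to the first source that has an entry in serverInfo,
// or sources.end() if no source is known.
std::vector<Id>::const_iterator firstKnownSource(const std::vector<Id>& sources,
                                                 const std::map<Id, int>& serverInfo) {
    return std::find_if(sources.begin(), sources.end(),
                        [&serverInfo](Id id) { return serverInfo.count(id) > 0; });
}
```

The caller compares the result against `sources.end()` instead of relying on a `break` inside a loop body.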

ajbeamon · 2020-01-10T18:10:11Z

fdbserver/DataDistribution.actor.cpp

+bestSize = teamList[j]->size();
 }
 }
+break;


I guess also you could have a situation where s1 has a valid team, but there exists a larger one that doesn't include it.
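
The concern in this thread can be sketched as a search over the teams of every complete source rather than only the first one. This is an illustration under stated assumptions, not the code that was merged: `Id`, `Team`, `teamsByServer`, and `findTeamWithinSources` are all hypothetical names.

```cpp
#include <algorithm>
#include <map>
#include <set>
#include <vector>

using Id = int;               // hypothetical stand-in for UID
using Team = std::vector<Id>; // hypothetical: a team is a list of server ids

// Looks through the teams of every complete source (not only the first one)
// and returns the first team whose members all belong to completeSources.
// Returns nullptr when no such team exists.
const Team* findTeamWithinSources(const std::vector<Id>& completeSources,
                                  const std::map<Id, std::vector<Team>>& teamsByServer) {
    const std::set<Id> sourceSet(completeSources.begin(), completeSources.end());
    for (Id server : completeSources) {
        auto it = teamsByServer.find(server);
        if (it == teamsByServer.end()) continue; // server is not tracked
        for (const Team& team : it->second) {
            bool subset = std::all_of(team.begin(), team.end(),
                [&sourceSet](Id member) { return sourceSet.count(member) > 0; });
            if (subset) return &team;
        }
    }
    return nullptr;
}
```

With completeSources {s1, ..., s5}, where s1 only belongs to teams that include outside servers and s2 belongs to {s2, s3, s4}, this search still finds {s2, s3, s4}, which is the case raised in the review.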

ajbeamon · 2020-01-10T18:37:25Z

fdbserver/DataDistributionTracker.actor.cpp

 StorageMetrics metrics = wait( tr.waitStorageMetrics( keys, bounds.min, bounds.max, bounds.permittedError, CLIENT_KNOBS->STORAGE_METRICS_SHARD_LIMIT ) );
+BandwidthStatus newBandwidthStatus = getBandwidthStatus( metrics );
+if(newBandwidthStatus == BandwidthStatusLow && bandwidthStatus != BandwidthStatusLow) {
+lastLowBandwidthTime = now();


I think this name might cause some confusion: to me it implies the last time the shard was at low bandwidth rather than the time when low bandwidth started. Maybe something like lastLowBandwidthStartTime would be clearer?
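
To make the distinction concrete, here is a small sketch of stamping the time only on the transition into low bandwidth and gating merges on the elapsed time. The struct, its method names, and the enum values other than BandwidthStatusLow are assumptions for illustration; time is passed in as a plain double instead of calling now().

```cpp
// Only BandwidthStatusLow appears in the diff; the other values are assumed.
enum BandwidthStatus { BandwidthStatusLow, BandwidthStatusNormal, BandwidthStatusHigh };

// Hypothetical tracker; field and method names are illustrative, not from the PR.
struct LowBandwidthTracker {
    BandwidthStatus status = BandwidthStatusNormal;
    double lastLowBandwidthStartTime = 0.0;

    // Record a new sample; the start time is stamped only on the transition into Low,
    // so repeated Low samples do not reset the clock.
    void update(BandwidthStatus newStatus, double now) {
        if (newStatus == BandwidthStatusLow && status != BandwidthStatusLow) {
            lastLowBandwidthStartTime = now;
        }
        status = newStatus;
    }

    // True only after the shard has stayed Low for at least `delay` seconds.
    bool lowLongEnough(double now, double delay) const {
        return status == BandwidthStatusLow && now - lastLowBandwidthStartTime >= delay;
    }
};
```

In this sketch a delay of 300 seconds would correspond to the 5 minutes mentioned in the commit message.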

ajbeamon · 2020-01-10T21:52:04Z

The change to the LAST_LIMITED_RATIO knob may resolve #1884.

etschannen added 4 commits January 9, 2020 16:59

Data distribution no longer attempts to pick teams which share members of the source unless the team matches exactly

ab70719

Data distribution will not merge a shard unless it has been low bandwidth for 5 minutes

e4fa4ad

raised the priority of shard merges, because the tracker cannot track an unmerged shard

9842272

batch priority must be heavily throttled before stopping data distribution rebalancing

02a8e8d

etschannen requested a review from ajbeamon January 10, 2020 02:30

dongxinEric reviewed Jan 10, 2020

fdbserver/Knobs.cpp (resolved)

xumengpanda reviewed Jan 10, 2020

ajbeamon reviewed Jan 10, 2020

fix: completeSources could be larger than the teamSize, so we need to check all completeSources we do not need to track bestSize, since all teams in the list will be the same size

c2608f0

xumengpanda approved these changes Jan 10, 2020

Added a trace event to warn if a shard is merged before enough time has elapses from becoming low bandwidth

9b80498

ajbeamon approved these changes Jan 11, 2020

etschannen added 2 commits January 10, 2020 16:28

HasBeenTrueFor was ready immediately after a previous shard merge

fde53cb

wantsToMerge was created before the shardEvaluator has a chance to update it based on shardSize changes

b331c5d

etschannen merged commit 17e97f2 into apple:release-6.2 Jan 11, 2020

etschannen deleted the feature-dd-improvements branch January 13, 2020 22:17
