24-4: Fix follower assertions on attach snapshot races #15077
Merged
Changelog entry
Fixed a rare assertion (process crash) when followers attached to leaders with an inconsistent snapshot. Fixes #15042.
Changelog category
Description for reviewers
Followers periodically crashed in production complaining about log reordering, with an error message suggesting they had tried to apply a duplicate redo log entry (which shouldn't have been possible). It turns out that snapshots created within read-only transactions that used QueueScan (e.g. ReadTable and ScanQuery) persisted an incorrect Serial field (a monotonically increasing change number) that was equal to that of the next transaction. When a follower attached at just the right time, it could bootstrap from such a snapshot and then discover that the next commit has the same Serial, indicating a duplicate or reordered change.

Thankfully this didn't affect leaders, since they apply pre-snapshot and post-snapshot redo log entries together and only use the snapshot serial as a hint of previously compacted changes. So even though the snapshot technically had an inconsistent value, it was self-healing and couldn't produce any externally visible inconsistencies.
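
For reviewers who want a concrete picture of the race, here is a toy model of the follower side. It is only a sketch under assumptions: the names `Leader`, `Follower`, `Snapshot`, `DuplicateSerial` and `attach_and_apply_next` are hypothetical and do not correspond to the actual code in this PR. It assumes the follower rejects any commit whose serial is not strictly greater than the last serial it knows about.

```python
from dataclasses import dataclass


class DuplicateSerial(Exception):
    """Hypothetical stand-in for the follower's assertion failure."""


@dataclass
class Snapshot:
    serial: int  # change number recorded in the snapshot


class Leader:
    """Toy leader: every commit gets the next serial."""

    def __init__(self) -> None:
        self.last_committed = 0

    def commit(self) -> int:
        self.last_committed += 1
        return self.last_committed

    def make_snapshot(self, buggy: bool) -> Snapshot:
        # Assumed model of the bug: the snapshot records the serial of the
        # NEXT transaction instead of the last committed one.
        serial = self.last_committed + 1 if buggy else self.last_committed
        return Snapshot(serial)


class Follower:
    """Toy follower: bootstraps from a snapshot, then applies commits."""

    def __init__(self, snapshot: Snapshot) -> None:
        self.last_serial = snapshot.serial

    def apply(self, serial: int) -> None:
        # A serial that does not advance looks like a duplicate or
        # reordered change.
        if serial <= self.last_serial:
            raise DuplicateSerial(
                f"serial {serial} is not above {self.last_serial}"
            )
        self.last_serial = serial


def attach_and_apply_next(buggy: bool) -> bool:
    """Return True if the follower accepts the commit after attaching."""
    leader = Leader()
    leader.commit()  # some earlier history
    follower = Follower(leader.make_snapshot(buggy))
    try:
        follower.apply(leader.commit())
    except DuplicateSerial:
        return False
    return True
```

In this model `attach_and_apply_next(buggy=False)` accepts the next commit, while `attach_and_apply_next(buggy=True)` rejects it because the snapshot already claims the serial that the next commit carries.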