snaury commented Feb 26, 2025

Changelog entry

Fixed a rare assertion failure (process crash) that occurred when followers attached to leaders with an inconsistent snapshot. Fixes #15042.

Changelog category

  • Bugfix

Description for reviewers

Followers periodically crashed in production with complaints about log reordering; the error message suggested they had tried to apply a duplicate redo log entry, which should not have been possible. It turns out that snapshots created within read-only transactions that used QueueScan (e.g. ReadTable and ScanQuery) persisted an incorrect Serial field (a monotonically increasing change number) equal to that of the next transaction. When a follower attached at just the right time, it could bootstrap from such a snapshot and then discover that the next commit had the same Serial, which indicates a duplicate or reordered change.
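
To make the failure mode concrete, here is a toy model in Python (not the actual tablet executor code). The names `Snapshot`, `Follower`, `make_snapshot`, and `LogReorderingError` are hypothetical, and the model assumes that a correct snapshot records the Serial of the last committed change, while the buggy one records the Serial of the next transaction.

```python
from dataclasses import dataclass


class LogReorderingError(Exception):
    """Raised when a redo entry does not advance the serial (hypothetical name)."""


@dataclass
class Snapshot:
    serial: int  # change number the snapshot claims to already include


def make_snapshot(last_committed: int, buggy: bool) -> Snapshot:
    # Assumption: the correct value is the last committed serial.
    # The buggy path stores the serial of the *next* transaction instead.
    return Snapshot(serial=last_committed + 1 if buggy else last_committed)


class Follower:
    def __init__(self, snapshot: Snapshot) -> None:
        # A follower bootstraps its position from the snapshot alone.
        self.serial = snapshot.serial

    def apply_redo(self, entry_serial: int) -> None:
        # Serials must strictly increase; an equal or smaller value looks
        # like a duplicate or reordered redo log entry.
        if entry_serial <= self.serial:
            raise LogReorderingError(
                f"entry serial {entry_serial} <= current serial {self.serial}"
            )
        self.serial = entry_serial
```

In this sketch, a follower built from `make_snapshot(10, buggy=False)` accepts the next commit with Serial 11, while one built from `make_snapshot(10, buggy=True)` already sits at 11 and rejects it as a duplicate.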

Thankfully, this didn't affect leaders, since they apply pre-snapshot and post-snapshot redo log entries together and use the snapshot Serial only as a hint about previously compacted changes. So even though the snapshot technically had an inconsistent value, it was self-healing and couldn't produce any externally visible inconsistencies.
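
A similarly simplified sketch of why a leader tolerates the bad value, under the assumption that recovery replays every redo entry in order and treats the snapshot Serial only as a lower bound for changes that were already compacted away. The function `recover_leader` and its parameters are hypothetical and do not mirror the real recovery code.

```python
from typing import Iterable


def recover_leader(snapshot_serial_hint: int, redo_serials: Iterable[int]) -> int:
    """Return the serial a leader ends up with after recovery (toy model)."""
    serial = 0
    for entry_serial in redo_serials:
        # Ordering is checked between redo entries themselves,
        # not against the serial stored in the snapshot.
        if entry_serial <= serial:
            raise ValueError(f"redo entry {entry_serial} is out of order")
        serial = entry_serial
    # Assumption: the snapshot serial only matters when it is ahead of the
    # redo log, i.e. when the earlier changes were compacted away.
    return max(serial, snapshot_serial_hint)
```

In this model, `recover_leader(10, [9, 10, 11])` and `recover_leader(11, [9, 10, 11])` produce the same result, so an off-by-one hint changes nothing once the post-snapshot commit has been replayed.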

snaury self-assigned this Feb 26, 2025
github-actions bot commented Feb 26, 2025

2025-02-26 14:23:21 UTC Pre-commit check for b9dbf12 has started.
2025-02-26 14:26:07 UTC Build linux-x86_64-relwithdebinfo is running...
🟢 2025-02-26 15:06:43 UTC Build successful.
2025-02-26 15:06:59 UTC Tests are running...
🔴 2025-02-26 16:36:55 UTC Some tests failed, follow the links below.

Test history | Test log

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
|---|---|---|---|---|---|
| 15019 | 13659 | 0 | 23 | 1301 | 36 |

🟢 2025-02-26 16:37:52 UTC ydbd size 8.3 GiB changed* by +792 Bytes, which is < 100.0 KiB vs stable-24-4: OK

| ydbd size dash | stable-24-4: bd3c9d7 | merge: b9dbf12 | diff | diff % |
|---|---|---|---|---|
| ydbd size | 8 896 418 032 Bytes | 8 896 418 824 Bytes | +792 Bytes | +0.000% |
| ydbd stripped size | 484 973 160 Bytes | 484 973 224 Bytes | +64 Bytes | +0.000% |

*Please be aware that the difference is based on comparing your commit with the last completed post-commit build; check the comparison.

github-actions bot commented Feb 26, 2025

2025-02-26 14:23:43 UTC Pre-commit check for b9dbf12 has started.
2025-02-26 14:26:24 UTC Build linux-x86_64-release-asan is running...
🟢 2025-02-26 14:51:03 UTC Build successful.
2025-02-26 14:51:21 UTC Tests are running...
🔴 2025-02-26 17:07:16 UTC Some tests failed, follow the links below.

Test history | Test log

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
|---|---|---|---|---|---|
| 10586 | 10476 | 0 | 36 | 26 | 48 |

🟢 2025-02-26 17:08:33 UTC ydbd size 5.6 GiB changed* by +512 Bytes, which is < 100.0 KiB vs stable-24-4: OK

| ydbd size dash | stable-24-4: bd3c9d7 | merge: b9dbf12 | diff | diff % |
|---|---|---|---|---|
| ydbd size | 6 026 457 120 Bytes | 6 026 457 632 Bytes | +512 Bytes | +0.000% |
| ydbd stripped size | 1 508 253 056 Bytes | 1 508 253 120 Bytes | +64 Bytes | +0.000% |

*Please be aware that the difference is based on comparing your commit with the last completed post-commit build; check the comparison.

snaury marked this pull request as ready for review February 27, 2025 08:00
snaury requested a review from a team as a code owner February 27, 2025 08:00
snaury merged commit f1235b7 into ydb-platform:stable-24-4 Feb 27, 2025
6 of 10 checks passed
snaury deleted the bugfix-KIKIMR-18605-follower-snapshot-24-4 branch February 27, 2025 08:17