@snaury snaury commented Oct 29, 2024

Changelog entry

Fixed excessive read latency during and after some shard splits.

Changelog category

  • Bugfix

Additional information

It was observed that reads sometimes took seconds during frequent shard splits. It turned out that shards replied with an OVERLOADED status even after the split had already finished. This caused KQP to retry reads repeatedly with exponential backoff, until a guard condition eventually (after multiple seconds) made the read actor re-resolve the table. Replying with the correct NOT_FOUND status (which indicates the table no longer exists) fixes this problem.
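To illustrate why the status code matters, here is a minimal sketch of the retry behaviour described above. It is a toy model, not KQP's actual code: `simulate_read`, `base_delay`, and `max_retries` are hypothetical names, and the numbers are chosen only for illustration. The model assumes the old shard has already been split and answers every read with the same status.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "SUCCESS"
    OVERLOADED = "OVERLOADED"
    NOT_FOUND = "NOT_FOUND"


def simulate_read(stale_shard_status, base_delay=0.01, max_retries=8):
    """Return (simulated seconds spent backing off, reads sent to the stale shard).

    Hypothetical model: the reader stops talking to the stale shard either
    when it sees NOT_FOUND (re-resolve right away) or when the retry guard
    trips after `max_retries` OVERLOADED replies.
    """
    waited = 0.0
    attempts = 0
    delay = base_delay
    while True:
        attempts += 1
        status = stale_shard_status  # the stale shard always answers the same
        if status is Status.OVERLOADED:
            if attempts > max_retries:
                break  # guard condition: give up and re-resolve the table
            waited += delay  # back off before retrying the same shard
            delay *= 2       # exponential backoff
            continue
        # NOT_FOUND: re-resolve immediately; SUCCESS: nothing to retry
        break
    return waited, attempts


if __name__ == "__main__":
    print("OVERLOADED:", simulate_read(Status.OVERLOADED))
    print("NOT_FOUND: ", simulate_read(Status.NOT_FOUND))
```

With these illustrative constants, the OVERLOADED path accumulates about 2.55 simulated seconds of backoff over nine reads before the guard trips, while the NOT_FOUND path re-resolves after a single read with no backoff at all.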

Fixes #11036.

@snaury snaury requested a review from azevaykin October 29, 2024 13:25
@github-actions github-actions bot added bugfix and removed bugfix labels Oct 29, 2024
@snaury snaury marked this pull request as ready for review October 29, 2024 13:26
@github-actions github-actions bot added bugfix and removed bugfix labels Oct 29, 2024
@snaury snaury self-assigned this Oct 29, 2024
github-actions bot commented Oct 29, 2024

2024-10-29 13:27:57 UTC Pre-commit check linux-x86_64-relwithdebinfo for dfa7e01 has started.
2024-10-29 13:28:33 UTC Artifacts will be uploaded here
2024-10-29 13:32:01 UTC ya make is running...
🟡 2024-10-29 14:37:41 UTC Some tests failed, follow the links below. Going to retry failed tests...

Test history | Ya make output | Test bloat

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
|-------|--------|--------|--------|---------|--------|
| 15285 | 13784 | 0 | 2 | 1396 | 103 |

2024-10-29 14:38:58 UTC ya make is running... (failed tests rerun, try 2)
🟢 2024-10-29 14:50:31 UTC Tests successful.

Test history | Ya make output | Test bloat | Test bloat

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
|-------|--------|--------|--------|---------|--------|
| 106 (only retried tests) | 13 | 0 | 0 | 0 | 93 |

🟢 2024-10-29 14:50:38 UTC Build successful.
🟡 2024-10-29 14:50:57 UTC ydbd size 2.8 GiB changed* by +1.2 MiB, which is >= 100.0 KiB vs main: Warning

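| ydbd size dash | main: 103d800 | merge: dfa7e01 | diff | diff % |
|----------------|---------------|----------------|------|--------|
| ydbd size | 3 034 031 192 Bytes | 3 035 250 456 Bytes | +1.2 MiB | +0.040% |
| ydbd stripped size | 480 686 232 Bytes | 480 862 936 Bytes | +172.6 KiB | +0.037% |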

*Please be aware that the difference is based on comparing your commit with the last completed post-commit build; check the comparison.

github-actions bot commented Oct 29, 2024

2024-10-29 13:29:41 UTC Pre-commit check linux-x86_64-release-asan for dfa7e01 has started.
2024-10-29 13:29:52 UTC Artifacts will be uploaded here
2024-10-29 13:32:52 UTC ya make is running...
🟡 2024-10-29 15:02:54 UTC Some tests failed, follow the links below. This failure is not in the blocking policy yet.

Test history | Ya make output | Test bloat

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
|-------|--------|--------|--------|---------|--------|
| 9229 | 9158 | 0 | 24 | 13 | 34 |

🟢 2024-10-29 15:03:42 UTC Build successful.
🟢 2024-10-29 15:04:16 UTC ydbd size 5.7 GiB changed* by -2.4 KiB, which is <= 0 Bytes vs main: OK

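| ydbd size dash | main: 06b8cb8 | merge: dfa7e01 | diff | diff % |
|----------------|---------------|----------------|------|--------|
| ydbd size | 6 142 267 184 Bytes | 6 142 264 688 Bytes | -2.4 KiB | -0.000% |
| ydbd stripped size | 1 532 866 544 Bytes | 1 532 866 736 Bytes | +192 Bytes | +0.000% |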

*Please be aware that the difference is based on comparing your commit with the last completed post-commit build; check the comparison.

@snaury snaury merged commit eee456c into ydb-platform:main Oct 29, 2024
13 checks passed
@snaury snaury deleted the bugfix-11036-slow-read-split branch October 29, 2024 15:14
This was referenced Nov 1, 2024