Skip to content

Conversation

@pengpeng-lu
Copy link
Contributor

@pengpeng-lu pengpeng-lu commented Jun 13, 2025

Fixes issue: #3206

Instead of returning the innerContinuation byte array, serializes into a protobuf, with an isEnd flag.

There will be 2 steps:

  1. still serializes to the old format, being able to deserialize from both old and new format;
  2. serializes to the new format.

This fixes #3206

@pengpeng-lu pengpeng-lu added the bug fix Change that fixes a bug label Jun 17, 2025
@pengpeng-lu pengpeng-lu changed the title Keyvalue cursor Fix bug of scan aggregate index returning empty non-end continuation Jun 17, 2025
- query: select count(col1) from t2
- explain: "ISCAN(MV5 <,>) | MAP (_ AS _0) | AGG (count(_._0.COL1) AS _0) | ON EMPTY NULL | MAP (coalesce_long(_._0._0, promote(0l AS LONG)) AS _0)"
# Cannot run with FORCE_CONTINUATIONS due to: https://github.com/FoundationDB/fdb-record-layer/issues/3206
- maxRows: 0
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These 2 were also mis-categorized to issue 3206.

@pengpeng-lu pengpeng-lu marked this pull request as ready for review June 17, 2025 05:29
@pengpeng-lu pengpeng-lu requested a review from alecgrieser June 17, 2025 05:29
Copy link
Collaborator

@alecgrieser alecgrieser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay in getting to reviewing this!

Copy link
Collaborator

@alecgrieser alecgrieser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had a discussion offline, and I think we've converged on a strategy here.

As @pengpeng-lu points out, there are a few too many places where the key value cursor is created directly. We could theoretically limit the fix to just the two plan types, and have them manage wrapping and unwrapping the cursor, but then all users of the KeyValueCursor would have to worry about scans over single key ranges leading to an empty continuation on the first element.

For that reason, we're thinking that we have to go with an approach where the KeyValueCursor is the one that can accept either an old or new style continuation, and it just has to parse it to see if it fits. However, this is dangerous as some old continuations could be parsed as a new continuation. To avoid that, we want to insert a magic number into the new protobuf continuation, and then we will validate that the number exactly matches. Any continuation without the magic number must then be from an older instance even if it parses correctly.

If we take this approach, we can "just" put this into the KeyValueCursor, and then remove the changes to the plan serialization. Step one will put out a version that can read the new continuations. Then the next version, we make the new continuations by default. A future version can then remove the compatibility mode.

This is actually pretty close to the original role out plan, but we think that having the extra magic number check makes us less likely to misinterpret a continuation, which can lead to reading incorrect data.

alecgrieser added a commit to alecgrieser/fdb-record-layer that referenced this pull request Sep 15, 2025
This adds test cases to cover situations where we might have a byte array that could plausibly be either a `Tuple` or a `Protobuf` message. It enumerates through the `Tuple` types, and in each case, it either identifies why the `Tuple` code could not be a valid message, or it constructs a case that could. The upshot is that it's clear that parsing as Protobuf without an error is not enough to say that something was definitely a Protobuf, though the stars do kind of need to align for this to happen. Part of my hope here is that we can use this reference whenever this kind of situation comes up. For example, in our current thinking for FoundationDB#3397, we'll want to employ a strategy of guessing the origin of a byte string. If we want to test what happens if we start with a `Tuple` and then try to parse it as a Protobuf, this could be used to inform such a test case.
alecgrieser added a commit to alecgrieser/fdb-record-layer that referenced this pull request Sep 15, 2025
This adds test cases to cover situations where we might have a byte array that could plausibly be either a `Tuple` or a `Protobuf` message. It enumerates through the `Tuple` types, and in each case, it either identifies why the `Tuple` code could not be a valid message, or it constructs a case that could. The upshot is that it's clear that parsing as Protobuf without an error is not enough to say that something was definitely a Protobuf, though the stars do kind of need to align for this to happen. Part of my hope here is that we can use this reference whenever this kind of situation comes up. For example, in our current thinking for FoundationDB#3397, we'll want to employ a strategy of guessing the origin of a byte string. If we want to test what happens if we start with a `Tuple` and then try to parse it as a Protobuf, this could be used to inform such a test case.
alecgrieser added a commit to alecgrieser/fdb-record-layer that referenced this pull request Sep 15, 2025
This adds test cases to cover situations where we might have a byte array that could plausibly be either a `Tuple` or a `Protobuf` message. It enumerates through the `Tuple` types, and in each case, it either identifies why the `Tuple` code could not be a valid message, or it constructs a case that could. The upshot is that it's clear that parsing as Protobuf without an error is not enough to say that something was definitely a Protobuf, though the stars do kind of need to align for this to happen. Part of my hope here is that we can use this reference whenever this kind of situation comes up. For example, in our current thinking for FoundationDB#3397, we'll want to employ a strategy of guessing the origin of a byte string. If we want to test what happens if we start with a `Tuple` and then try to parse it as a Protobuf, this could be used to inform such a test case.
Copy link
Collaborator

@alecgrieser alecgrieser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, LGTM. I left one comment about how I still think it would be good to remove PrefixRemovingContinuation and just use KeyValueCursorBase.Continuation, but I'm also fine with how it is in the PR now if we want to hold off on that. The only remaining thing is that as far as I can tell, there's still one test case that was removed from disabled-planner-rewrites/aggregate-index-tests.yamsql, and it seems like we should add it back

@alecgrieser alecgrieser merged commit 53409f4 into FoundationDB:main Sep 23, 2025
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug fix Change that fixes a bug

2 participants