fix: Address assertion error in TestReadRows_Retry_LastScannedRow conformance test #1521
Add this suggestion to a batch that can be applied as a single commit. This suggestion is invalid because no changes were made to the code. Suggestions cannot be applied while the pull request is closed. Suggestions cannot be applied while viewing a subset of changes. Only one suggestion per line can be applied in a batch. Add this suggestion to a batch that can be applied as a single commit. Applying suggestions on deleted lines is not supported. You must change the existing code in this line in order to create a valid suggestion. Outdated suggestions cannot be applied. This suggestion has been applied or marked resolved. Suggestions cannot be applied from pending reviews. Suggestions cannot be applied on multi-line comments. Suggestions cannot be applied while the pull request is queued to merge. Suggestion cannot be applied right now. Please check back later.
Summary:
The
TestReadRows_Retry_LastScannedRowconformance test is failing because the client retries requesting more rows than it should. The code changes in this PR introduce more technical debt, but provide a solution so that the test passes.Background:
In the conformance test, the mock server sends back three data events. The first and the third data events contain rows with ids equaling 'abar' and 'zbar' respectively while the second event contains no rows. The second event however has a lastScannedRow property set to 'qfoo'. The retry is made for rows after 'abar', but the test is expecting a retry for rows after 'qfoo' so the test fails.
Retries are now made for data after the
lastRowKeywhich is updated in the user stream ever since the fix to the data loss issue. The second data event containing thelastScannedRowproperty equal toqfooreaches the chunk transformer, but doesn't do anything to update thelastRowKeyvariable mentioned in the user stream. This PR is a fix to ensure thelastRowKeyis updated andqfoois not re-requested and it does this by emitting a new event from the chunk transformer.All of the source code changes in this PR are for a streaming pipeline that services ReadRows calls. Here is the order of the streams in the pipeline for context:
server -> request stream -> chunk transformer -> toRowStream -> rowStream -> user stream
Here is why we emit an event in the chunk transformer to change the lastRowKey:
When the chunk transformer receives a data event with chunks of data then we have to ensure those chunks will make it to the user before the lastRowKey is changed based on the
lastScannedRowproperty. For example, say the data event has data chunks where chunk 1 is [row1, row2, row3] and chunk 2 is [row4, row5, row6]. Suppose the data event also has lastScannedRowKey equal to row 8 and then receives a retryable error when only row 1 and row 2 have reached the user stream. In this case, all rows except row 1 and row 2 will be thrown away and should be re-requested in the next retry. In this example, it is important thatlastRowKeyis not set to row 8 yet. To ensure rows thrown away by the client always get re-requested, the chunk transformer emits an event to change thelastRowKeydownstream after all rows in chunks are delivered to the user stream.Changes:
src/chunktransformer.ts:When the chunk transformer receives a lastScannedRow then it pushes a
LAST_ROW_KEY_UPDATEevent to update thelastRowKeydownstream in the user stream. It does not attempt to update thelastRowKeyin the chunk transformer because it is important that all data ahead of the lastScannedRow data makes it to the user stream first. Therefore, this event gets queued behind all data that has already been sent eventually to be consumed by the user stream.src/mutation.ts:The linter complained about this file so a single cast is added
src/tabular-api-surface.ts:lastScannedRowdata and then emitted an event with the intent of updating thelastRowKeyin the user stream.testproxy/known_failures.txt:The changes address the
TestReadRows_Retry_LastScannedRowtest so this test is removed from the list.Next Steps: