Skip to content

Conversation

maaarcelino
Copy link
Contributor

@maaarcelino maaarcelino commented May 15, 2025

The current logic for batch writing prevents us from cleanly testing the DeleteRecord handling in the plugin. Currently we have to set BatchSize: 1 and send two DeleteRecord messages in the test, which is not the cleanest logic. This PR changes the logic so the batch is flushed when the batch size is reached, not exceeded (so for BatchSize: 1 it will flush after one DeleteRecord is received).

This also matches the logic we have in https://github.com/cloudquery/plugin-sdk/blob/main/internal/batch/cap.go#L7.

@maaarcelino maaarcelino requested review from a team and savme May 15, 2025 13:25
@github-actions github-actions bot added feat and removed feat labels May 15, 2025
Copy link
Contributor

@murarustefaan murarustefaan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't the described behavior be fixed by running tests in destinations with lower batch sizes / timeouts, rather than configuring the same delete message to be sent more times than it's needed?

@maaarcelino maaarcelino force-pushed the feat/delete-test-batch branch from 8daf1f9 to b95f7ee Compare May 16, 2025 09:05
@maaarcelino
Copy link
Contributor Author

@murarustefaan Yeah I guess that's a fair point. I updated the PR to change the logic for batch writers to use >= instead of > for comparisons against the batch size. This is to enable running the WriterTestSuiteTests test suite which sends a single DeleteRecord message to the plugin. We need to set the batch size to 1 in the test and to test that we have to allow for using >= in the logic. Does that make sense?

Copy link
Contributor

@murarustefaan murarustefaan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I don't think this will impact anything, it will flush at 1 instead of 2 for batch_size: 1and at 5000 instead of 5001 at batch_size: 5000

@murarustefaan murarustefaan requested a review from erezrokah May 16, 2025 10:26
@murarustefaan
Copy link
Contributor

Could you change the title and description of this PR, @maaarcelino ?

@maaarcelino maaarcelino changed the title feat: Allow for DeleteRecord tests to work with batch writers fix: Change logic for batch writing to write when batch size is reached, not exceeded May 16, 2025
@github-actions github-actions bot added fix and removed feat fix labels May 16, 2025
Copy link
Member

@erezrokah erezrokah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with the change, seems like a left over from when we refactored the code to support multi row batches.

I would first try to update a few destinations (e.g. Postgres, S3 and MySQL) to this branch sha to see if this triggers any test failures, as existing tests might rely on this behavior

l := int64(len(w.deleteRecordMessages))
w.deleteRecordLock.Unlock()
if w.batchSize > 0 && l > w.batchSize {
if w.batchSize > 0 && l >= w.batchSize {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if w.batchSize > 0 && l >= w.batchSize {
limit := batch.CappedAt(0, w.batchSize)
limit.AddRows(l)
if limit.ReachedLimit() {

I think this is more aligned with the rest of the codebase e.g.

if len(toFlush) > 0 || rest != nil || s.limit.ReachedLimit() {

same comment for other places where we check the limit

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 05b4212. Tested and it works correctly

@kodiakhq kodiakhq bot merged commit 58c8a1e into main May 19, 2025
11 checks passed
@kodiakhq kodiakhq bot deleted the feat/delete-test-batch branch May 19, 2025 10:39
kodiakhq bot pushed a commit that referenced this pull request May 19, 2025
🤖 I have created a release *beep* *boop* --- ## [4.80.2](v4.80.1...v4.80.2) (2025-05-19) ### Bug Fixes * Change logic for batch writing to write when batch size is reached, not exceeded ([#2153](#2153)) ([58c8a1e](58c8a1e)) * Flush DeleteRecord messages when batch writer is flushed ([#2154](#2154)) ([791c865](791c865)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

3 participants