Detect duplicate rows on each side #850

nolar · 2024-01-11T11:21:07Z

A known flaw: if both sides contain identical duplicate rows, e.g.:

A: [pk=1000, val=hello], [pk=1000, val=hello]
B: [pk=1000, val=hello], [pk=1000, val=hello]

… then we might not notice them, even during checksum scanning of table segments. If the segments are fully equal, these duplicates are never yielded: neither with -/+ nor with a separate informational marker * introduced specifically for duplicates. They are only noticed in segments that contain some other (unrelated) differences, which makes this duplicate detection not fully reliable.
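
To illustrate the flaw, here is a minimal sketch in plain Python. It is not this project's actual implementation: `segment_checksum` and `find_duplicate_keys` are hypothetical helpers, and the checksum scheme (a sum of per-row hashes) is an assumption made only for the example. It shows that two segments with the same duplicated row produce equal checksums, so a checksum-only comparison would skip the segment, while a per-side key count would still find the duplicates.

```python
import hashlib
from collections import Counter


def segment_checksum(rows):
    """Hypothetical order-independent checksum: sum of per-row hashes mod 2**64."""
    total = 0
    for pk, val in rows:
        digest = hashlib.md5(f"{pk}|{val}".encode()).hexdigest()
        total += int(digest[:16], 16)
    return total % (2**64)


def find_duplicate_keys(rows):
    """Return the primary keys that occur more than once on one side."""
    counts = Counter(pk for pk, _ in rows)
    return sorted(pk for pk, n in counts.items() if n > 1)


# The example from the description: the same duplicated row on both sides.
side_a = [(1000, "hello"), (1000, "hello")]
side_b = [(1000, "hello"), (1000, "hello")]

# Equal checksums mean the segment is treated as identical and never inspected.
checksums_match = segment_checksum(side_a) == segment_checksum(side_b)

# A separate per-side key count still reveals the duplicates.
dupes_a = find_duplicate_keys(side_a)
dupes_b = find_duplicate_keys(side_b)
```

In this sketch the duplicates only surface because each side is scanned row by row, which is exactly the work that a matching checksum lets the comparison skip.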

nolar requested a review from dlawin January 11, 2024 11:21

Detect duplicate rows on each side

8944e5f

nolar force-pushed the detect-duplicates branch from 41d71b0 to 8944e5f Compare January 11, 2024 16:45

nolar requested a review from vvkh January 11, 2024 16:45

dlawin approved these changes Jan 11, 2024

nolar merged commit f8dd74c into master Jan 11, 2024

nolar deleted the detect-duplicates branch January 11, 2024 18:13
