✨29779 source postgres slow ctid read seen on customer connection #30125
# Conflicts:
#	airbyte-integrations/connectors/source-postgres/src/main/java/io/airbyte/integrations/source/postgres/PostgresQueryUtils.java
#	airbyte-integrations/connectors/source-postgres/src/main/java/io/airbyte/integrations/source/postgres/PostgresSource.java
#	airbyte-integrations/connectors/source-postgres/src/main/java/io/airbyte/integrations/source/postgres/cdc/PostgresCdcCtidInitializer.java
#	airbyte-integrations/connectors/source-postgres/src/main/java/io/airbyte/integrations/source/postgres/ctid/CtidGlobalStateManager.java
#	airbyte-integrations/connectors/source-postgres/src/main/java/io/airbyte/integrations/source/postgres/ctid/CtidPerStreamStateManager.java
#	airbyte-integrations/connectors/source-postgres/src/main/java/io/airbyte/integrations/source/postgres/ctid/CtidStateManager.java
#	airbyte-integrations/connectors/source-postgres/src/main/java/io/airbyte/integrations/source/postgres/ctid/PostgresCtidHandler.java
#	airbyte-integrations/connectors/source-postgres/src/main/java/io/airbyte/integrations/source/postgres/cursor_based/CursorBasedCtidUtils.java
#	airbyte-integrations/connectors/source-postgres/src/main/java/io/airbyte/integrations/source/postgres/xmin/XminCtidUtils.java
#	airbyte-integrations/connectors/source-postgres/src/test/java/io/airbyte/integrations/source/postgres/ctid/PostgresCtidHandlerTest.java
prateekmukhedkar left a comment
@rodireich I just realized this PR doesn't include the changes needed to release a new version of the source-postgres connector. Were you planning to do that in a different PR?
@prateekmukhedkar did you mean a bumped version?
…on-customer-connection
Tests like PostgresSourceAcceptanceLegacyCtidTest run the same test flows against this new flow.

| Step | Result |
|---|---|
| Build connector tar | ✅ |
| Build source-postgres docker image for platform linux/x86_64 | ✅ |
| Java Connector Unit Tests | ✅ |
| Java Connector Integration Tests | ✅ |
| Acceptance tests | ✅ |
| Validate airbyte-integrations/connectors/source-postgres/metadata.yaml | ✅ |
| Connector version semver check | ✅ |
| Connector version increment check | ❌ |
| QA checks | ✅ |
☁️ View runs for commit in Dagger Cloud
Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command
airbyte-ci connectors --name=source-postgres test
| Step | Result |
|---|---|
| Build connector tar | ✅ |
| Build source-postgres docker image for platform linux/x86_64 | ✅ |
| Java Connector Unit Tests | ✅ |
| Java Connector Integration Tests | ✅ |
| Acceptance tests | ✅ |
| Validate airbyte-integrations/connectors/source-postgres/metadata.yaml | ✅ |
| Connector version semver check | ✅ |
| Connector version increment check | ✅ |
| QA checks | ✅ |
| Step | Result |
|---|---|
| Build connector tar | ✅ |
| Build source-postgres-strict-encrypt docker image for platform linux/x86_64 | ✅ |
| Java Connector Unit Tests | ✅ |
| Java Connector Integration Tests | ✅ |
| Acceptance tests | ✅ |
| Validate airbyte-integrations/connectors/source-postgres-strict-encrypt/metadata.yaml | ✅ |
| Connector version semver check | ✅ |
| Connector version increment check | ✅ |
| QA checks | ✅ |
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command
airbyte-ci connectors --name=source-postgres-strict-encrypt test
| Step | Result |
|---|---|
| Build connector tar | ✅ |
| Build source-alloydb docker image for platform linux/x86_64 | ✅ |
| Java Connector Unit Tests | ✅ |
| Java Connector Integration Tests | ✅ |
| Acceptance tests | ❌ |
| Validate airbyte-integrations/connectors/source-alloydb/metadata.yaml | ✅ |
| Connector version semver check | ✅ |
| Connector version increment check | ✅ |
| QA checks | ✅ |
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command
airbyte-ci connectors --name=source-alloydb test
| Step | Result |
|---|---|
| Build connector tar | ✅ |
| Build source-postgres docker image for platform linux/x86_64 | ✅ |
| Java Connector Unit Tests | ✅ |
| Java Connector Integration Tests | ✅ |
| Acceptance tests | ✅ |
| Validate airbyte-integrations/connectors/source-postgres/metadata.yaml | ✅ |
| Connector version semver check | ✅ |
| Connector version increment check | ✅ |
| QA checks | ✅ |
| Step | Result |
|---|---|
| Build connector tar | ✅ |
| Build source-alloydb docker image for platform linux/x86_64 | ✅ |
| Java Connector Unit Tests | ✅ |
| Java Connector Integration Tests | ✅ |
| Acceptance tests | ❌ |
| Validate airbyte-integrations/connectors/source-alloydb/metadata.yaml | ✅ |
| Connector version semver check | ✅ |
| Connector version increment check | ✅ |
| QA checks | ✅ |
/approve-and-merge reason="AlloyDB acceptance test failing in CI"
| Step | Result |
|---|---|
| Build connector tar | ✅ |
| Build source-postgres-strict-encrypt docker image for platform linux/x86_64 | ✅ |
| Java Connector Unit Tests | ✅ |
| Java Connector Integration Tests | ✅ |
| Acceptance tests | ✅ |
| Validate airbyte-integrations/connectors/source-postgres-strict-encrypt/metadata.yaml | ✅ |
| Connector version semver check | ✅ |
| Connector version increment check | ✅ |
| QA checks | ✅ |
…0125)
Co-authored-by: subodh <subodh1810@gmail.com>
Co-authored-by: subodh1810 <subodh1810@users.noreply.github.com>
Co-authored-by: rodireich <rodireich@users.noreply.github.com>

What
Postgres versions before 14 cannot run TID range scans, which is the plan our queries rely on when we filter with WHERE ctid > '(x,y)'. As a result, every time we query for a chunk of data, Postgres has to run a full scan of the entire table, which makes reads considerably slower.
To avoid this, we need to query Postgres 12 and 13 in a way that results in a plain TID scan (no "range").
See full discussion in this [doc]
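A minimal sketch of the version gate described above, not the connector's actual code. The class, enum, and method names are hypothetical. It assumes only that TID range scans are available from Postgres 14 onward and that the server reports its version in the server_version_num format (for example 130004 for 13.4), where dividing by 10000 yields the major version.

```java
final class CtidQueryStrategy {

    // Hypothetical names: which kind of ctid predicate to generate.
    enum Strategy { TID_RANGE_SCAN, TID_LIST_SCAN }

    private CtidQueryStrategy() {}

    /**
     * Picks a strategy from a server_version_num style integer,
     * e.g. 120015 for 12.15 or 140000 for 14.0.
     */
    static Strategy choose(int serverVersionNum) {
        int major = serverVersionNum / 10000;
        // Before 14, "WHERE ctid > '(x,y)'" falls back to a full table scan,
        // so an explicit list of ctids is used instead.
        return major >= 14 ? Strategy.TID_RANGE_SCAN : Strategy.TID_LIST_SCAN;
    }
}
```

With this sketch, versions 12 and 13 map to TID_LIST_SCAN while 14 and later keep the range predicate.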
How
Instead of querying for a range of ctids (WHERE ctid > '(x,y)'), we are going to query for an explicit list of ctids.
Because there is no ctid range, we need to know the last possible tuple in a page, so that the list can cover every tuple the page may hold.
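The idea above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the class and method names are hypothetical, the table name is expected to be already quoted by the caller, and the per-page tuple bound is passed in rather than discovered. For reference, a stock Postgres build with 8 kB blocks can hold at most 291 heap tuples per page, and tuple offsets in a ctid start at 1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringJoiner;

final class CtidListQuery {

    private CtidListQuery() {}

    /** Returns the quoted tid literals '(page,tuple)' for every slot in [firstPage, lastPage]. */
    static List<String> ctidLiterals(long firstPage, long lastPage, int maxTuplesPerPage) {
        if (firstPage < 0 || lastPage < firstPage || maxTuplesPerPage < 1) {
            throw new IllegalArgumentException("invalid page range or tuple bound");
        }
        List<String> literals = new ArrayList<>();
        for (long page = firstPage; page <= lastPage; page++) {
            // Tuple offsets within a page are 1-based.
            for (int tuple = 1; tuple <= maxTuplesPerPage; tuple++) {
                literals.add("'(" + page + "," + tuple + ")'");
            }
        }
        return literals;
    }

    /**
     * Builds a query that matches an explicit list of ctids instead of a range.
     * quotedTable is assumed to be a trusted, already-quoted identifier.
     */
    static String buildQuery(String quotedTable, long firstPage, long lastPage, int maxTuplesPerPage) {
        StringJoiner array = new StringJoiner(",", "ARRAY[", "]::tid[]");
        ctidLiterals(firstPage, lastPage, maxTuplesPerPage).forEach(array::add);
        return "SELECT ctid::text, * FROM " + quotedTable + " WHERE ctid = ANY (" + array + ")";
    }
}
```

Slots that hold no live tuple simply match nothing, which is why enumerating up to the last possible tuple per page is safe. A production version would more likely bind the page and tuple bounds as parameters and keep each chunk small enough to avoid very large array literals.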
Recommended reading order
1. InitialSyncCtidIterator.java
2. PostgresQueryUtils.java
3. PostgresCtidHandler.java

🚨 User Impact 🚨
Users should see faster initial syncs of tables in all three sync modes (CDC, cursor, xmin).
Syncs that previously started with the old algorithm will seamlessly run faster on their next attempt.
No breaking changes.