Skip to content

Conversation

@davydov-d
Copy link
Contributor

What

https://github.com/airbytehq/oncall/issues/1313

How

Filter out each 10th log record

@octavia-squidington-iv octavia-squidington-iv added area/connectors Connector related issues area/documentation Improvements or additions to documentation connectors/source/google-ads labels Jan 18, 2023
@davydov-d davydov-d requested a review from brianjlai January 18, 2023 12:10
@davydov-d
Copy link
Contributor Author

davydov-d commented Jan 18, 2023

/test connector=connectors/source-google-ads

🕑 connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/3948722563
❌ connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/3948722563
🐛 https://gradle.com/s/kiz4dbndylt5a

Build Failed

Test summary info:

=========================== short test summary info ============================ FAILED test_core.py::TestBasicRead::test_read[inputs0] - Failed: Stream accou... SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_core.py:94: The previous and actual specifications are identical. SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_core.py:377: The previous and actual discovered catalogs are identical. ============= 1 failed, 28 passed, 2 skipped in 1608.20s (0:26:48) ============= 
@davydov-d
Copy link
Contributor Author

davydov-d commented Jan 18, 2023

/test connector=connectors/source-google-ads

🕑 connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/3949865210
✅ connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/3949865210
Python tests coverage:

Name Stmts Miss Cover -------------------------------------------------------------- source_google_ads/models.py 18 0 100% source_google_ads/__init__.py 2 0 100% source_google_ads/streams.py 196 6 97% source_google_ads/source.py 86 5 94% source_google_ads/custom_query_stream.py 75 5 93% source_google_ads/google_ads.py 73 12 84% -------------------------------------------------------------- TOTAL 450 28 94% Name Stmts Miss Cover Missing ---------------------------------------------------------------------------------- source_acceptance_test/base.py 12 4 67% 16-19 source_acceptance_test/config.py 141 5 96% 87, 93, 239, 243-244 source_acceptance_test/conftest.py 211 95 55% 36, 42-44, 49, 54, 77, 83, 89-91, 110, 115-117, 123-125, 131-132, 137-138, 143, 149, 158-167, 173-178, 193, 217, 248, 254, 262-267, 275-285, 293-306, 311-317, 324-335, 342-358 source_acceptance_test/plugin.py 69 25 64% 22-23, 31, 36, 120-140, 144-148 source_acceptance_test/tests/test_core.py 402 115 71% 53, 58, 93-104, 109-116, 120-121, 125-126, 308, 346-363, 376-387, 391-396, 402, 435-440, 478-485, 528-530, 533, 598-606, 618-621, 626, 682-683, 689, 692, 728-738, 751-776 source_acceptance_test/tests/test_incremental.py 160 14 91% 58-65, 70-83, 246 source_acceptance_test/utils/asserts.py 39 2 95% 62-63 source_acceptance_test/utils/common.py 94 10 89% 16-17, 32-38, 72, 75 source_acceptance_test/utils/compare.py 62 23 63% 21-51, 68, 97-99 source_acceptance_test/utils/connector_runner.py 133 33 75% 24-27, 46-47, 50-54, 57-58, 73-75, 78-80, 83-85, 88-90, 93-95, 124-125, 159-161, 208 source_acceptance_test/utils/json_schema_helper.py 107 13 88% 30-31, 38, 41, 65-68, 96, 120, 192-194 ---------------------------------------------------------------------------------- TOTAL 1609 339 79% 

Build Passed

Test summary info:

=========================== short test summary info ============================ SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_core.py:94: The previous and actual specifications are identical. SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_core.py:377: The previous and actual discovered catalogs are identical. ================== 29 passed, 2 skipped in 1419.82s (0:23:39) ================== 
Copy link
Contributor

@brianjlai brianjlai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few questions about some specifics of how we use the sieve, but overall the approach looks good.


def stream_slices(self, stream_state: Mapping[str, Any] = None, **kwargs) -> Iterable[Optional[MutableMapping[str, any]]]:
for customer in self.customers:
logger = cyclic_sieve(self.logger, 10)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid initializing a new cyclic_sieve since it should have already been defined on the stream as self.incremental_sieve_logger?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically we can, but this was done on purpose. The idea was to filter out not just 9 out of 10 messages, but 9 out of 10 message cycles to have more consistent logs so an engineer shouldn't have to stick together messages from different routine calls / cycles / exceptions / if statements and so on when debugging. Since stream_slices() is called once per stream before reading records, and read_records() is called once per stream slice, these are two different log cycles.
If you still insist, I can reuse self.incremental_sieve_logger here, technically it is not a problem

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah got it, that makes sense and I agree that for a better experience we wouldn't want to share the same log cycles. Can you add a comment for that here just mentioning that's why a logger sieve is initialized here.

assert records == []


def test_cyclic_sieve(caplog):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for adding tests for the new logger?

@davydov-d
Copy link
Contributor Author

davydov-d commented Jan 19, 2023

/test connector=connectors/source-google-ads

🕑 connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/3956741759

@davydov-d davydov-d requested a review from brianjlai January 19, 2023 08:42
@davydov-d
Copy link
Contributor Author

davydov-d commented Jan 19, 2023

/test connector=connectors/source-google-ads

🕑 connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/3957323653
✅ connectors/source-google-ads https://github.com/airbytehq/airbyte/actions/runs/3957323653
Python tests coverage:

Name Stmts Miss Cover -------------------------------------------------------------- source_google_ads/models.py 18 0 100% source_google_ads/__init__.py 2 0 100% source_google_ads/streams.py 195 6 97% source_google_ads/source.py 86 5 94% source_google_ads/custom_query_stream.py 75 5 93% source_google_ads/google_ads.py 73 12 84% -------------------------------------------------------------- TOTAL 449 28 94% Name Stmts Miss Cover Missing ---------------------------------------------------------------------------------- source_acceptance_test/base.py 12 4 67% 16-19 source_acceptance_test/config.py 141 5 96% 87, 93, 239, 243-244 source_acceptance_test/conftest.py 211 95 55% 36, 42-44, 49, 54, 77, 83, 89-91, 110, 115-117, 123-125, 131-132, 137-138, 143, 149, 158-167, 173-178, 193, 217, 248, 254, 262-267, 275-285, 293-306, 311-317, 324-335, 342-358 source_acceptance_test/plugin.py 69 25 64% 22-23, 31, 36, 120-140, 144-148 source_acceptance_test/tests/test_core.py 402 115 71% 53, 58, 93-104, 109-116, 120-121, 125-126, 308, 346-363, 376-387, 391-396, 402, 435-440, 478-485, 528-530, 533, 598-606, 618-621, 626, 682-683, 689, 692, 728-738, 751-776 source_acceptance_test/tests/test_incremental.py 160 14 91% 58-65, 70-83, 246 source_acceptance_test/utils/asserts.py 39 2 95% 62-63 source_acceptance_test/utils/common.py 94 10 89% 16-17, 32-38, 72, 75 source_acceptance_test/utils/compare.py 62 23 63% 21-51, 68, 97-99 source_acceptance_test/utils/connector_runner.py 133 33 75% 24-27, 46-47, 50-54, 57-58, 73-75, 78-80, 83-85, 88-90, 93-95, 124-125, 159-161, 208 source_acceptance_test/utils/json_schema_helper.py 107 13 88% 30-31, 38, 41, 65-68, 96, 120, 192-194 ---------------------------------------------------------------------------------- TOTAL 1609 339 79% 

Build Passed

Test summary info:

=========================== short test summary info ============================ SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_core.py:94: The previous and actual specifications are identical. SKIPPED [1] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_core.py:377: The previous and actual discovered catalogs are identical. ================== 29 passed, 2 skipped in 1699.95s (0:28:19) ================== 
Copy link
Contributor

@brianjlai brianjlai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall no blocking comments, but had one discussion point about whether we want a new sieve per-read_records invocation if that made interpreting the logs more intuitive


def stream_slices(self, stream_state: Mapping[str, Any] = None, **kwargs) -> Iterable[Optional[MutableMapping[str, any]]]:
for customer in self.customers:
logger = cyclic_sieve(self.logger, 10)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah got it, that makes sense and I agree that for a better experience we wouldn't want to share the same log cycles. Can you add a comment for that here just mentioning that's why a logger sieve is initialized here.

"""
while True:
self.logger.info("Starting a while loop iteration")
self.incremental_sieve_logger.bump()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going off the prior comment about separate slice and read_records log cycles, do you think it's also worth initializing a new sieve every time read_records is invoked? If the sieve is shared on the stream, we will emit exactly 1/10 of logs, but it could be a little confusing why certain cycles are skipped on different read_records invocations if a prior one has a partially bumped count.

I'm open to either option and this is non-blocking on approval, but I think worth highlighting when the sieve is shared at this point? If we do decide to have a separate sieve per read_records invocation, we could probably remove it from __init__() and do something similar to slices where it is created within this method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a tricky point cause normally a while loop here contains a single iteration. It only goes further if there were errors during the read. So I do not think there's much sense in initializing new sieve each time read_records is invoked. On the other hand I agree that logs may become inconsistent if there are two or more while loop iterations, therefore I moved incremental_sieve_logger.bump() outside the loop to remain consistent

@davydov-d
Copy link
Contributor Author

davydov-d commented Jan 20, 2023

/publish connector=connectors/source-google-ads

🕑 Publishing the following connectors:
connectors/source-google-ads
https://github.com/airbytehq/airbyte/actions/runs/3967044312


Connector Did it publish? Were definitions generated?
connectors/source-google-ads

if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️

@octavia-squidington-iii octavia-squidington-iii temporarily deployed to more-secrets January 20, 2023 11:47 — with GitHub Actions Inactive
@octavia-squidington-iii octavia-squidington-iii temporarily deployed to more-secrets January 20, 2023 11:47 — with GitHub Actions Inactive
@github-actions
Copy link
Contributor

github-actions bot commented Jan 20, 2023

Airbyte Code Coverage

There is no coverage information present for the Files changed

Total Project Coverage 26.72% 🍏
@davydov-d davydov-d temporarily deployed to more-secrets January 20, 2023 12:05 — with GitHub Actions Inactive
@davydov-d davydov-d temporarily deployed to more-secrets January 20, 2023 12:05 — with GitHub Actions Inactive
@davydov-d davydov-d temporarily deployed to more-secrets January 20, 2023 12:38 — with GitHub Actions Inactive
@davydov-d davydov-d temporarily deployed to more-secrets January 20, 2023 12:38 — with GitHub Actions Inactive
@davydov-d davydov-d temporarily deployed to more-secrets January 20, 2023 12:49 — with GitHub Actions Inactive
@davydov-d davydov-d temporarily deployed to more-secrets January 20, 2023 12:49 — with GitHub Actions Inactive
@davydov-d davydov-d temporarily deployed to more-secrets January 20, 2023 13:51 — with GitHub Actions Inactive
@davydov-d davydov-d temporarily deployed to more-secrets January 20, 2023 13:51 — with GitHub Actions Inactive
@davydov-d davydov-d temporarily deployed to more-secrets January 20, 2023 15:17 — with GitHub Actions Inactive
@davydov-d davydov-d temporarily deployed to more-secrets January 20, 2023 15:17 — with GitHub Actions Inactive
@davydov-d davydov-d temporarily deployed to more-secrets January 20, 2023 17:58 — with GitHub Actions Inactive
@davydov-d davydov-d temporarily deployed to more-secrets January 20, 2023 17:58 — with GitHub Actions Inactive
@davydov-d davydov-d temporarily deployed to more-secrets January 23, 2023 07:40 — with GitHub Actions Inactive
@davydov-d davydov-d temporarily deployed to more-secrets January 23, 2023 07:40 — with GitHub Actions Inactive
@davydov-d davydov-d merged commit e35dc23 into master Jan 23, 2023
@davydov-d davydov-d deleted the ddavydov/#1313-source-google-ads-write-fewer-logs branch January 23, 2023 08:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/connectors Connector related issues area/documentation Improvements or additions to documentation connectors/source/google-ads

5 participants