Skip to content

Conversation

@vladimir-remar
Copy link
Contributor

What

In our Facebook Marketing connection when we want to do a full refresh sync, this probably will take more than 1 day , the connector reach the rate limite often and it seems the default sleep time it is not enough.

I attached a image when the connector reaches the 100%
Screenshot from 2022-02-28 12-07-28

How

Set sleeps time to 1min between 90%-95% of usage and set to 5min when the usage is greater than or equal 95%

🚨 User Impact 🚨

Increase of time in connection runs, mostly in a full refresh sync mode, due to this new approach.

Pre-merge Checklist

Expand the relevant checklist and delete the others.

Updating a connector

Community member or Airbyter

  • Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • Changelog updated in docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • Credentials added to Github CI. Instructions.
  • /test connector=connectors/<name> command is passing
  • New Connector version released on Dockerhub by running the /publish command described here
  • After the new connector version is published, connector version bumped in the seed directory as described here
  • Seed specs have been re-generated by building the platform and committing the changes to the seed spec files, as described here

Tests

Unit

Put your unit tests output here.

Integration

Put your integration tests output here.

Acceptance

Put your acceptance tests output here.

@github-actions github-actions bot added the area/connectors Connector related issues label Feb 28, 2022
@marcosmarxm
Copy link
Contributor

@vladimir-remar it's possible to run a sync comparing the current method and the method proposed by you? Or, there is any docs in Facebook maybe explaining about the best approach to do rate limit waiting?

@vladimir-remar
Copy link
Contributor Author

vladimir-remar commented Mar 2, 2022

@vladimir-remar it's possible to run a sync comparing the current method and the method proposed by you? Or, there is any docs in Facebook maybe explaining about the best approach to do rate limit waiting?

Hello @marcosmarxm Thanks for the answer. We did a sync with (ads_insights, ads_insights_country and ads_insights_platform_and_device), I attached the log, as you will see, with my suggestion the sync will keeps around 90%-94% of usage pausing 1min between requests until usage is low. This increase the success chances otherwise it will fail.
About the Facebook documentation, there is not much to do, you guys already did most of the work with rate liminting.

logs-17365.txt

@alafanechere alafanechere self-assigned this Mar 4, 2022
Copy link
Contributor

@alafanechere alafanechere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @vladimir-remar, thank you for this contribution.
I made several suggestions 😄, let me know what you think!

Could you please also:

  • bump the connector version in the dockerfile and in airbyte-config/init/src/main/resources/seed/source_definitions.yaml
  • update the changelog in docs/integrations/sources/facebook-marketing.md
Comment on lines 33 to 36
call_rate_threshold = 95 # maximum percentage of call limit utilization
min_call_rate_threslold = 90 # minimum percentage of call limit utilization
pause_interval_minimum = pendulum.duration(minutes=1) # default pause interval if reached or close to call rate limit
pause_interval_minimum_at_max_usage = pendulum.duration(minutes=5)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
call_rate_threshold = 95 # maximum percentage of call limit utilization
min_call_rate_threslold = 90 # minimum percentage of call limit utilization
pause_interval_minimum = pendulum.duration(minutes=1) # default pause interval if reached or close to call rate limit
pause_interval_minimum_at_max_usage = pendulum.duration(minutes=5)
MAX_RATE, MAX_PAUSE_INTERVAL = (95, pendulum.duration(minutes=5))
MIN_RATE, MAX_PAUSE_INTERVAL = (90, pendulum.duration(minutes=1)
max_pause_interval = max(max_pause_interval, pause_interval)

if max_usage > self.call_rate_threshold:
if max_usage >= self.call_rate_threshold:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if max_usage >= self.call_rate_threshold:
max_rate, max_sleep_duration = self. min_max_rate_sleep_time["max"]
min_rate, min_sleep_duration = self. min_max_rate_sleep_time["min"]
if max_usage >= max_rate:

@staticmethod
def sleep_time_handler(usage, pause_interval: pendulum.duration):

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the definition of the custom pause interval according to usage should be determined in this function and not in handle_call_rate_limit.
I'd rename it to compute_pause_interval and make it return the total seconds to sleep. Then I'd call sleep with this value from handle_call_rate_limit. I think it would greatly improve the readability and maintainability of this logic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alafanechere Thanks, I did some changes, tell me what you think.

@github-actions github-actions bot added the area/documentation Improvements or additions to documentation label Mar 7, 2022
@vladimir-remar
Copy link
Contributor Author

Hello @alafanechere let me know if there is something I can do to improve this PR, this change would be a big step for us.

@alafanechere
Copy link
Contributor

/test connector=connectors/source-facebook-marketing

Copy link
Contributor

@alafanechere alafanechere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @vladimir-remar for the changes, I still have an important suggestion. Could you please also bump the version in the Dockerfile?

@alafanechere
Copy link
Contributor

@vladimir-remar could you please fix the conflicts and share the output of your acceptance tests:
./gradlew :airbyte-integrations:connectors:source-facebook-marketing:integrationTest

@vladimir-remar
Copy link
Contributor Author

@vladimir-remar could you please fix the conflicts and share the output of your acceptance tests: ./gradlew :airbyte-integrations:connectors:source-facebook-marketing:integrationTest

Hello @alafanechere the problem here is the quota from the credentials that I have is exhauted, but even with quota I can't inccour in a full refresh run. I'm open to suggestions.

Copy link
Contributor

@marcosmarxm marcosmarxm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small fix and it's ready to merge!

Copy link
Contributor

@alafanechere alafanechere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the changes you made @vladimir-remar. I'm going to run the acceptance tests and merge if they pass.

Comment on lines 97 to 100
usage = 0
pause_interval = self.MIN_PAUSE_INTERVAL

if "batch" in params:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
usage = 0
pause_interval = self.MIN_PAUSE_INTERVAL
if "batch" in params:
if "batch" in params:
usage = 0
pause_interval = self.MIN_PAUSE_INTERVAL
@alafanechere
Copy link
Contributor

alafanechere commented Mar 16, 2022

/test connector=connectors/source-facebook-marketing

🕑 connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/1993130973
✅ connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/1993130973
Python tests coverage:

Name Stmts Miss Cover ------------------------------------------------------------------------ source_acceptance_test/utils/__init__.py 6 0 100% source_acceptance_test/tests/__init__.py 4 0 100% source_acceptance_test/__init__.py 2 0 100% source_acceptance_test/tests/test_full_refresh.py 52 2 96% source_acceptance_test/utils/asserts.py 37 2 95% source_acceptance_test/config.py 74 6 92% source_acceptance_test/utils/json_schema_helper.py 105 13 88% source_acceptance_test/utils/common.py 70 17 76% source_acceptance_test/utils/compare.py 62 23 63% source_acceptance_test/tests/test_core.py 275 106 61% source_acceptance_test/base.py 10 4 60% source_acceptance_test/utils/connector_runner.py 110 48 56% source_acceptance_test/tests/test_incremental.py 69 38 45% ------------------------------------------------------------------------ TOTAL 876 259 70% Name Stmts Miss Cover ------------------------------------------------------------------------------- source_facebook_marketing/streams/__init__.py 2 0 100% source_facebook_marketing/spec.py 33 0 100% source_facebook_marketing/__init__.py 2 0 100% source_facebook_marketing/api.py 97 13 87% source_facebook_marketing/streams/base_streams.py 127 27 79% source_facebook_marketing/streams/common.py 41 13 68% source_facebook_marketing/streams/streams.py 97 32 67% source_facebook_marketing/source.py 35 13 63% source_facebook_marketing/streams/base_insight_streams.py 129 54 58% source_facebook_marketing/streams/async_job.py 204 128 37% source_facebook_marketing/streams/async_job_manager.py 74 56 24% ------------------------------------------------------------------------------- TOTAL 841 336 60% Name Stmts Miss Cover ------------------------------------------------------------------------------- source_facebook_marketing/streams/async_job.py 204 0 100% source_facebook_marketing/streams/__init__.py 2 0 100% source_facebook_marketing/spec.py 33 0 100% source_facebook_marketing/source.py 35 0 100% source_facebook_marketing/__init__.py 2 0 100% source_facebook_marketing/streams/common.py 41 1 98% source_facebook_marketing/streams/async_job_manager.py 74 3 96% source_facebook_marketing/streams/base_insight_streams.py 129 13 90% source_facebook_marketing/api.py 97 16 84% source_facebook_marketing/streams/streams.py 97 22 77% source_facebook_marketing/streams/base_streams.py 127 30 76% ------------------------------------------------------------------------------- TOTAL 841 85 90% 
@alafanechere
Copy link
Contributor

I'm requesting reviews from @sherifnada and @misteryeo because this connector is targeted for GA.

@marcosmarxm
Copy link
Contributor

just a comment, this change solves the problem. I saw some clients getting their sync finished because of 100% quota limit and not respecting the correct backoff time. The only drawback is: increase in time to finish the sync.

thanks for this contribution @vladimir-remar

Copy link
Contributor

@sherifnada sherifnada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vladimir-remar could you add unit tests for this functionality please?

Copy link
Contributor

@misteryeo misteryeo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@alafanechere alafanechere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @vladimir-remar I wrote some unit tests for the functions you implemented.
Feel free to review these tests and add more cases if needed.
I have a remaining question about the _compute_pause_interval function that I added in the review comment.

"""The sleep time will be calculated based on usage consumed."""
if usage >= self.MAX_RATE:
return max(self.MAX_PAUSE_INTERVAL, pause_interval)
return self.MIN_PAUSE_INTERVAL
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return self.MIN_PAUSE_INTERVAL
return max(self.MIN_PAUSE_INTERVAL, pause_interval)

If the usage is lower than the max rate but the pause interval is higher than the default minimum one: don't we want to be conservative and pause for the time the API requests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the usage is lower than the max rate but the pause interval is higher than the default minimum one: don't we want to be conservative and pause for the time the API requests?

for sure, the idea is to pause the requests taking into account the time interval that the response returns.

@alafanechere
Copy link
Contributor

alafanechere commented Mar 21, 2022

/test connector=connectors/source-facebook-marketing

🕑 connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/2015949287
✅ connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/2015949287
Python tests coverage:

Name Stmts Miss Cover ------------------------------------------------------------------------ source_acceptance_test/utils/__init__.py 6 0 100% source_acceptance_test/tests/__init__.py 4 0 100% source_acceptance_test/__init__.py 2 0 100% source_acceptance_test/tests/test_full_refresh.py 52 2 96% source_acceptance_test/utils/asserts.py 37 2 95% source_acceptance_test/config.py 74 6 92% source_acceptance_test/utils/json_schema_helper.py 105 13 88% source_acceptance_test/utils/common.py 70 17 76% source_acceptance_test/utils/compare.py 62 23 63% source_acceptance_test/tests/test_core.py 275 106 61% source_acceptance_test/base.py 10 4 60% source_acceptance_test/utils/connector_runner.py 110 48 56% source_acceptance_test/tests/test_incremental.py 69 38 45% ------------------------------------------------------------------------ TOTAL 876 259 70% Name Stmts Miss Cover ------------------------------------------------------------------------------- source_facebook_marketing/streams/__init__.py 2 0 100% source_facebook_marketing/spec.py 33 0 100% source_facebook_marketing/__init__.py 2 0 100% source_facebook_marketing/api.py 101 13 87% source_facebook_marketing/streams/base_streams.py 127 27 79% source_facebook_marketing/streams/common.py 41 13 68% source_facebook_marketing/streams/streams.py 97 32 67% source_facebook_marketing/source.py 35 13 63% source_facebook_marketing/streams/base_insight_streams.py 129 54 58% source_facebook_marketing/streams/async_job.py 204 128 37% source_facebook_marketing/streams/async_job_manager.py 74 56 24% ------------------------------------------------------------------------------- TOTAL 845 336 60% Name Stmts Miss Cover ------------------------------------------------------------------------------- source_facebook_marketing/streams/async_job.py 204 0 100% source_facebook_marketing/streams/__init__.py 2 0 100% source_facebook_marketing/spec.py 33 0 100% source_facebook_marketing/source.py 35 0 100% source_facebook_marketing/__init__.py 2 0 100% source_facebook_marketing/streams/common.py 41 1 98% source_facebook_marketing/streams/async_job_manager.py 74 3 96% source_facebook_marketing/api.py 101 10 90% source_facebook_marketing/streams/base_insight_streams.py 129 13 90% source_facebook_marketing/streams/streams.py 97 22 77% source_facebook_marketing/streams/base_streams.py 127 30 76% ------------------------------------------------------------------------------- TOTAL 845 79 91% 
@vladimir-remar
Copy link
Contributor Author

Hey @vladimir-remar I wrote some unit tests for the functions you implemented. Feel free to review these tests and add more cases if needed. I have a remaining question about the _compute_pause_interval function that I added in the review comment.

@alafanechere thanks for add the unitests

@vladimir-remar
Copy link
Contributor Author

vladimir-remar commented Mar 21, 2022

@vladimir-remar could you add unit tests for this functionality please?

@sherifnada thanks to @alafanechere who already did.

@alafanechere
Copy link
Contributor

alafanechere commented Mar 21, 2022

/test connector=connectors/source-facebook-marketing

🕑 connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/2018132032
✅ connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/2018132032
Python tests coverage:

Name Stmts Miss Cover ------------------------------------------------------------------------ source_acceptance_test/utils/__init__.py 6 0 100% source_acceptance_test/tests/__init__.py 4 0 100% source_acceptance_test/__init__.py 2 0 100% source_acceptance_test/tests/test_full_refresh.py 52 2 96% source_acceptance_test/utils/asserts.py 37 2 95% source_acceptance_test/config.py 74 6 92% source_acceptance_test/utils/json_schema_helper.py 105 13 88% source_acceptance_test/utils/common.py 70 17 76% source_acceptance_test/utils/compare.py 62 23 63% source_acceptance_test/tests/test_core.py 275 106 61% source_acceptance_test/base.py 10 4 60% source_acceptance_test/utils/connector_runner.py 110 48 56% source_acceptance_test/tests/test_incremental.py 69 38 45% ------------------------------------------------------------------------ TOTAL 876 259 70% Name Stmts Miss Cover ------------------------------------------------------------------------------- source_facebook_marketing/streams/__init__.py 2 0 100% source_facebook_marketing/spec.py 33 0 100% source_facebook_marketing/__init__.py 2 0 100% source_facebook_marketing/api.py 101 13 87% source_facebook_marketing/streams/base_streams.py 127 27 79% source_facebook_marketing/streams/common.py 41 13 68% source_facebook_marketing/streams/streams.py 97 32 67% source_facebook_marketing/source.py 35 13 63% source_facebook_marketing/streams/base_insight_streams.py 129 54 58% source_facebook_marketing/streams/async_job.py 204 128 37% source_facebook_marketing/streams/async_job_manager.py 74 56 24% ------------------------------------------------------------------------------- TOTAL 845 336 60% Name Stmts Miss Cover ------------------------------------------------------------------------------- source_facebook_marketing/streams/async_job.py 204 0 100% source_facebook_marketing/streams/__init__.py 2 0 100% source_facebook_marketing/spec.py 33 0 100% source_facebook_marketing/source.py 35 0 100% source_facebook_marketing/__init__.py 2 0 100% source_facebook_marketing/streams/common.py 41 1 98% source_facebook_marketing/streams/async_job_manager.py 74 3 96% source_facebook_marketing/api.py 101 10 90% source_facebook_marketing/streams/base_insight_streams.py 129 13 90% source_facebook_marketing/streams/streams.py 97 22 77% source_facebook_marketing/streams/base_streams.py 127 30 76% ------------------------------------------------------------------------------- TOTAL 845 79 91% 
Copy link
Contributor

@alafanechere alafanechere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your quick reply @vladimir-remar ! I'd love a final review from our connector team.

@alafanechere
Copy link
Contributor

alafanechere commented Mar 25, 2022

/publish connector=connectors/source-facebook-marketing

🕑 connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/2038827353
✅ connectors/source-facebook-marketing https://github.com/airbytehq/airbyte/actions/runs/2038827353

@alafanechere alafanechere merged commit c51ea6e into airbytehq:master Mar 25, 2022
@alafanechere
Copy link
Contributor

Thank you for your contribution and patience @vladimir-remar !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/connectors Connector related issues area/documentation Improvements or additions to documentation community

6 participants