@augan-rymkhan augan-rymkhan commented Dec 6, 2021

What

Resolves #8285
After a long sleep, the connector resets the connection and tries to send a request; an http.client.RemoteDisconnected error is then raised because GitHub closes the connection without sending any response.

The issue is described here

send: b'GET /repos/airbytehq/airbyte/branches?per_page=100&page=2 HTTP/1.1\r\nHost: api.github.com\r\nUser-Agent: PostmanRuntime/7.28.0\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nConnection: keep-alive\r\nAuthorization: token \r\n\r\n'
reply: ''
{"type": "LOG", "log": {"level": "INFO", "message": "Backing off _send(...) for 5.0s (requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')))"}}
{"type": "LOG", "log": {"level": "INFO", "message": "Caught retryable error '('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))' after 1 tries. Waiting 5 seconds then retrying..."}}
Starting new HTTPS connection (27): api.github.com:443
{"type": "LOG", "log": {"level": "DEBUG", "message": "Starting new HTTPS connection (27): api.github.com:443"}}
send: b'GET /repos/airbytehq/airbyte/branches?per_page=100&page=2 HTTP/1.1\r\nHost: api.github.com\r\nUser-Agent: PostmanRuntime/7.28.0\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nConnection: keep-alive\r\nAuthorization: token \r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Server: GitHub.com
header: Date: Thu, 02 Dec 2021 15:38:54 GMT
header: Content-Type: application/json; charset=utf-8
header: Transfer-Encoding: chunked
header: Cache-Control: private, max-age=60, s-maxage=60
header: Vary: Accept, Authorization, Cookie, X-GitHub-OTP
header: ETag: W/"5c8336abd46d64c4409a1af335e4ecdf5eef3686895bb34dd76da7873d8ca9c0"
header: X-OAuth-Scopes: admin:org, repo
header: X-Accepted-OAuth-Scopes:

{"type": "LOG", "log": {"level": "INFO", "message": "Backing off _send(...) for 0.0s (airbyte_cdk.sources.streams.http.exceptions.UserDefinedBackoffException)"}}
{"type": "LOG", "log": {"level": "INFO", "message": "Retrying. Sleeping for 877.2983934879303 seconds"}}
{"type": "LOG", "log": {"level": "INFO", "message": "Backing off _send(...) for 0.0s (airbyte_cdk.sources.streams.http.exceptions.UserDefinedBackoffException)"}}
{"type": "LOG", "log": {"level": "INFO", "message": "Retrying. Sleeping for 60 seconds"}}
{"type": "LOG", "log": {"level": "INFO", "message": "Backing off _send(...) for 5.0s (requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')))"}}
{"type": "LOG", "log": {"level": "INFO", "message": "Caught retryable error '('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))' after 1 tries. Waiting 5 seconds then retrying..."}}
{"type": "RECORD", "record": {"stream": "issue_comment_reactions", "data": {"id": 127835852, "node_id": "MDg6UmVhY3Rpb24xMjc4MzU4NTI=", "user": {"login": "jrhizor", "id": 120362, "node_id": "MDQ6VXNlcjEyMDM2Mg==", "avatar_url": "https://avatars.githubusercontent.com/u/120362?v=4", "gravatar_id": "", "url": "https://api.github.com/users/jrhizor", "html_url": "https://github.com/jrhizor", "followers_url": "https://api.github.com/users/jrhizor/followers", "following_url": "https://api.github.com/users/jrhizor/following{/other_user}", "gists_url": "https://api.github.com/users/jrhizor/gists{/gist_id}", "starred_url": "https://api.github.com/users/jrhizor/starred{/owner}{/repo}", "subscriptions_url": "https://api.github.com/users/jrhizor/subscriptions", "organizations_url": "https://api.github.com/users/jrhizor/orgs", "repos_url": "https://api.github.com/users/jrhizor/repos", "events_url": "https://api.github.com/users/jrhizor/events{/privacy}", "received_events_url": "https://api.github.com/users/jrhizor/received_events", "type": "User", "site_admin": false}, "content": "+1", "created_at": "2021-09-10T22:40:38Z", "repository": "airbytehq/airbyte"}, "emitted_at": 1638540853342}}

The raised requests.exceptions.ConnectionError is handled by the default backoff handler, which sleeps for 5 seconds before retrying.

How

We can configure HTTPAdapter, as described here, to retry when the connection is closed.

In the GithubStream class's __init__ method:

    super().__init__(**kwargs)
    ...
    MAX_RETRIES = 3
    adapter = requests.adapters.HTTPAdapter(max_retries=MAX_RETRIES)
    self._session.mount('https://', adapter)
    self._session.mount('http://', adapter)

max_retries (from the Requests HTTPAdapter documentation): The maximum number of retries each connection should attempt. Note, this applies only to failed DNS lookups, socket connections and connection timeouts, never to requests where data has made it to the server. By default, Requests does not retry failed connections.
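
As a minimal, self-contained sketch of the configuration above, outside the connector: `build_session` is a hypothetical helper introduced only for illustration, and the `backoff_factor` value is an arbitrary assumption. It mounts one HTTPAdapter on both URL prefixes and shows that an integer `max_retries` is stored on the adapter as a urllib3 `Retry` object; passing a `Retry` instance directly is an optional variation, not what this PR does.

```python
from typing import Union

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

MAX_RETRIES = 3  # same value as in the PR snippet


def build_session(max_retries: Union[int, Retry] = MAX_RETRIES) -> requests.Session:
    """Return a Session whose HTTP and HTTPS adapters retry failed connections.

    Hypothetical helper for illustration; not part of the connector or the CDK.
    """
    session = requests.Session()
    adapter = HTTPAdapter(max_retries=max_retries)
    # A Session picks its adapter by URL prefix, so both schemes are mounted.
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session


session = build_session()
adapter = session.get_adapter("https://api.github.com")
print(adapter.max_retries)  # the urllib3 Retry object built from the integer

# Optional variation: pass a urllib3 Retry for finer control.
# The backoff_factor here is illustrative, not taken from the PR.
tuned = build_session(Retry(total=MAX_RETRIES, backoff_factor=0.5))
tuned_adapter = tuned.get_adapter("https://api.github.com")
```

No request is sent when the session is built; the retry policy only takes effect once the session is used.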

After a connection-related error, the adapter retries the connection.
Logs after the proposed solution:

{"type": "LOG", "log": {"level": "INFO", "message": "Backing off _send(...) for 0.0s (airbyte_cdk.sources.streams.http.exceptions.UserDefinedBackoffException)"}}
{"type": "LOG", "log": {"level": "INFO", "message": "Retrying. Sleeping for 877.8840591907501 seconds"}}
{"type": "LOG", "log": {"level": "INFO", "message": "Backing off _send(...) for 0.0s (airbyte_cdk.sources.streams.http.exceptions.UserDefinedBackoffException)"}}
{"type": "LOG", "log": {"level": "INFO", "message": "Retrying. Sleeping for 60 seconds"}}
{"type": "LOG", "log": {"level": "WARN", "message": "Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))': /repos/airbytehq/airbyte/issues/comments/941290749/reactions?per_page=100"}}
{"type": "RECORD", "record": {"stream": "issue_comment_reactions", "data": {"id": 132269456, "node_id": "REA_lALOEN7yYc44GvT9zgfiRZA", "user": {"login": "jrhizor", "id": 120362, "node_id": "MDQ6VXNlcjEyMDM2Mg==", "avatar_url": "https://avatars.githubusercontent.com/u/120362?v=4", "gravatar_id": "", "url": "https://api.github.com/users/jrhizor", "html_url": "https://github.com/jrhizor", "followers_url": "https://api.github.com/users/jrhizor/followers", "following_url": "https://api.github.com/users/jrhizor/following{/other_user}", "gists_url": "https://api.github.com/users/jrhizor/gists{/gist_id}", "starred_url": "https://api.github.com/users/jrhizor/starred{/owner}{/repo}", "subscriptions_url": "https://api.github.com/users/jrhizor/subscriptions", "organizations_url": "https://api.github.com/users/jrhizor/orgs", "repos_url": "https://api.github.com/users/jrhizor/repos", "events_url": "https://api.github.com/users/jrhizor/events{/privacy}", "received_events_url": "https://api.github.com/users/jrhizor/received_events", "type": "User", "site_admin": false}, "content": "+1", "created_at": "2021-10-13T15:18:05Z", "repository": "airbytehq/airbyte"}, "emitted_at": 1638540849653}}

The retries should succeed in establishing a new connection.

As long as one of them succeeds, ConnectionError is not propagated to the default backoff handler, so the sync continues without sleeping there.
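
The interaction between the two retry layers can be sketched with a standard-library-only toy model. Every name here (`FakeConnectionError`, `make_flaky_send`, `transport_retry`, `with_backoff`) is a hypothetical stand-in rather than connector or CDK code: the inner layer plays the role of the adapter's `max_retries`, and the outer layer plays the role of the default backoff handler, recording the sleeps it would perform instead of actually sleeping.

```python
from typing import Callable, List


class FakeConnectionError(Exception):
    """Stand-in for requests.exceptions.ConnectionError (hypothetical)."""


def make_flaky_send(failures: int) -> Callable[[], str]:
    """Return a send() that raises `failures` times, then succeeds."""
    state = {"left": failures}

    def send() -> str:
        if state["left"] > 0:
            state["left"] -= 1
            raise FakeConnectionError("Remote end closed connection without response")
        return "200 OK"

    return send


def transport_retry(send: Callable[[], str], max_retries: int) -> str:
    """Inner layer: retry immediately, like an adapter with max_retries set."""
    for attempt in range(max_retries + 1):
        try:
            return send()
        except FakeConnectionError:
            if attempt == max_retries:
                raise  # retries exhausted; the outer layer now sees the error
    raise AssertionError("unreachable")


def with_backoff(send: Callable[[], str], sleeps: List[float],
                 wait: float = 5.0, max_tries: int = 3) -> str:
    """Outer layer: record a sleep each time an error reaches it."""
    for attempt in range(max_tries):
        try:
            return send()
        except FakeConnectionError:
            if attempt == max_tries - 1:
                raise
            sleeps.append(wait)  # a real handler would call time.sleep(wait)
    raise AssertionError("unreachable")


# Without the inner layer, one dropped connection costs a 5-second sleep.
sleeps_without_adapter: List[float] = []
result_without = with_backoff(make_flaky_send(1), sleeps_without_adapter)

# With the inner layer, the same failure is absorbed before the handler sees it.
sleeps_with_adapter: List[float] = []
flaky = make_flaky_send(1)
result_with = with_backoff(lambda: transport_retry(flaky, 3), sleeps_with_adapter)
```

In this model the outer handler sleeps only when the inner retries are exhausted, which matches the behaviour described above.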

Recommended reading order

source_github/streams.py

@github-actions github-actions bot added the area/connectors Connector related issues label Dec 6, 2021
@augan-rymkhan augan-rymkhan temporarily deployed to more-secrets December 6, 2021 05:44 Inactive
@github-actions github-actions bot added the area/documentation Improvements or additions to documentation label Dec 6, 2021
@augan-rymkhan augan-rymkhan temporarily deployed to more-secrets December 6, 2021 05:47 Inactive

augan-rymkhan commented Dec 6, 2021

/test connector=connectors/source-github

🕑 connectors/source-github https://github.com/airbytehq/airbyte/actions/runs/1543308604
❌ connectors/source-github https://github.com/airbytehq/airbyte/actions/runs/1543308604
🐛 https://gradle.com/s/ft2arkn56y55e

@jrhizor jrhizor temporarily deployed to more-secrets December 6, 2021 06:03 Inactive

augan-rymkhan commented Dec 6, 2021

/test connector=connectors/source-github

🕑 connectors/source-github https://github.com/airbytehq/airbyte/actions/runs/1543846169
❌ connectors/source-github https://github.com/airbytehq/airbyte/actions/runs/1543846169
🐛 https://gradle.com/s/2g7ynudrvwd4u

@jrhizor jrhizor temporarily deployed to more-secrets December 6, 2021 09:05 Inactive

augan-rymkhan commented Dec 6, 2021

/test connector=connectors/source-github

🕑 connectors/source-github https://github.com/airbytehq/airbyte/actions/runs/1543908864
❌ connectors/source-github https://github.com/airbytehq/airbyte/actions/runs/1543908864
🐛 https://gradle.com/s/vkl2b5brqa2os

@jrhizor jrhizor temporarily deployed to more-secrets December 6, 2021 09:20 Inactive

augan-rymkhan commented Dec 6, 2021

/test connector=connectors/source-github

🕑 connectors/source-github https://github.com/airbytehq/airbyte/actions/runs/1544589691
❌ connectors/source-github https://github.com/airbytehq/airbyte/actions/runs/1544589691
🐛 https://gradle.com/s/plbn35kuaf7ay

@jrhizor jrhizor temporarily deployed to more-secrets December 6, 2021 12:29 Inactive

augan-rymkhan commented Dec 6, 2021

/test connector=connectors/source-github

🕑 connectors/source-github https://github.com/airbytehq/airbyte/actions/runs/1544926179
✅ connectors/source-github https://github.com/airbytehq/airbyte/actions/runs/1544926179
Python tests coverage:

---------- coverage: platform linux, python 3.8.10-final-0 -----------
Name                                                  Stmts   Miss  Cover
------------------------------------------------------------------------
source_acceptance_test/__init__.py                        2      0   100%
source_acceptance_test/base.py                           10      4    60%
source_acceptance_test/config.py                         76      8    89%
source_acceptance_test/conftest.py                      109    109     0%
source_acceptance_test/plugin.py                         47     47     0%
source_acceptance_test/tests/__init__.py                  4      0   100%
source_acceptance_test/tests/test_core.py               235     95    60%
source_acceptance_test/tests/test_full_refresh.py        38     27    29%
source_acceptance_test/tests/test_incremental.py         69     38    45%
source_acceptance_test/utils/__init__.py                  6      0   100%
source_acceptance_test/utils/asserts.py                  37      2    95%
source_acceptance_test/utils/common.py                   54     24    56%
source_acceptance_test/utils/compare.py                  62     25    60%
source_acceptance_test/utils/connector_runner.py         82     49    40%
source_acceptance_test/utils/json_schema_helper.py      115     14    88%
------------------------------------------------------------------------
TOTAL                                                   946    442    53%

---------- coverage: platform linux, python 3.8.10-final-0 -----------
Name                                                  Stmts   Miss  Cover
-----------------------------------------------
source_github/__init__.py                                 2      0   100%
source_github/source.py                                  76     34    55%
source_github/streams.py                                367    176    52%
-----------------------------------------------
TOTAL                                                   445    210    53%
@jrhizor jrhizor temporarily deployed to more-secrets December 6, 2021 13:59 Inactive
@augan-rymkhan augan-rymkhan changed the title retry connection with Github using HTTPAdapter Retry connection with Github using HTTPAdapter Dec 6, 2021

@keu keu left a comment

lgtm

@keu keu changed the title Retry connection with Github using HTTPAdapter Source Github: Retry connection using HTTPAdapter Dec 6, 2021

augan-rymkhan commented Dec 8, 2021

/publish connector=connectors/source-github

🕑 connectors/source-github https://github.com/airbytehq/airbyte/actions/runs/1552715053
✅ connectors/source-github https://github.com/airbytehq/airbyte/actions/runs/1552715053

@jrhizor jrhizor temporarily deployed to more-secrets December 8, 2021 05:57 Inactive
@augan-rymkhan augan-rymkhan temporarily deployed to more-secrets December 8, 2021 06:09 Inactive
@augan-rymkhan augan-rymkhan merged commit 10f1b58 into master Dec 8, 2021
@augan-rymkhan augan-rymkhan deleted the arymkhan/fix-github-connection-closed-issue branch December 8, 2021 06:40
schlattk pushed a commit to schlattk/airbyte that referenced this pull request Jan 4, 2022
* retry connection with Github using HTTPAdapter
* updated the connector version
* updated source def and spec yaml

Co-authored-by: Auganbay <auganenu@gmail.com>

Labels

area/connectors Connector related issues area/documentation Improvements or additions to documentation connectors/source/github connectors/sources-api

9 participants