Skip to content

Conversation

@davydov-d
Copy link
Contributor

What

https://github.com/airbytehq/oncall/issues/1681
Fix timing out check and discover commands

How

Read only the header of a file on check.
Read a single chunk of data on discover.

@davydov-d
Copy link
Contributor Author

davydov-d commented Mar 21, 2023

/test connector=connectors/source-file

🕑 connectors/source-file https://github.com/airbytehq/airbyte/actions/runs/4478139235
✅ connectors/source-file https://github.com/airbytehq/airbyte/actions/runs/4478139235
Python tests coverage:

Name Stmts Miss Cover --------------------------------------------- source_file/__init__.py 2 0 100% source_file/utils.py 13 1 92% source_file/source.py 81 7 91% source_file/client.py 288 37 87% --------------------------------------------- TOTAL 384 45 88% Name Stmts Miss Cover --------------------------------------------- source_file/__init__.py 2 0 100% source_file/client.py 288 43 85% source_file/utils.py 13 8 38% source_file/source.py 81 60 26% --------------------------------------------- TOTAL 384 111 71% 

Build Passed

Test summary info:

=========================== short test summary info ============================ SKIPPED [1] ../usr/local/lib/python3.9/site-packages/connector_acceptance_test/plugin.py:63: Skipping TestIncremental.test_two_sequential_reads: Incremental syncs are not supported on this connector. SKIPPED [1] ../usr/local/lib/python3.9/site-packages/connector_acceptance_test/tests/test_core.py:100: The previous and actual specifications are identical. SKIPPED [1] ../usr/local/lib/python3.9/site-packages/connector_acceptance_test/tests/test_core.py:509: The previous and actual discovered catalogs are identical. ======================== 32 passed, 3 skipped in 48.86s ======================== 
@davydov-d davydov-d requested review from bnchrch and lazebnyi March 21, 2023 10:47
# this is to ensure we make all conditions under which the bug is reproduced, i.e.
# - chunk size < file size
# - column type in the last chunk is not `string`
@patch("source_file.client.Client.CSV_CHUNK_SIZE", 1)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dropped this test because it doesn't represent the expected behavior from now on

with client.reader.open():
list(client.streams)
return AirbyteConnectionStatus(status=Status.SUCCEEDED)
list(client.streams(empty_schema=True))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Read only file header when running check to ensure the connection succeeds

if skip_data:
reader_options["nrows"] = 0
reader_options["index_col"] = 0
yield from reader(fp, **reader_options)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Read only self.CSV_CHUNK_SIZE bytes of data to generate schema. Otherwise a time out is possible in case a large file is read

@davydov-d
Copy link
Contributor Author

davydov-d commented Mar 21, 2023

/publish connector=connectors/source-file

🕑 Publishing the following connectors:
connectors/source-file
https://github.com/airbytehq/airbyte/actions/runs/4483199836


Connector Did it publish? Were definitions generated?
connectors/source-file

if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️

@davydov-d
Copy link
Contributor Author

davydov-d commented Mar 21, 2023

/publish connector=connectors/source-file-secure

🕑 Publishing the following connectors:
connectors/source-file-secure
https://github.com/airbytehq/airbyte/actions/runs/4483366192


Connector Did it publish? Were definitions generated?
connectors/source-file-secure

if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️

@davydov-d davydov-d merged commit 9fe07dc into master Mar 22, 2023
@davydov-d davydov-d deleted the ddavydov/#1681-source-file-fix-check branch March 22, 2023 08:06
erohmensing pushed a commit that referenced this pull request Mar 22, 2023
* #1681 source file: do not read whole file on check and discover * #1681 source file: upd changelog * #1681 source file: add allowed hosts * auto-bump connector version --------- Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
erohmensing pushed a commit that referenced this pull request Mar 22, 2023
* #1681 source file: do not read whole file on check and discover * #1681 source file: upd changelog * #1681 source file: add allowed hosts * auto-bump connector version --------- Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/connectors Connector related issues area/documentation Improvements or additions to documentation connectors/source/file connectors/source/file-secure

4 participants