Skip to content

Conversation

@vitaliizazmic
Copy link
Contributor

@vitaliizazmic vitaliizazmic commented Jan 21, 2021

How

  • Create file samples
  • Check uploading each of them from Local storage
  • Remove html, orc and pickle from available formats in spec.json
  • Create share test case for all formats to pull file

Pre-merge Checklist

  • Run integration tests
  • Publish Docker images
@vitaliizazmic
Copy link
Contributor Author

vitaliizazmic commented Jan 21, 2021

@sherifnada
Copy link
Contributor

@vitaliizazmic I added custom integration tests to CI -- can you make sure those tests are succeeding? you can also run them locally by running python -m pytest -s integration_tests from the module directory with the venv activated, will be much faster for iteration

@sherifnada
Copy link
Contributor

sherifnada commented Jan 27, 2021

@vitaliizazmic could you pull in @eugene-kulak 's changes from #1712 and ping me when this is ready for review? All the changes he needs to make are stylistic so I think you should be able to build on top of it. One thing to keep in mind is that you should probably put your integration tests in a separate test file for easier reading.

# Conflicts: #	airbyte-integrations/connectors/source-file/build.gradle #	airbyte-integrations/connectors/source-file/integration_tests/integration_source_test.py #	airbyte-integrations/connectors/source-file/setup.py #	docs/integrations/sources/file.md
@sherifnada
Copy link
Contributor

sherifnada commented Jan 29, 2021

Copy link
Contributor

@sherifnada sherifnada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small comments, will merge tomorrow

| Format | Supported? |
| :--- | :--- |
| CSV | Yes |
| JSON | Experimental |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this one still listed as experimental?

@@ -0,0 +1,21 @@
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we don't support parquet/orc/pickle can we remove them and any catalogs/sample files that are not supported?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sherifnada we support parquet and pickle

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pickle works if it contains DataFrame. Because read_pickle() method returns same type as object stored in file. And discover and read methods of client expect for DataFrame.

"smart-open[all]==4.1.2",
"lxml==4.6.2",
"html5lib==1.1",
"beautifulsoup4==4.9.3",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are all of these libs required? html/BS/pyarrow?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This libraries required for read different file formats.

@sherifnada
Copy link
Contributor

@vitaliizazmic once you've addressed the comments could you:

  1. pull master ( i merged eugene's bugfix for source-file)
  2. bump the connector version
  3. run the test command

I'll publish and merge. Thanks!

| CSV | Yes |
| JSON | Experimental |
| HTML | Untested |
| JSON | Yes |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally put it as 'Experimental' because Source-File couldn't handle nested JSON files
(flat JSON file worked)

@sherifnada
Copy link
Contributor

sherifnada commented Feb 1, 2021

/publish connector=connectors=source-file

Error: Workflow does not have 'workflow_dispatch' trigger

@sherifnada
Copy link
Contributor

sherifnada commented Feb 1, 2021

/publish connector=connectors=source-file

❌ connectors=source-file https://github.com/airbytehq/airbyte/actions/runs/529124974

@sherifnada
Copy link
Contributor

sherifnada commented Feb 1, 2021

/publish connector=connectors/source-file

🕑 connectors/source-file https://github.com/airbytehq/airbyte/actions/runs/529143991
❌ connectors/source-file https://github.com/airbytehq/airbyte/actions/runs/529143991

@sherifnada
Copy link
Contributor

@vitaliizazmic tests are failing, can you take a look ?

@vitaliizazmic
Copy link
Contributor Author

vitaliizazmic commented Feb 2, 2021

@sherifnada
Copy link
Contributor

sherifnada commented Feb 2, 2021

/publish connector=connectors/source-file

🕑 connectors/source-file https://github.com/airbytehq/airbyte/actions/runs/532163094
✅ connectors/source-file https://github.com/airbytehq/airbyte/actions/runs/532163094

@sherifnada sherifnada merged commit caf673d into master Feb 2, 2021
@sherifnada sherifnada deleted the vitalii/1392_test_file_formats branch February 2, 2021 22:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

5 participants