- Notifications
You must be signed in to change notification settings - Fork 4.9k
🐛 normalization for bigquery: allow datasetId and table to start with number #9341
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🐛 normalization for bigquery: allow datasetId and table to start with number #9341
Conversation
| /test connector=bases/base-normalization
|
| /test connector=bases/base-normalization
|
bazarnov left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some other normalisation tests are not passing. Is it related to this particular change you're making?
https://github.com/airbytehq/airbyte/runs/4774907109?check_suite_focus=true#step:10:20913
airbyte-integrations/bases/base-normalization/unit_tests/test_destination_name_transformer.py Show resolved Hide resolved
…ination-bigquery-normalization
| /test connector=bases/base-normalization
|
| /test connector=bases/base-normalization
|
E AssertionError: docker run --rm --init -v /actions-runner/_work/airbyte/airbyte/airbyte-integrations/bases/base-normalization/integration_tests/normalization_test_output/redshift/test_nested_streams:/workspace -v /actions-runner/_work/airbyte/airbyte/airbyte-integrations/bases/base-normalization/integration_tests/normalization_test_output/redshift/test_nested_streams/build:/build -v /actions-runner/_work/airbyte/airbyte/airbyte-integrations/bases/base-normalization/integration_tests/normalization_test_output/redshift/test_nested_streams/logs:/logs -v /tmp:/tmp --network host --entrypoint /usr/local/bin/dbt -i airbyte/normalization:dev run --profiles-dir=/workspace --project-dir=/workspace --full-refresh /actions-runner/_work/airbyte/airbyte/airbyte-integrations/bases/base-normalization/integration_tests/dbt_integration_test.py:465: AssertionError
|
replied above |
antixar left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is minor changes but tests were failed. I see their failures are related with another reason but we need approval of Airbyte team for publishing of your changes without tests.
| The official docs from Google BigQuery says:
https://cloud.google.com/bigquery/docs/reference/standard-sql/lexical That's why it was implemented to insert an extra underscore Why do we need to diverge from official recommendations? or do we need to adapt the logic for dataset and not table/column names? |
The normalization tests for redshift are failing indeed. Can we look into why redshift is failing too? Was there any changes related to json/super types in redshift? |
…ination-bigquery-normalization
| /test connector=bases/base-normalization
|
| /test connector=bases/base-normalization
|
| /test connector=bases/base-normalization
|
This problem was caused by overlapping tests from different branches. Once we re-created redshift database my tests started working. |
Can not start with number: column (added _ before name)
I have tested different names on Bigquery: Can not start with number: column (GCP also automatically adds _ to column names which starts with number during CSV upload) @ChristopheDuong I tested only datasetId, table, columns ... is there another identifier that we should consider for big query? |
ChristopheDuong left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great find thanks
| /publish connector=bases/base-normalization
|
What
If dataset_if started with a number like '1id' then normalization transformed it to '_1id'
REAL Example:
select
id,
name,
type,
enabled,
updatedAt,
insertedAt,
personaTagIds,
contactListIds,
_airbyte_emitted_at,
_airbyte_workflows_hashid
from __dbt__CTE__workflows_ab3
-- workflows from
dataline-integration-testing._111testing_raw._airbyte_raw_workflowsHow
Removed conditions in
airbyte-integrations/bases/base-normalization/normalization/transform_catalog/destination_name_transformer.py
Manually tested that Bigquery allows the creation of dataset which starts with a number,
Table name also can start with a number
Recommended reading order
x.javay.python🚨 User Impact 🚨
No
Pre-merge Checklist
Expand the relevant checklist and delete the others.
New Connector
Community member or Airbyter
airbyte_secret./gradlew :airbyte-integrations:connectors:<name>:integrationTest.README.mdbootstrap.md. See description and examplesdocs/SUMMARY.mddocs/integrations/<source or destination>/<name>.mdincluding changelog. See changelog exampledocs/integrations/README.mdairbyte-integrations/builds.mdAirbyter
If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.
/test connector=connectors/<name>command is passing./publishcommand described hereUpdating a connector
Community member or Airbyter
airbyte_secret./gradlew :airbyte-integrations:connectors:<name>:integrationTest.README.mdbootstrap.md. See description and examplesdocs/integrations/<source or destination>/<name>.mdincluding changelog. See changelog exampleAirbyter
If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.
/test connector=connectors/<name>command is passing./publishcommand described hereConnector Generator
-scaffoldin their name) have been updated with the latest scaffold by running./gradlew :airbyte-integrations:connector-templates:generator:testScaffoldTemplatesthen checking in your changesThis change is