- Notifications
You must be signed in to change notification settings - Fork 46
Closed
Description
Describe the bug
Commonly used Linux command line JQ tool doesn't work with JSON which prepared by sdp, because jq expect that wrapper is double quote.
Example of real GCP BigQuery DDL:
DDL:
$ cat sample_ddl.sql CREATE TABLE `dataset.table` ( Updated_Date DATE NOT NULL OPTIONS (description =Description1), Lead_Store_Id INT64 NOT NULL OPTIONS (description =Description2), ) PARTITION BY LeadCreatedDate CLUSTER BY Lead_Store_Id OUTPUT:
$ sdp -o bigquery --no-dump sample_ddl.sql Start parsing file sample_ddl.sql File with result was saved to >> schemas folder [{'alter': {}, 'checks': [], 'cluster_by': ['Lead_Store_Id'], 'columns': [{'check': None, 'default': None, 'name': 'Updated_Date', 'nullable': False, 'options': [{'description': '=Description1'}], 'references': None, 'size': None, 'type': 'DATE', 'unique': False}, {'check': None, 'default': None, 'name': 'Lead_Store_Id', 'nullable': False, 'options': [{'description': '=Description2'}], 'references': None, 'size': None, 'type': 'INT64', 'unique': False}], 'constraints': {'checks': None, 'references': None, 'uniques': None}, 'index': [], 'partition_by': {'columns': ['LeadCreatedDate'], 'type': None}, 'partitioned_by': [], 'primary_key': [], 'schema': '`dataset', 'table_name': 'table`', 'tablespace': None}] JQ run:
$ sdp -o bigquery --no-dump sample_ddl.sql | egrep -v 'Start parsing file |File with result was saved to ' | jq . parse error: Invalid numeric literal at line 2, column 10 Followed JSON is OK:
[{"alter": {}, "checks": [], "cluster_by": ["Lead_Store_Id"], "columns": [{"check": "None", "default": "None", "name": "Updated_Date", "nullable": "False", "options": [{"description": "=Description1"}], "references": "None", "size": "None", "type": "DATE", "unique": "False"}, {"check": "None", "default": "None", "name": "Lead_Store_Id", "nullable": "False", "options": [{"description": "=Description2"}], "references": "None", "size": "None", "type": "INT64", "unique": "False"}], "dataset": "`dataset", "index": [], "partition_by": {"columns": ["LeadCreatedDate"], "type": "None"}, "partitioned_by": [], "primary_key": [], "table_name": "table`", "tablespace": "None"}] Here is few issues:
-
JQ expect that all values wrapped (None, False, True - as well)
-
JQ expect that wrapper will be double quote, not single quote.
JSON validator https://jsonformatter.curiousconcept.com/ said that points 1 and 2 are RFC violation. -
Incorrect parsing if table name wrapped by back quote: "CREATE TABLE `dataset.table`". As you can see from output, parser keeped back quote at the beginning (but sometimes at the ending) of the table name:
'schema': '`dataset', Metadata
Metadata
Assignees
Labels
No labels