JSON Format Settings

Search Shortcut cmd + k | ctrl + k

Documentation / Data Import and Export / JSON Files

The JSON extension can attempt to determine the format of a JSON file when setting format to auto. Here are some example JSON files and the corresponding format settings that should be used.

In each of the below cases, the format setting was not needed, as DuckDB was able to infer it correctly, but it is included for illustrative purposes. A query of this shape would work in each case:

SELECT * FROM filename.json; 

Format: `newline_delimited`

With format = 'newline_delimited' newline-delimited JSON can be parsed. Each line is a JSON.

We use the example file records.json with the following content:

{"key1":"value1", "key2": "value1"} {"key1":"value2", "key2": "value2"} {"key1":"value3", "key2": "value3"} 

SELECT * FROM read_json('records.json', format = 'newline_delimited'); 

key1	key2
value1	value1
value2	value2
value3	value3

Format: `array`

If the JSON file contains a JSON array of objects (pretty-printed or not), array_of_objects may be used. To demonstrate its use, we use the example file records-in-array.json:

[ {"key1":"value1", "key2": "value1"}, {"key1":"value2", "key2": "value2"}, {"key1":"value3", "key2": "value3"} ] 

SELECT * FROM read_json('records-in-array.json', format = 'array'); 

key1	key2
value1	value1
value2	value2
value3	value3

Format: `unstructured`

If the JSON file contains JSON that is not newline-delimited or an array, unstructured may be used. To demonstrate its use, we use the example file unstructured.json:

{ "key1":"value1", "key2":"value1" } { "key1":"value2", "key2":"value2" } { "key1":"value3", "key2":"value3" } 

SELECT * FROM read_json('unstructured.json', format = 'unstructured'); 

key1	key2
value1	value1
value2	value2
value3	value3

The JSON extension can attempt to determine whether a JSON file contains records when setting records = auto. When records = true, the JSON extension expects JSON objects, and will unpack the fields of JSON objects into individual columns.

Continuing with the same example file, records.json:

{"key1":"value1", "key2": "value1"} {"key1":"value2", "key2": "value2"} {"key1":"value3", "key2": "value3"} 

SELECT * FROM read_json('records.json', records = true); 

key1	key2
value1	value1
value2	value2
value3	value3

When records = false, the JSON extension will not unpack the top-level objects, and create STRUCTs instead:

SELECT * FROM read_json('records.json', records = false); 

json
{'key1': value1, 'key2': value1}
{'key1': value2, 'key2': value2}
{'key1': value3, 'key2': value3}

This is especially useful if we have non-object JSON, for example, arrays.json:

[1, 2, 3] [4, 5, 6] [7, 8, 9] 

SELECT * FROM read_json('arrays.json', records = false); 

json
[1, 2, 3]
[4, 5, 6]
[7, 8, 9]

About this page

Code of Conduct Trademark Use

Format: `newline_delimited`

Format: `array`

Format: `unstructured`

Records Settings

About this page

In this article

Format: newline_delimited

Format: array

Format: unstructured

Records Settings

About this page

In this article

Format: `newline_delimited`

Format: `array`

Format: `unstructured`