Skip to content

PostgREST/pg_csv

Repository files navigation

pg_csv

PostgreSQL version Coverage Status Tests

Postgres has CSV support on the COPY command, but COPY has problems:

  • It uses a special protocol, so it doesn't work with other standard features like prepared statements, pipeline mode or pgbench.
  • Is not composable. You can't use COPY inside CTEs, subqueries, view definitions or as function arguments.

pg_csv offers flexible CSV processing as a solution.

  • Includes a CSV aggregate that composes with SQL expressions.
  • Native C extension, x2 times faster than SQL queries that try to output CSV (see our CI results).
  • No dependencies except Postgres.

Installation

PostgreSQL >= 12 is supported. Clone this repo and run:

make && make install

To install the extension:

create extension pg_csv; 

csv_agg

Aggregate that builds a CSV respecting RFC 4180, quoting as required.

create table projects as select * from ( values (1, 'Death Star OS', 1), (2, 'Windows 95 Rebooted', 1), (3, 'Project "Comma,Please"', 2), (4, 'Escape ""Plan""', 2), (NULL, 'NULL & Void', NULL) ) as _(id, name, client_id);
select csv_agg(x) from projects x; csv_agg -------------------------------- id,name,client_id + 1,Death Star OS,1 + 2,Windows 95 Rebooted,1 + 3,"Project ""Comma,Please""",2+ 4,"Escape """"Plan""""",2 + ,NULL & Void, (1 row)

Custom Delimiter

Custom delimiters can be used to produce different formats like pipe-separated values, tab-separated values or semicolon-separated values.

select csv_agg(x, csv_options(delimiter := '|')) from projects x; csv_agg ----------------------------- id|name|client_id + 1|Death Star OS|1 + 2|Windows 95 Rebooted|1 + 3|Open Source Lightsabers|2+ 4|Galactic Payroll System|2+ 7|Bugzilla Revival|3 (1 row) select csv_agg(x, csv_options(delimiter := E'\t')) from projects x; csv_agg ----------------------------------- id name client_id + 1 Death Star OS 1 + 2 Windows 95 Rebooted 1+ 3 Open Source Lightsabers 2+ 4 Galactic Payroll System 2+ 7 Bugzilla Revival 3 (1 row)

Note

  • Newline, carriage return and double quotes are not supported as delimiters to maintain the integrity of the separated values format.
  • The delimiter can only be a single char, if a longer string is specified only the first char will be used.
  • Why use a csv_options constructor function instead of extra arguments? Aggregates don't support named arguments in postgres, see a discussion on #2 (comment).

BOM

You can include a byte-order mark (BOM) to make the CSV compatible with Excel.

select csv_agg(x, csv_options(bom := true)) from projects x; csv_agg ------------------- id,name,client_id+ 1,Death Star OS,1 2,Windows 95 Rebooted,1 3,Open Source Lightsabers,2 4,Galactic Payroll System,2 5,Bugzilla Revival,3 (1 row)

Header

You can omit or include the CSV header.

select csv_agg(x, csv_options(header := false)) from projects x; csv_agg ----------------------------- 1,Death Star OS,1 + 2,Windows 95 Rebooted,1 + 3,Open Source Lightsabers,2+ 4,Galactic Payroll System,2+ 7,Bugzilla Revival,3 (1 row)

Null string

NULL values are represented by an empty string by default. This can be changed with the nullstr option.

SELECT csv_agg(x, csv_options(nullstr:='<NULL>')) AS body FROM projects x; body -------------------------------- id,name,client_id + 1,Death Star OS,1 + 2,Windows 95 Rebooted,1 + 3,"Project ""Comma,Please""",2+ 4,"Escape """"Plan""""",2 + <NULL>,NULL & Void,<NULL> (1 row)

Limitations

  • For large bulk exports and imports, COPY ... CSV should still be preferred as its faster due to streaming support.