postgres-aws-s3

aws_s3 Postgres extension to import/export data from/to S3, compatible with the aws_s3 extension on AWS RDS. (The original chimpler/postgres-aws-s3 is inactive.)

Starting with PostgreSQL 11.1, AWS RDS added support for S3 import through the extension aws_s3. It lets you import data from S3 into Postgres using the function aws_s3.table_import_from_s3.

To support development both on RDS and locally, we implemented our own aws_s3 extension that mimics the one provided on RDS. It is implemented in Python using the boto3 library.

Installation

Make sure boto3 is installed for the default Python 2 installation on your machine. On macOS, this can be done as follows:

sudo /usr/bin/easy_install boto3 

Then clone the repository postgres-aws-s3:

git clone git@github.com:radondb/postgres-aws-s3

Make sure that pg_config can be run:

$ pg_config
BINDIR = /Applications/Postgres.app/Contents/Versions/11/bin
DOCDIR = /Applications/Postgres.app/Contents/Versions/11/share/doc/postgresql
HTMLDIR = /Applications/Postgres.app/Contents/Versions/11/share/doc/postgresql
INCLUDEDIR = /Applications/Postgres.app/Contents/Versions/11/include
PKGINCLUDEDIR = /Applications/Postgres.app/Contents/Versions/11/include/postgresql
INCLUDEDIR-SERVER = /Applications/Postgres.app/Contents/Versions/11/include/postgresql/server
LIBDIR = /Applications/Postgres.app/Contents/Versions/11/lib
...

Then install postgres-aws-s3:

make install 

Finally, in Postgres:

psql> CREATE EXTENSION plpythonu;
psql> CREATE EXTENSION aws_s3;
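
As a quick sanity check, you can confirm that both extensions were created by listing them with psql's \dx meta-command:

psql> \dx aws_s3
psql> \dx plpythonu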

Using aws_s3

You can now import data from S3 into a table. Let's first create the table:

psql> CREATE TABLE animals (
          name TEXT,
          age INT
      );

Let's suppose the following file is stored in S3 at s3://my-bucket/samples/myfile.csv:

name,age
dog,12
cat,15
parrot,103
tortoise,205

The function aws_s3.table_import_from_s3 can be called with either of two signatures.

Using s3_uri and aws_credentials objects

aws_s3.table_import_from_s3 (
   table_name text,
   column_list text,
   options text,
   s3_info aws_commons._s3_uri_1,
   credentials aws_commons._aws_credentials_1
)

Using this signature, the s3_uri and aws_credentials objects need to be created first (see the example below). The parameters are:

Parameter     Description
table_name    the name of the table
column_list   list of columns to copy
options       options passed to the COPY command in Postgres
s3_info       an aws_commons._s3_uri_1 composite type containing the bucket, file path and region of the S3 object
credentials   an aws_commons._aws_credentials_1 composite type containing the access key, secret key and session token

Example

psql> SELECT aws_commons.create_s3_uri(
         'my-bucket',
         'samples/myfile.csv',
         'us-east-1'
      ) AS s3_uri \gset
psql> \echo :s3_uri
(my-bucket,samples/myfile.csv,us-east-1)

psql> SELECT aws_commons.create_aws_credentials(
         '<my_access_id>',
         '<my_secret_key>',
         '<session_token>'
      ) AS credentials \gset
psql> \echo :credentials
(<my_access_id>,<my_secret_key>,<session_token>)

psql> SELECT aws_s3.table_import_from_s3(
         'animals',
         '',
         '(FORMAT CSV, DELIMITER '','', HEADER true)',
         :'s3_uri',
         :'credentials'
      );
 table_import_from_s3
----------------------
                    4
(1 row)

psql> select * from animals;
   name   | age
----------+-----
 dog      |  12
 cat      |  15
 parrot   | 103
 tortoise | 205
(4 rows)
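
In the call above, the empty column_list imports into all columns of the table in file order. To target specific columns instead, a comma-separated list can be passed; here is a minimal sketch, reusing the s3_uri and credentials variables from the example and assuming the list is parsed the same way as on RDS:

psql> SELECT aws_s3.table_import_from_s3(
         'animals',
         'name,age',  -- import only the listed columns
         '(FORMAT CSV, DELIMITER '','', HEADER true)',
         :'s3_uri',
         :'credentials'
      );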

Using the function table_import_from_s3 with all the parameters

aws_s3.table_import_from_s3 (
   table_name text,
   column_list text,
   options text,
   bucket text,
   file_path text,
   region text,
   access_key text,
   secret_key text,
   session_token text
)
Parameter      Description
table_name     the name of the table
column_list    list of columns to copy
options        options passed to the COPY command in Postgres
bucket         S3 bucket
file_path      S3 path to the file
region         S3 region (e.g., us-east-1)
access_key     AWS access key ID
secret_key     AWS secret access key
session_token  AWS session token

Example

psql> SELECT aws_s3.table_import_from_s3(
         'animals',
         '',
         '(FORMAT CSV, DELIMITER '','', HEADER true)',
         'my-bucket',
         'samples/myfile.csv',
         'us-east-1',
         '<my_access_id>',
         '<my_secret_key>',
         '<session_token>'
      );
 table_import_from_s3
----------------------
                    4
(1 row)

psql> select * from animals;
   name   | age
----------+-----
 dog      |  12
 cat      |  15
 parrot   | 103
 tortoise | 205
(4 rows)
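
The extension is also described as supporting export to S3. On AWS RDS the export counterpart is aws_s3.query_export_to_s3; assuming this extension mirrors that RDS signature (the export API is not documented in this README), exporting the animals table might look like the following sketch, where the destination bucket and file path are placeholders:

psql> SELECT * FROM aws_s3.query_export_to_s3(
         'select * from animals',
         -- destination bucket and path below are placeholders
         aws_commons.create_s3_uri('my-bucket', 'samples/animals_export.csv', 'us-east-1'),
         options := 'format csv, header true'
      );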
