Programming with Python and PostgreSQL Peter Eisentraut peter@eisentraut.org F-Secure Corporation PostgreSQL Conference East 2011 CC-BY
Partitioning • Part I: Client programming (60 min) • Part II: PL/Python (30 min)
Why Python?
Pros:
• widely used
• easy
• strong typing
• scripting, interactive use
• good PostgreSQL support
• client and server (PL) interfaces
• open source, community-based
Cons:
• no static syntax checks, must rely on test coverage
• Python community has varying interest in RDBMS
Part I Client Programming
Example
    import psycopg2

    dbconn = psycopg2.connect('dbname=dellstore2')
    cursor = dbconn.cursor()
    cursor.execute("""
        SELECT firstname, lastname
        FROM customers
        ORDER BY 1, 2
        LIMIT 10
    """)
    for row in cursor.fetchall():
        print "Name: %s %s" % (row[0], row[1])
    cursor.close()
    dbconn.close()
Drivers
Name            License  Platforms    Py Versions
Psycopg         LGPL     Unix, Win    2.4–3.2
PyGreSQL        BSD      Unix, Win    2.3–2.6
ocpgdb          BSD      Unix         2.3–2.6
py-postgresql   BSD      pure Python  3.0+
bpgsql (alpha)  LGPL     pure Python  2.3–2.6
pg8000          BSD      pure Python  2.5–3.0+
More details:
• http://wiki.postgresql.org/wiki/Python
• http://wiki.python.org/moin/PostgreSQL
DB-API 2.0
• the standard Python database API
• all mentioned drivers support it
• defined in PEP 249
• discussions: db-sig@python.org
• very elementary (from a PostgreSQL perspective)
• outdated relative to Python language development
• lots of extensions and incompatibilities possible
Higher-Level Interfaces
• Zope
• SQLAlchemy
• Django
Psycopg Facts
• Main authors: Federico Di Gregorio, Daniele Varrazzo
• License: LGPLv3+
• Web site: http://initd.org/psycopg/
• Documentation: http://initd.org/psycopg/docs/
• Git, Gitweb
• Mailing list: psycopg@postgresql.org
• Twitter: @psycopg
• Latest version: 2.4 (February 27, 2011)
Using the Driver
    import psycopg2
    dbconn = psycopg2.connect(...)
    ...
Driver Independence?
    import psycopg2
    dbconn = psycopg2.connect(...)  # hardcodes driver name
Driver Independence?
    import psycopg2 as dbdriver
    dbconn = dbdriver.connect(...)
Driver Independence?
    dbtype = 'psycopg2'  # e.g. from config file
    dbdriver = __import__(dbtype, globals(), locals(), [], -1)
    dbconn = dbdriver.connect(...)
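The `__import__(..., -1)` form above is Python 2 only. A minimal Python 3 sketch of the same idea follows, assuming `importlib` is acceptable; the helper name `load_driver` is hypothetical, and `sqlite3` stands in for `psycopg2` only so the sketch runs without a PostgreSQL driver installed.

```python
import importlib

def load_driver(name):
    """Import a DB-API 2.0 driver module by name and sanity-check it."""
    driver = importlib.import_module(name)
    # PEP 249 requires these module-level attributes.
    for attr in ("connect", "apilevel", "paramstyle"):
        if not hasattr(driver, attr):
            raise ImportError("%s does not look like a DB-API driver" % name)
    return driver

# Assumption: the driver name would normally come from a config file.
dbdriver = load_driver("sqlite3")
print(dbdriver.apilevel, dbdriver.paramstyle)
```

Note that `paramstyle` differs between drivers (`qmark` for sqlite3, `pyformat` for psycopg2), so swapping the module name alone does not make the SQL strings portable.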
Connecting
    # libpq-like connection string
    dbconn = psycopg2.connect('dbname=dellstore2 host=localhost port=5432')
    # same
    dbconn = psycopg2.connect(dsn='dbname=dellstore2 host=localhost port=5432')
    # keyword arguments
    # (not all possible libpq options supported)
    dbconn = psycopg2.connect(database='dellstore2', host='localhost', port='5432')
DB-API 2.0 says: arguments database dependent
“Cursors”
    cursor = dbconn.cursor()
• not a real database cursor, only an API abstraction
• think “statement handle”
Server-Side Cursors
    cursor = dbconn.cursor(name='mycursor')
• a real database cursor
• use for large result sets
Executing
    # queries
    cursor.execute("""
        SELECT firstname, lastname
        FROM customers
        ORDER BY 1, 2
        LIMIT 10
    """)
    # updates
    cursor.execute("UPDATE customers SET password = NULL")
    print "%d rows updated" % cursor.rowcount
    # or anything else
    cursor.execute("ANALYZE customers")
Fetching Query Results
    cursor.execute("SELECT firstname, lastname FROM ...")
    cursor.fetchall()
    [('AABBKO', 'DUTOFRPLOK'),
     ('AABTSI', 'ZFCKMPRVVJ'),
     ('AACOHS', 'EECCQPVTIW'),
     ('AACVVO', 'CLSXSGZYKS'),
     ('AADVMN', 'MEMQEWYFYE'),
     ('AADXQD', 'GLEKVVLZFV'),
     ('AAEBUG', 'YUOIINRJGE')]
Fetching Query Results
    cursor.execute("SELECT firstname, lastname FROM ...")
    for row in cursor.fetchall():
        print "Name: %s %s" % (row[0], row[1])
Note: field access only by number
Fetching Query Results
    cursor.execute("SELECT firstname, lastname FROM ...")
    row = cursor.fetchone()
    if row is not None:
        print "Name: %s %s" % (row[0], row[1])
Fetching Query Results
    cursor.execute("SELECT firstname, lastname FROM ...")
    for row in cursor:
        print "Name: %s %s" % (row[0], row[1])
Fetching Query Results in Batches
    cursor = dbconn.cursor(name='mycursor')
    cursor.arraysize = 500  # default: 1
    cursor.execute("SELECT firstname, lastname FROM ...")
    while True:
        batch = cursor.fetchmany()
        if not batch:
            break
        for row in batch:
            print "Name: %s %s" % (row[0], row[1])
Fetching Query Results in Batches
    cursor = dbconn.cursor(name='mycursor')
    cursor.execute("SELECT firstname, lastname FROM ...")
    cursor.itersize = 2000  # default
    for row in cursor:
        print "Name: %s %s" % (row[0], row[1])
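The `fetchmany()` loop above can be wrapped in a generator so that calling code iterates over rows without seeing the batching. This is a sketch in Python 3, not part of Psycopg: the helper name `iter_rows` is hypothetical, and an in-memory `sqlite3` database (with `?` placeholders) stands in for PostgreSQL so the example runs on its own.

```python
import sqlite3

def iter_rows(cursor, batch_size=500):
    """Yield rows one at a time, fetching them in batches via fetchmany()."""
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        for row in batch:
            yield row

# Stand-in data set: 1200 rows, so three fetchmany() calls return data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (firstname TEXT, lastname TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [("A%d" % i, "B%d" % i) for i in range(1200)])

cursor = conn.cursor()
cursor.execute("SELECT firstname, lastname FROM customers ORDER BY rowid")
rows = list(iter_rows(cursor, batch_size=500))
print(len(rows))
```

With Psycopg the same generator should work on a named (server-side) cursor, where batching actually limits how much of the result set is held in client memory.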
Getting Query Metadata
    cursor.execute("SELECT DISTINCT state, zip FROM customers")
    print cursor.description[0].name
    print cursor.description[0].type_code
    print cursor.description[1].name
    print cursor.description[1].type_code
Output:
    state
    1043  # == psycopg2.STRING
    zip
    23    # == psycopg2.NUMBER
Passing Parameters
    cursor.execute("""
        UPDATE customers
        SET password = %s
        WHERE customerid = %s
    """, ["sekret", 37])
Passing Parameters
Not to be confused with (totally evil) string interpolation, which bypasses the driver's quoting:
    cursor.execute("""
        UPDATE customers
        SET password = '%s'
        WHERE customerid = %d
    """ % ("sekret", 37))
Passing Parameters
    cursor.execute("INSERT INTO foo VALUES (%s)", "bar")     # WRONG
    cursor.execute("INSERT INTO foo VALUES (%s)", ("bar"))   # WRONG
    cursor.execute("INSERT INTO foo VALUES (%s)", ("bar",))  # correct
    cursor.execute("INSERT INTO foo VALUES (%s)", ["bar"])   # correct
(from Psycopg documentation)
Passing Parameters
    cursor.execute("""
        UPDATE customers
        SET password = %(pw)s
        WHERE customerid = %(id)s
    """, {'id': 37, 'pw': "sekret"})
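To see why driver-side parameter passing matters, the following Python 3 sketch binds a hostile-looking string as a parameter and checks that it is stored as plain data. It is an illustration under assumptions: `sqlite3` stands in for PostgreSQL, so the placeholders are `?` rather than Psycopg's `%s`, and the table contents are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE customers (customerid INTEGER, password TEXT)")
cur.execute("INSERT INTO customers VALUES (37, 'old')")
cur.execute("INSERT INTO customers VALUES (38, 'other')")

# A value that would rewrite the statement if it were pasted into the SQL text.
hostile = "x' WHERE 1=1; --"

# Bound as a parameter, the value never becomes part of the SQL syntax.
cur.execute("UPDATE customers SET password = ? WHERE customerid = ?",
            (hostile, 37))

cur.execute("SELECT customerid, password FROM customers ORDER BY customerid")
result = cur.fetchall()
print(result)
```

Only customer 37 is changed, and the stored password is the literal string, quotes included.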
Passing Many Parameter Sets
    cursor.executemany("""
        UPDATE customers
        SET password = %s
        WHERE customerid = %s
    """, [["ahTh4oip", 100],
          ["Rexahho7", 101],
          ["Ee1aetui", 102]])
Calling Procedures
    # parameters are passed as a sequence
    cursor.callproc('pg_start_backup', ['label'])
Data Types
    from decimal import Decimal
    from psycopg2 import Date

    cursor.execute("""
        INSERT INTO orders (orderdate, customerid,
                            netamount, tax, totalamount)
        VALUES (%s, %s, %s, %s, %s)""",
        [Date(2011, 3, 23), 12345,
         Decimal("899.95"), 8.875, Decimal("979.82")])
Mogrify
    from decimal import Decimal
    from psycopg2 import Date

    cursor.mogrify("""
    INSERT INTO orders (orderdate, customerid,
      netamount, tax, totalamount)
    VALUES (%s, %s, %s, %s, %s)""",
        [Date(2011, 3, 23), 12345,
         Decimal("899.95"), 8.875, Decimal("979.82")])
Result:
    "\nINSERT INTO orders (orderdate, customerid,\n  netamount, tax, totalamount)\nVALUES ('2011-03-23'::date, 12345, 899.95, 8.875, 979.82)"
Data Types
    cursor.execute("""
        SELECT * FROM orders WHERE customerid = 12345
    """)
Result:
    (12002, datetime.date(2011, 3, 23), 12345,
     Decimal('899.95'), Decimal('8.88'), Decimal('979.82'))
Nulls
Input:
    cursor.mogrify("SELECT %s", [None])
    'SELECT NULL'
Output:
    cursor.execute("SELECT NULL")
    cursor.fetchone()
    (None,)
Booleans
    cursor.mogrify("SELECT %s, %s", [True, False])
    'SELECT true, false'
Binary Data
Standard way:
    from psycopg2 import Binary
    cursor.mogrify("SELECT %s", [Binary("foo")])
    "SELECT E'\\x666f6f'::bytea"
Other ways:
    cursor.mogrify("SELECT %s", [buffer("foo")])
    "SELECT E'\\x666f6f'::bytea"
    cursor.mogrify("SELECT %s", [bytearray.fromhex(u"deadbeef")])
    "SELECT E'\\xdeadbeef'::bytea"
There are more. Check the documentation. Check the versions.
Date/Time
Standard ways:
    from psycopg2 import Date, Time, Timestamp
    cursor.mogrify("SELECT %s, %s, %s",
        [Date(2011, 3, 23),
         Time(9, 0, 0),
         Timestamp(2011, 3, 23, 9, 0, 0)])
    "SELECT '2011-03-23'::date, '09:00:00'::time, '2011-03-23T09:00:00'::timestamp"
Date/Time
Other ways:
    import datetime
    cursor.mogrify("SELECT %s, %s, %s, %s",
        [datetime.date(2011, 3, 23),
         datetime.time(9, 0, 0),
         datetime.datetime(2011, 3, 23, 9, 0),
         datetime.timedelta(minutes=90)])
    "SELECT '2011-03-23'::date, '09:00:00'::time, '2011-03-23T09:00:00'::timestamp, '0 days 5400.000000 seconds'::interval"
mx.DateTime is also supported.
Arrays
    foo = [1, 2, 3]
    bar = [datetime.time(9, 0), datetime.time(10, 30)]
    cursor.mogrify("SELECT %s, %s", [foo, bar])
    "SELECT ARRAY[1, 2, 3], ARRAY['09:00:00'::time, '10:30:00'::time]"
Tuples
    foo = (1, 2, 3)
    cursor.mogrify("SELECT * FROM customers WHERE customerid IN %s", [foo])
    'SELECT * FROM customers WHERE customerid IN (1, 2, 3)'
Hstore
    import psycopg2.extras
    psycopg2.extras.register_hstore(cursor)
    x = {'a': 'foo', 'b': 'bar'}
    cursor.mogrify("SELECT %s", [x])
    "SELECT hstore(ARRAY[E'a', E'b'], ARRAY[E'foo', E'bar'])"
Unicode Support
Cause all result strings to be returned as Unicode strings:
    psycopg2.extensions.register_type(psycopg2.extensions.UNICODE)
    psycopg2.extensions.register_type(psycopg2.extensions.UNICODEARRAY)
Transaction Control
Transaction blocks are used by default. Must use
    dbconn.commit()
or
    dbconn.rollback()
Transaction Control: Autocommit
    import psycopg2.extensions
    dbconn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)
    cursor = dbconn.cursor()
    cursor.execute("VACUUM")
Transaction Control: Isolation Mode
    import psycopg2.extensions
    dbconn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_SERIALIZABLE)  # or other level
    cursor = dbconn.cursor()
    cursor.execute(...)
    ...
    dbconn.commit()
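The default transaction-block behavior described above can be observed with any DB-API driver. The Python 3 sketch below uses an in-memory `sqlite3` database as a stand-in, so it is an approximation: sqlite3's implicit transaction rules differ in detail from Psycopg's, and the helper `count_null_passwords` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customerid INTEGER, password TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'secret')")
conn.commit()  # make the initial row permanent

def count_null_passwords(connection):
    cur = connection.cursor()
    cur.execute("SELECT count(*) FROM customers WHERE password IS NULL")
    return cur.fetchone()[0]

cur = conn.cursor()
cur.execute("UPDATE customers SET password = NULL")  # opens a transaction
in_transaction = count_null_passwords(conn)  # change is visible on this connection
conn.rollback()                              # discard the uncommitted change
after_rollback = count_null_passwords(conn)

print(in_transaction, after_rollback)
```

Forgetting `commit()` has the same effect as the explicit `rollback()` here once the connection is closed, which is a common source of "my update disappeared" reports.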
Exception Handling
    StandardError
    |__ Warning
    |__ Error
        |__ InterfaceError
        |__ DatabaseError
            |__ DataError
            |__ OperationalError
            |   |__ psycopg2.extensions.QueryCanceledError
            |   |__ psycopg2.extensions.TransactionRollbackError
            |__ IntegrityError
            |__ InternalError
            |__ ProgrammingError
            |__ NotSupportedError
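Because the hierarchy is defined by DB-API 2.0, catching a parent class also catches its subclasses. A small Python 3 sketch, assuming `sqlite3` as a stand-in driver (it exposes the same standard class names, though not the psycopg2.extensions additions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customerid INTEGER PRIMARY KEY)")
conn.execute("INSERT INTO customers VALUES (1)")

try:
    conn.execute("INSERT INTO customers VALUES (1)")  # duplicate key
    caught = None
except sqlite3.DatabaseError as e:
    # Catching the parent class also catches IntegrityError.
    caught = type(e).__name__

print(caught)
```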
Error Messages
    try:
        cursor.execute("boom")
    except psycopg2.Error, e:
        print e.pgerror
Error Codes
    import psycopg2
    import psycopg2.errorcodes

    while True:
        try:
            cursor.execute("UPDATE something ...")
            cursor.execute("UPDATE otherthing ...")
            break
        except psycopg2.Error, e:
            if e.pgcode == psycopg2.errorcodes.SERIALIZATION_FAILURE:
                dbconn.rollback()  # end the aborted transaction before retrying
                continue
            else:
                raise
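The retry loop above can be factored into a helper with an upper bound on attempts. The following Python 3 sketch runs without a database: `FakeDatabaseError` is a hypothetical stand-in for `psycopg2.Error`, `run_with_retry` is an illustrative name, and the constant mirrors SQLSTATE 40001, the value behind `psycopg2.errorcodes.SERIALIZATION_FAILURE`.

```python
SERIALIZATION_FAILURE = "40001"  # SQLSTATE for serialization_failure

class FakeDatabaseError(Exception):
    """Stand-in for psycopg2.Error, carrying a pgcode attribute."""
    def __init__(self, pgcode):
        Exception.__init__(self, "SQLSTATE %s" % pgcode)
        self.pgcode = pgcode

def run_with_retry(work, rollback, max_attempts=5,
                   error_class=FakeDatabaseError):
    """Call work(); on a serialization failure roll back and try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return work()
        except error_class as e:
            rollback()  # the failed transaction must be ended either way
            if e.pgcode != SERIALIZATION_FAILURE or attempt == max_attempts:
                raise

# Simulated workload: fails twice with a serialization failure, then succeeds.
attempts = []
rollbacks = []

def flaky_update():
    attempts.append(1)
    if len(attempts) < 3:
        raise FakeDatabaseError(SERIALIZATION_FAILURE)
    return "committed"

result = run_with_retry(flaky_update, lambda: rollbacks.append(1))
print(result, len(attempts), len(rollbacks))
```

With Psycopg, one would presumably pass `error_class=psycopg2.Error` and `rollback=dbconn.rollback`, and have `work` end with `dbconn.commit()`.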
Connection and Cursor Factories
Want: accessing result columns by name
Recall:
    dbconn = psycopg2.connect(dsn='...')
    cursor = dbconn.cursor()
    cursor.execute("""
        SELECT firstname, lastname
        FROM customers
        ORDER BY 1, 2
        LIMIT 10
    """)
    for row in cursor.fetchall():
        print "Name: %s %s" % (row[0], row[1])  # stupid :(
Connection and Cursor Factories
Solution 1: Using DictConnection:
    import psycopg2.extras
    dbconn = psycopg2.connect(dsn='...',
        connection_factory=psycopg2.extras.DictConnection)
    cursor = dbconn.cursor()
    cursor.execute("""
        SELECT firstname, lastname
        FROM customers
        ORDER BY 1, 2
        LIMIT 10
    """)
    for row in cursor.fetchall():
        print "Name: %s %s" % (row['firstname'],  # or row[0]
                               row['lastname'])   # or row[1]
Connection and Cursor Factories
Solution 2: Using RealDictConnection:
    import psycopg2.extras
    dbconn = psycopg2.connect(dsn='...',
        connection_factory=psycopg2.extras.RealDictConnection)
    cursor = dbconn.cursor()
    cursor.execute("""
        SELECT firstname, lastname
        FROM customers
        ORDER BY 1, 2
        LIMIT 10
    """)
    for row in cursor.fetchall():
        print "Name: %s %s" % (row['firstname'],
                               row['lastname'])
Connection and Cursor Factories
Solution 3: Using NamedTupleConnection:
    import psycopg2.extras
    dbconn = psycopg2.connect(dsn='...',
        connection_factory=psycopg2.extras.NamedTupleConnection)
    cursor = dbconn.cursor()
    cursor.execute("""
        SELECT firstname, lastname
        FROM customers
        ORDER BY 1, 2
        LIMIT 10
    """)
    for row in cursor.fetchall():
        print "Name: %s %s" % (row.firstname,  # or row[0]
                               row.lastname)   # or row[1]
Connection and Cursor Factories
Alternative: Using DictCursor/RealDictCursor/NamedTupleCursor:
    import psycopg2.extras
    dbconn = psycopg2.connect(dsn='...')
    # pick one of DictCursor, RealDictCursor, NamedTupleCursor
    cursor = dbconn.cursor(cursor_factory=psycopg2.extras.DictCursor)
    cursor.execute("""
        SELECT firstname, lastname
        FROM customers
        ORDER BY 1, 2
        LIMIT 10
    """)
    for row in cursor.fetchall():
        print "Name: %s %s" % (row['firstname'],
                               row['lastname'])
        # (resp. row.firstname, row.lastname with NamedTupleCursor)
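For drivers without such factories, named access can be approximated from `cursor.description`, which DB-API 2.0 defines for every driver. This Python 3 sketch is not how Psycopg implements its cursors; `fetch_named` is a hypothetical helper, `sqlite3` is a stand-in driver, and it assumes column names are valid Python identifiers.

```python
import sqlite3
from collections import namedtuple

def fetch_named(cursor):
    """Yield each remaining row as a namedtuple keyed by column name."""
    # description is a sequence of 7-item entries; item 0 is the column name.
    Row = namedtuple("Row", [col[0] for col in cursor.description])
    for row in cursor:
        yield Row(*row)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (firstname TEXT, lastname TEXT)")
conn.execute("INSERT INTO customers VALUES ('Ada', 'Lovelace')")

cursor = conn.cursor()
cursor.execute("SELECT firstname, lastname FROM customers")
people = list(fetch_named(cursor))
print("Name: %s %s" % (people[0].firstname, people[0].lastname))
```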
Supporting New Data Types
Only a finite list of types is supported by default: Date, Binary, etc.
• map new PostgreSQL data types into Python
• map new Python data types into PostgreSQL
Mapping New PostgreSQL Types Into Python
    import psycopg2
    import psycopg2.extensions

    def cast_oidvector(value, _cursor):
        """Convert oidvector to Python array"""
        if value is None:
            return None
        return map(int, value.split(' '))

    OIDVECTOR = psycopg2.extensions.new_type((30,), 'OIDVECTOR', cast_oidvector)
    psycopg2.extensions.register_type(OIDVECTOR)
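The cast function is plain Python, so it can be exercised without a database. Below is a Python 3 variant as a sketch: it builds a list explicitly because `map()` returns an iterator in Python 3, and it uses `split()` without an argument so that an empty string yields an empty list (a deliberate deviation from the slide, not Psycopg behavior).

```python
def cast_oidvector(value, _cursor=None):
    """Convert the text form of an oidvector, e.g. '23 25 1043', to a list of ints."""
    if value is None:  # SQL NULL arrives as None
        return None
    return [int(oid) for oid in value.split()]

print(cast_oidvector("23 25 1043"))
```

Registration with `new_type()` and `register_type()` would stay as shown on the slide.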
Mapping New Python Types into PostgreSQL
    from psycopg2.extensions import adapt, register_adapter, AsIs

    class Point(object):
        def __init__(self, x, y):
            self.x = x
            self.y = y

    def adapt_point(point):
        return AsIs("'(%s, %s)'" % (adapt(point.x), adapt(point.y)))

    register_adapter(Point, adapt_point)

    cur.execute("INSERT INTO atable (apoint) VALUES (%s)",
                (Point(1.23, 4.56),))
(from Psycopg documentation)
Connection Pooling With Psycopg
for non-threaded applications:
    from psycopg2.pool import SimpleConnectionPool
    pool = SimpleConnectionPool(1, 20, dsn='...')
    dbconn = pool.getconn()
    ...
    pool.putconn(dbconn)
    pool.closeall()
for threaded applications:
    from psycopg2.pool import ThreadedConnectionPool
    pool = ThreadedConnectionPool(1, 20, dsn='...')
    dbconn = pool.getconn()
    cursor = dbconn.cursor()
    ...
    pool.putconn(dbconn)
    pool.closeall()
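If the code between `getconn()` and `putconn()` raises, the connection is never returned to the pool. One way to guard against that is a small context manager, sketched here in Python 3. `pooled_connection` is a hypothetical helper, not a Psycopg API, and `FakePool` is a stand-in exposing only the two methods used, so the example runs without a server.

```python
from contextlib import contextmanager

@contextmanager
def pooled_connection(pool):
    """Borrow a connection and always hand it back, even on errors."""
    conn = pool.getconn()
    try:
        yield conn
    finally:
        pool.putconn(conn)

class FakePool(object):
    """Stand-in with the getconn()/putconn() interface shown above."""
    def __init__(self):
        self.handed_out = 0
        self.returned = []
    def getconn(self):
        self.handed_out += 1
        return "conn%d" % self.handed_out
    def putconn(self, conn):
        self.returned.append(conn)

pool = FakePool()
try:
    with pooled_connection(pool) as dbconn:
        raise RuntimeError("query failed")  # simulated failure
except RuntimeError:
    pass

print(pool.returned)
```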
Connection Pooling With DBUtils
    import psycopg2
    from DBUtils.PersistentDB import PersistentDB
    persist = PersistentDB(psycopg2, dsn='...')
    dbconn = persist.connection()
    cursor = dbconn.cursor()
    ...
see http://pypi.python.org/pypi/DBUtils/
The Other Stuff
• thread safety: can share connections, but not cursors
• COPY support: cursor.copy_from(), cursor.copy_to()
• large object support: connection.lobject()
• 2PC: connection.xid(), connection.tpc_begin(), ...
• query cancel: dbconn.cancel()
• notices: dbconn.notices
• notifications: dbconn.notifies
• asynchronous communication
• coroutine support
• logging cursor
Part II PL/Python
Setup
• included with PostgreSQL
• configure --with-python
• apt-get/yum install postgresql-plpython
• CREATE LANGUAGE plpythonu;
• Python 3: CREATE LANGUAGE plpython3u;
• “untrusted”, superuser only
Basic Examples
    CREATE FUNCTION add(a int, b int) RETURNS int
    LANGUAGE plpythonu
    AS $$
    return a + b
    $$;

    CREATE FUNCTION longest(a text, b text) RETURNS text
    LANGUAGE plpythonu
    AS $$
    if len(a) > len(b):
        return a
    elif len(b) > len(a):
        return b
    else:
        return None
    $$;
Using Modules
    CREATE FUNCTION json_to_array(j text) RETURNS text[]
    LANGUAGE plpythonu
    AS $$
    import json
    return json.loads(j)
    $$;
Database Calls
    CREATE FUNCTION clear_passwords() RETURNS int
    LANGUAGE plpythonu
    AS $$
    rv = plpy.execute("UPDATE customers SET password = NULL")
    return rv.nrows
    $$;
Database Calls With Parameters
    CREATE FUNCTION set_password(username text, password text) RETURNS boolean
    LANGUAGE plpythonu
    AS $$
    plan = plpy.prepare("UPDATE customers SET password = $1 WHERE username = $2",
                        ['text', 'text'])
    # argument order must match $1 (password), $2 (username)
    rv = plpy.execute(plan, [password, username])
    return rv.nrows == 1
    $$;
Avoiding Prepared Statements
    CREATE FUNCTION set_password(username text, password text) RETURNS boolean
    LANGUAGE plpythonu
    AS $$
    rv = plpy.execute("UPDATE customers SET password = %s WHERE username = %s" %
                      (plpy.quote_nullable(password), plpy.quote_literal(username)))
    return rv.nrows == 1
    $$;
(available in 9.1-to-be)
Caching Plans
    CREATE FUNCTION set_password2(username text, password text) RETURNS boolean
    LANGUAGE plpythonu
    AS $$
    if 'myplan' in SD:
        plan = SD['myplan']
    else:
        plan = plpy.prepare("UPDATE customers SET password = $1 WHERE username = $2",
                            ['text', 'text'])
        SD['myplan'] = plan
    rv = plpy.execute(plan, [password, username])
    return rv.nrows == 1
    $$;
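The caching pattern above is ordinary dictionary memoization, which can be tried outside the database. In this Python 3 sketch, a plain dict plays the role of PL/Python's `SD`, and `fake_prepare` is a hypothetical stand-in for `plpy.prepare()`; neither touches PostgreSQL.

```python
def get_plan(cache, key, prepare):
    """Return the cached plan for key, preparing it only on first use."""
    if key not in cache:
        cache[key] = prepare()
    return cache[key]

SD = {}        # stand-in for PL/Python's per-function dictionary
prepared = []  # records how often the fake prepare step runs

def fake_prepare():
    prepared.append("UPDATE customers SET password = $1 WHERE username = $2")
    return object()  # opaque plan handle

first = get_plan(SD, "myplan", fake_prepare)
second = get_plan(SD, "myplan", fake_prepare)
print(first is second, len(prepared))
```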
Processing Query Results
    CREATE FUNCTION get_customer_name(username text) RETURNS text
    LANGUAGE plpythonu
    AS $$
    plan = plpy.prepare("SELECT firstname || ' ' || lastname AS name FROM customers WHERE username = $1",
                        ['text'])
    rv = plpy.execute(plan, [username], 1)
    return rv[0]['name']
    $$;
Compare: PL/Python vs. DB-API
PL/Python:
    plan = plpy.prepare("SELECT ...")
    for row in plpy.execute(plan, ...):
        plpy.info(row["fieldname"])
DB-API:
    dbconn = psycopg2.connect(...)
    cursor = dbconn.cursor()
    cursor.execute("SELECT ...")
    for row in cursor.fetchall():
        print row[0]
Set-Returning and Table Functions
    CREATE FUNCTION get_customers(id int) RETURNS SETOF customers
    LANGUAGE plpythonu
    AS $$
    plan = plpy.prepare("SELECT * FROM customers WHERE customerid = $1", ['int'])
    rv = plpy.execute(plan, [id])
    return rv
    $$;
Triggers
    CREATE FUNCTION delete_notifier() RETURNS trigger
    LANGUAGE plpythonu
    AS $$
    if TD['event'] == 'DELETE':
        plpy.notice("one row deleted from table %s" % TD['table_name'])
    $$;

    CREATE TRIGGER customers_delete_notifier
        AFTER DELETE ON customers
        FOR EACH ROW EXECUTE PROCEDURE delete_notifier();
Exceptions
    CREATE FUNCTION test() RETURNS text
    LANGUAGE plpythonu
    AS $$
    try:
        rv = plpy.execute("SELECT ...")
    except plpy.SPIError, e:
        plpy.notice("something went wrong")
    $$;
In versions before 9.1, the transaction is still aborted.
New in PostgreSQL 9.1
• SPI calls wrapped in subtransactions
• custom SPI exceptions: subclass per SQLSTATE, .sqlstate attribute
• plpy.subtransaction() context manager
• support for OUT parameters
• quoting functions
• validator
• lots of internal improvements
The End

Programming with Python and PostgreSQL

  • 1.
    Programming with Python and PostgreSQL Peter Eisentraut peter@eisentraut.org F-Secure Corporation PostgreSQL Conference East 2011 CC-BY
  • 2.
    Partitioning • Part I: Client programming (60 min) • Part II: PL/Python (30 min)
  • 3.
  • 4.
    Why Python? Pros: • widely used • easy • strong typing • scripting, interactive use • good PostgreSQL support • client and server (PL) interfaces • open source, community-based
  • 5.
    Why Python? Pros: • widely used • easy • strong typing • scripting, interactive use • good PostgreSQL support • client and server (PL) interfaces • open source, community-based Pros: • no static syntax checks, must rely on test coverage • Python community has varying interest in RDBMS
  • 6.
  • 7.
    Example import psycopg2 dbconn = psycopg2.connect('dbname=dellstore2') cursor = dbconn.cursor() cursor.execute(""" SELECT firstname, lastname FROM customers ORDER BY 1, 2 LIMIT 10 """) for row in cursor.fetchall(): print "Name: %s %s" % (row[0], row[1]) cursor.close() db.close()
  • 8.
    Drivers Name License Platforms Py Versions Psycopg LGPL Unix, Win 2.4–3.2 PyGreSQL BSD Unix, Win 2.3–2.6 ocpgdb BSD Unix 2.3–2.6 py-postgresql BSD pure Python 3.0+ bpgsql (alpha) LGPL pure Python 2.3–2.6 pg8000 BSD pure Python 2.5–3.0+
  • 9.
    Drivers Name License Platforms Py Versions Psycopg LGPL Unix, Win 2.4–3.2 PyGreSQL BSD Unix, Win 2.3–2.6 ocpgdb BSD Unix 2.3–2.6 py-postgresql BSD pure Python 3.0+ bpgsql (alpha) LGPL pure Python 2.3–2.6 pg8000 BSD pure Python 2.5–3.0+ More details • http://wiki.postgresql.org/wiki/Python • http://wiki.python.org/moin/PostgreSQL
  • 10.
    DB-API 2.0 • the standard Python database API • all mentioned drivers support it • defined in PEP 249 • discussions: db-sig@python.org • very elementary (from a PostgreSQL perspective) • outdated relative to Python language development • lots of extensions and incompatibilities possible
  • 11.
    Higher-Level Interfaces • Zope • SQLAlchemy • Django
  • 12.
    Psycopg Facts • Main authors: Federico Di Gregorio, Daniele Varrazzo • License: LGPLv3+ • Web site: http://initd.org/psycopg/ • Documentation: http://initd.org/psycopg/docs/ • Git, Gitweb • Mailing list: psycopg@postgresql.org • Twitter: @psycopg • Latest version: 2.4 (February 27, 2011)
  • 13.
    Using the Driver import psycopg2 dbconn = psycopg2.connect(...) ...
  • 14.
    Driver Independence? importpsycopg2 dbconn = psycopg2.connect(...) # hardcodes driver name
  • 15.
    Driver Independence? importpsycopg2 as dbdriver dbconn = dbdriver.connect(...)
  • 16.
    Driver Independence? dbtype= 'psycopg2' # e.g. from config file dbdriver = __import__(dbtype, globals(), locals(), [], -1) dbconn = dbdriver.connect(...)
  • 17.
    Connecting # libpq-likeconnection string dbconn = psycopg2.connect('dbname=dellstore2 host=localhost port=5432') # same dbconn = psycopg2.connect(dsn='dbname=dellstore2 host=localhost port=5432') # keyword arguments # (not all possible libpq options supported) dbconn = psycopg2.connect(database='dellstore2', host='localhost', port='5432') DB-API 2.0 says: arguments database dependent
  • 18.
    “Cursors” cursor= dbconn.cursor() • not a real database cursor, only an API abstraction • think “statement handle”
  • 19.
    Server-Side Cursors cursor = dbconn.cursor(name='mycursor') • a real database cursor • use for large result sets
  • 20.
    Executing # queries cursor.execute(""" SELECT firstname, lastname FROM customers ORDER BY 1, 2 LIMIT 10 """) # updates cursor.execute("UPDATE customers SET password = NULL") print "%d rows updated" % cursor.rowcount # or anything else cursor.execute("ANALYZE customers")
  • 21.
    Fetching Query Results cursor.execute("SELECT firstname, lastname FROM ...") cursor.fetchall() [('AABBKO', 'DUTOFRPLOK'), ('AABTSI', 'ZFCKMPRVVJ'), ('AACOHS', 'EECCQPVTIW'), ('AACVVO', 'CLSXSGZYKS'), ('AADVMN', 'MEMQEWYFYE'), ('AADXQD', 'GLEKVVLZFV'), ('AAEBUG', 'YUOIINRJGE')]
  • 22.
    Fetching Query Results cursor.execute("SELECT firstname, lastname FROM ...") for row in cursor.fetchall(): print "Name: %s %s" % (row[0], row[1])
  • 23.
    Fetching Query Results cursor.execute("SELECT firstname, lastname FROM ...") for row in cursor.fetchall(): print "Name: %s %s" % (row[0], row[1]) Note: field access only by number
  • 24.
    Fetching Query Results cursor.execute("SELECT firstname, lastname FROM ...") row = cursor.fetchone() if row is not None: print "Name: %s %s" % (row[0], row[1])
  • 25.
    Fetching Query Results cursor.execute("SELECT firstname, lastname FROM ...") for row in cursor: print "Name: %s %s" % (row[0], row[1])
  • 26.
    Fetching Query Resultsin Batches cursor = dbconn.cursor(name='mycursor') cursor.arraysize = 500 # default: 1 cursor.execute("SELECT firstname, lastname FROM ...") while True: batch = cursor.fetchmany() break if not batch for row in batch: print "Name: %s %s" % (row[0], row[1])
  • 27.
    Fetching Query Resultsin Batches cursor = dbconn.cursor(name='mycursor') cursor.execute("SELECT firstname, lastname FROM ...") cursor.itersize = 2000 # default for row in cursor: print "Name: %s %s" % (row[0], row[1])
  • 28.
    Getting Query Metadata cursor.execute("SELECT DISTINCT state, zip FROM customers") print cursor.description[0].name print cursor.description[0].type_code print cursor.description[1].name print cursor.description[1].type_code state 1043 # == psycopg2.STRING zip 23 # == psycopg2.NUMBER
  • 29.
    Passing Parameters cursor.execute(""" UPDATE customers SET password = %s WHERE customerid = %s """, ["sekret", 37])
  • 30.
    Passing Parameters Notto be confused with (totally evil): cursor.execute(""" UPDATE customers SET password = '%s' WHERE customerid = %d """ % ["sekret", 37])
  • 31.
    Passing Parameters cursor.execute("INSERTINTO foo VALUES (%s)", "bar") # WRONG cursor.execute("INSERT INTO foo VALUES (%s)", ("bar")) # WRONG cursor.execute("INSERT INTO foo VALUES (%s)", ("bar",)) # correct cursor.execute("INSERT INTO foo VALUES (%s)", ["bar"]) # correct (from Psycopg documentation)
  • 32.
    Passing Parameters cursor.execute(""" UPDATE customers SET password = %(pw)s WHERE customerid = %(id)s """, {'id': 37, 'pw': "sekret"})
  • 33.
    Passing Many ParameterSets cursor.executemany(""" UPDATE customers SET password = %s WHERE customerid = %s """, [["ahTh4oip", 100], ["Rexahho7", 101], ["Ee1aetui", 102]])
  • 34.
    Calling Procedures cursor.callproc('pg_start_backup', 'label')
  • 35.
    Data Types fromdecimal import Decimal from psycopg2 import Date cursor.execute(""" INSERT INTO orders (orderdate, customerid, netamount, tax, totalamount) VALUES (%s, %s, %s, %s, %s)""", [Date(2011, 03, 23), 12345, Decimal("899.95"), 8.875, Decimal("979.82")])
  • 36.
    Mogrify fromdecimal import Decimal from psycopg2 import Date cursor.mogrify(""" INSERT INTO orders (orderdate, customerid, netamount, tax, totalamount) VALUES (%s, %s, %s, %s, %s)""", [Date(2011, 03, 23), 12345, Decimal("899.95"), 8.875, Decimal("979.82")]) Result: "nINSERT INTO orders (orderdate, customerid,n netamount, tax, totalamount)nVALUES ('2011-03-23'::date, 12345, 899.95, 8.875, 979.82)"
  • 37.
    Data Types cursor.execute(""" SELECT * FROM orders WHERE customerid = 12345 """) Result: (12002, datetime.date(2011, 3, 23), 12345, Decimal('899.95'), Decimal('8.88'), Decimal('979.82'))
  • 38.
    Nulls Input: cursor.mogrify("SELECT %s", [None]) 'SELECT NULL' Output: cursor.execute("SELECT NULL") cursor.fetchone() (None,)
  • 39.
    Booleans cursor.mogrify("SELECT %s,%s", [True, False]) 'SELECT true, false'
  • 40.
    Binary Data Standard way: from psycopg2 import Binary cursor.mogrify("SELECT %s", [Binary("foo")]) "SELECT E'x666f6f'::bytea"
  • 41.
    Binary Data Standard way: from psycopg2 import Binary cursor.mogrify("SELECT %s", [Binary("foo")]) "SELECT E'x666f6f'::bytea" Other ways: cursor.mogrify("SELECT %s", [buffer("foo")]) "SELECT E'x666f6f'::bytea" cursor.mogrify("SELECT %s", [bytearray.fromhex(u"deadbeef")]) "SELECT E'xdeadbeef'::bytea" There are more. Check the documentation. Check the versions.
  • 42.
    Date/Time Standard ways: from psycopg2 import Date, Time, Timestamp cursor.mogrify("SELECT %s, %s, %s", [Date(2011, 3, 23), Time(9, 0, 0), Timestamp(2011, 3, 23, 9, 0, 0)]) "SELECT '2011-03-23'::date, '09:00:00'::time, '2011-03-23T09:00:00'::timestamp"
  • 43.
    Date/Time Other ways: import datetime cursor.mogrify("SELECT %s, %s, %s, %s", [datetime.date(2011, 3, 23), datetime.time(9, 0, 0), datetime.datetime(2011, 3, 23, 9, 0), datetime.timedelta(minutes=90)]) "SELECT '2011-03-23'::date, '09:00:00'::time, '2011-03-23T09:00:00'::timestamp, '0 days 5400.000000 seconds'::interval" mx.DateTime also supported
  • 44.
    Arrays foo= [1, 2, 3] bar = [datetime.time(9, 0), datetime.time(10, 30)] cursor.mogrify("SELECT %s, %s", [foo, bar]) "SELECT ARRAY[1, 2, 3], ARRAY['09:00:00'::time, '10:30:00'::time]"
  • 45.
    Tuples foo =(1, 2, 3) cursor.mogrify("SELECT * FROM customers WHERE customerid IN %s", [foo]) 'SELECT * FROM customers WHERE customerid IN (1, 2, 3)'
  • 46.
    Hstore import psycopg2.extras psycopg2.extras.register_hstore(cursor) x = {'a': 'foo', 'b': 'bar'} cursor.mogrify("SELECT %s", [x]) "SELECT hstore(ARRAY[E'a', E'b'], ARRAY[E'foo', E'bar'])"
  • 47.
    Unicode Support Causeall result strings to be returned as Unicode strings: psycopg2.extensions.register_type(psycopg2.extensions. UNICODE) psycopg2.extensions.register_type(psycopg2.extensions. UNICODEARRAY)
  • 48.
    Transaction Control Transaction blocks are used by default. Must use dbconn.commit() or dbconn.rollback()
  • 49.
    Transaction Control: Autocommit import psycopg2.extensions dbconn.set_isolation_level(psycopg2.extensions. ISOLATION_LEVEL_AUTOCOMMIT) cursor = dbconn.cursor() cursor.execute("VACUUM")
  • 50.
    Transaction Control: IsolationMode import psycopg2.extensions dbconn.set_isolation_level(psycopg2.extensions. ISOLATION_LEVEL_SERIALIZABLE) # or other level cursor = dbconn.cursor() cursor.execute(...) ... dbconn.commit()
  • 51.
    Exception Handling StandardError |__ Warning |__ Error |__ InterfaceError |__ DatabaseError |__ DataError |__ OperationalError | |__ psycopg2.extensions.QueryCanceledError | |__ psycopg2.extensions.TransactionRollbackError |__ IntegrityError |__ InternalError |__ ProgrammingError |__ NotSupportedError
  • 52.
    Error Messages try: cursor.execute("boom") except Exception, e: print e.pgerror
  • 53.
    Error Codes importpsycopg2.errorcodes while True: try: cursor.execute("UPDATE something ...") cursor.execute("UPDATE otherthing ...") break except Exception, e: if e.pgcode == psycopg2.errorcodes.SERIALIZATION_FAILURE: continue else: raise
  • 54.
    Connection and CursorFactories Want: accessing result columns by name Recall: dbconn = psycopg2.connect(dsn='...') cursor = dbconn.cursor() cursor.execute(""" SELECT firstname, lastname FROM customers ORDER BY 1, 2 LIMIT 10 """) for row in cursor.fetchall(): print "Name: %s %s" % (row[0], row[1]) # stupid :(
  • 55.
    Connection and CursorFactories Solution 1: Using DictConnection: import psycopg2.extras dbconn = psycopg2.connect(dsn='...', connection_factory=psycopg2.extras.DictConnection) cursor = dbconn.cursor() cursor.execute(""" SELECT firstname, lastname FROM customers ORDER BY 1, 2 LIMIT 10 """) for row in cursor.fetchall(): print "Name: %s %s" % (row['firstname'], # or row[0] row['lastname']) # or row[1]
  • 56.
    Connection and CursorFactories Solution 2: Using RealDictConnection: import psycopg2.extras dbconn = psycopg2.connect(dsn='...', connection_factory=psycopg2.extras.RealDictConnection) cursor = dbconn.cursor() cursor.execute(""" SELECT firstname, lastname FROM customers ORDER BY 1, 2 LIMIT 10 """) for row in cursor.fetchall(): print "Name: %s %s" % (row['firstname'], row['lastname'])
  • 57.
    Connection and CursorFactories Solution 3: Using NamedTupleConnection: import psycopg2.extras dbconn = psycopg2.connect(dsn='...', connection_factory=psycopg2.extras.NamedTupleConnection) cursor = dbconn.cursor() cursor.execute(""" SELECT firstname, lastname FROM customers ORDER BY 1, 2 LIMIT 10 """) for row in cursor.fetchall(): print "Name: %s %s" % (row.firstname, # or row[0] row.lastname) # or row[1]
  • 58.
    Connection and CursorFactories Alternative: Using DictCursor/RealDictCursor/NamedTupleCursor: import psycopg2.extras dbconn = psycopg2.connect(dsn='...') cursor = dbconn.cursor(cursor_factory=psycopg2.extras. DictCursor/RealDictCursor/NameTupleCursor) cursor.execute(""" SELECT firstname, lastname FROM customers ORDER BY 1, 2 LIMIT 10 """) for row in cursor.fetchall(): print "Name: %s %s" % (row['firstname'], row['lastname']) # (resp. row.firstname, row.lastname)
  • 59.
    Supporting New DataTypes Only a finite list of types is supported by default: Date, Binary, etc. • map new PostgreSQL data types into Python • map new Python data types into PostgreSQL
  • 60.
    Mapping New PostgreSQLTypes Into Python import psycopg2 import psycopg2.extensions def cast_oidvector(value, _cursor): """Convert oidvector to Python array""" if value is None: return None return map(int, value.split(' ')) OIDVECTOR = psycopg2.extensions.new_type((30,), 'OIDVECTOR', cast_oidvector) psycopg2.extensions.register_type(OIDVECTOR)
  • 61.
    Mapping New PythonTypes into PostgreSQL from psycopg2.extensions import adapt, register_adapter, AsIs class Point(object): def __init__(self, x, y): self.x = x self.y = y def adapt_point(point): return AsIs("'(%s, %s)'" % (adapt(point.x), adapt(point.y))) register_adapter(Point, adapt_point) cur.execute("INSERT INTO atable (apoint) VALUES (%s)", (Point(1.23, 4.56),)) (from Psycopg documentation)
  • 62.
    Connection Pooling WithPsycopg from psycopg2.pool import SimpleConnectionPool pool = SimpleConnectionPool(1, 20, dsn='...') dbconn = pool.getconn() ... pool.putconn(dbconn) pool.closeall()
  • 63.
    Connection Pooling WithPsycopg for non-threaded applications: from psycopg2.pool import SimpleConnectionPool pool = SimpleConnectionPool(1, 20, dsn='...') dbconn = pool.getconn() ... pool.putconn(dbconn) pool.closeall() for non-threaded applications: from psycopg2.pool import ThreadedConnectionPool pool = ThreadedConnectionPool(1, 20, dsn='...') dbconn = pool.getconn() cursor = dbconn.cursor() ... pool.putconn(dbconn) pool.closeall()
  • 64.
    Connection Pooling WithDBUtils import psycopg2 from DBUtils.PersistentDB import PersistentDB dbconn = PersistentDB(psycopg2, dsn='...') cursor = dbconn.cursor() ... see http://pypi.python.org/pypi/DBUtils/
  • 65.
    The Other Stuff • thread safety: can share connections, but not cursors • COPY support: cursor.copy_from(), cursor.copy_to() • large object support: connection.lobject() • 2PC: connection.xid(), connection.tpc_begin(), . . . • query cancel: dbconn.cancel() • notices: dbconn.notices • notifications: dbconn.notifies • asynchronous communication • coroutine support • logging cursor
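As an illustration of the COPY support, the sketch below (Python 3 syntax) builds an in-memory file in COPY text format and hands it to cursor.copy_from(). The helpers copy_escape, rows_to_copy_buffer and bulk_load are hypothetical names; the only Psycopg call is copy_from(), used with its default tab separator and \N null marker.

```python
import io


def copy_escape(value):
    """Render one value in COPY text format (None becomes \\N)."""
    if value is None:
        return "\\N"
    text = str(value)
    # Backslash first, so the escapes added afterwards are not escaped again.
    for raw, escaped in (("\\", "\\\\"), ("\t", "\\t"),
                         ("\n", "\\n"), ("\r", "\\r")):
        text = text.replace(raw, escaped)
    return text


def rows_to_copy_buffer(rows):
    """Build a file-like object of tab-separated lines for copy_from()."""
    lines = ("\t".join(copy_escape(v) for v in row) + "\n" for row in rows)
    return io.StringIO("".join(lines))


def bulk_load(cursor, table, columns, rows):
    """Load rows with one COPY instead of one INSERT per row."""
    cursor.copy_from(rows_to_copy_buffer(rows), table, columns=columns)


# Intended use, assuming an open cursor on the dellstore2 database:
#
#     bulk_load(cursor, "customers", ("firstname", "lastname"),
#               [("Ann", "Smith"), ("Bob", None)])
```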
  • 66.
  • 67.
    Setup • included with PostgreSQL • configure --with-python • apt-get/yum install postgresql-plpython • CREATE LANGUAGE plpythonu; • Python 3: CREATE LANGUAGE plpython3u; • “untrusted”, superuser only
  • 68.
    Basic Examples CREATE FUNCTION add(a int, b int) RETURNS int LANGUAGE plpythonu AS $$ return a + b $$; CREATE FUNCTION longest(a text, b text) RETURNS text LANGUAGE plpythonu AS $$ if len(a) > len(b): return a elif len(b) > len(a): return b else: return None $$;
  • 69.
    Using Modules CREATE FUNCTION json_to_array(j text) RETURNS text[] LANGUAGE plpythonu AS $$ import json return json.loads(j) $$;
  • 70.
    Database Calls CREATE FUNCTION clear_passwords() RETURNS int LANGUAGE plpythonu AS $$ rv = plpy.execute("UPDATE customers SET password = NULL") return rv.nrows() $$;
  • 71.
    Database Calls With Parameters CREATE FUNCTION set_password(username text, password text) RETURNS boolean LANGUAGE plpythonu AS $$ plan = plpy.prepare("UPDATE customers SET password = $1 WHERE username = $2", ['text', 'text']) rv = plpy.execute(plan, [password, username]) return rv.nrows() == 1 $$;
  • 72.
    Avoiding Prepared Statements CREATE FUNCTION set_password(username text, password text) RETURNS boolean LANGUAGE plpythonu AS $$ rv = plpy.execute("UPDATE customers SET password = %s WHERE username = %s" % (plpy.quote_nullable(password), plpy.quote_literal(username))) return rv.nrows() == 1 $$; (available in 9.1-to-be)
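To see why the password goes through quote_nullable while the username goes through quote_literal, it can help to look at what the two functions produce. The sketch below (Python 3 syntax) is a rough approximation of the server's quoting rules written for illustration only; sketch_quote_literal and sketch_quote_nullable are hypothetical names, and real code should call the plpy functions rather than this.

```python
def sketch_quote_literal(value):
    """Approximate quote_literal: double ' and \\, wrap in single quotes."""
    text = str(value)
    # Assumption: a literal containing backslashes gets the E'' prefix.
    prefix = "E" if "\\" in text else ""
    body = text.replace("\\", "\\\\").replace("'", "''")
    return "%s'%s'" % (prefix, body)


def sketch_quote_nullable(value):
    """Like the literal form, but None becomes the SQL keyword NULL."""
    if value is None:
        return "NULL"
    return sketch_quote_literal(value)


# A password of None yields "password = NULL", which is valid SQL, while
# the username is always rendered as a quoted string.
query = "UPDATE customers SET password = %s WHERE username = %s" % (
    sketch_quote_nullable(None), sketch_quote_literal("o'brien"))
```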
  • 73.
    Caching Plans CREATE FUNCTION set_password2(username text, password text) RETURNS boolean LANGUAGE plpythonu AS $$ if 'myplan' in SD: plan = SD['myplan'] else: plan = plpy.prepare("UPDATE customers SET password = $1 WHERE username = $2", ['text', 'text']) SD['myplan'] = plan rv = plpy.execute(plan, [password, username]) return rv.nrows() == 1 $$;
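The check-then-prepare logic is the same for every cached plan, so it can be factored into a helper. The sketch below is plain Python so it can be tried outside the server; cached_plan is a hypothetical name, and inside a PL/Python function one would pass the real SD dictionary and plpy.prepare.

```python
def cached_plan(SD, key, prepare, query, argtypes):
    """Return the plan stored under key in SD, preparing it on first use."""
    plan = SD.get(key)
    if plan is None:
        plan = prepare(query, argtypes)   # only reached on the first call
        SD[key] = plan
    return plan


# Inside a PL/Python function body this would be called roughly as:
#
#     plan = cached_plan(SD, 'myplan', plpy.prepare,
#                        "UPDATE customers SET password = $1 "
#                        "WHERE username = $2", ['text', 'text'])
```

Since SD is private to each function, the helper would have to be defined in the function body or imported from a module on the server's Python path.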
  • 74.
    Processing Query Results CREATE FUNCTION get_customer_name(username text) RETURNS text LANGUAGE plpythonu AS $$ plan = plpy.prepare("SELECT firstname || ' ' || lastname AS name FROM customers WHERE username = $1", ['text']) rv = plpy.execute(plan, [username], 1) return rv[0]['name'] $$;
  • 75.
    Compare: PL/Python vs. DB-API PL/Python: plan = plpy.prepare("SELECT ...") for row in plpy.execute(plan, ...): plpy.info(row["fieldname"]) DB-API: dbconn = psycopg2.connect(...) cursor = dbconn.cursor() cursor.execute("SELECT ...") for row in cursor.fetchall(): print row[0]
  • 76.
    Set-Returning and Table Functions CREATE FUNCTION get_customers(id int) RETURNS SETOF customers LANGUAGE plpythonu AS $$ plan = plpy.prepare("SELECT * FROM customers WHERE customerid = $1", ['int']) rv = plpy.execute(plan, [id]) return rv $$;
  • 77.
    Triggers CREATE FUNCTION delete_notifier() RETURNS trigger LANGUAGE plpythonu AS $$ if TD['event'] == 'DELETE': plpy.notice("one row deleted from table %s" % TD['table_name']) $$; CREATE TRIGGER customers_delete_notifier AFTER DELETE ON customers FOR EACH ROW EXECUTE PROCEDURE delete_notifier();
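A trigger can also change the row being written. In a BEFORE row-level trigger, PL/Python lets the function modify TD['new'] and return the string "MODIFY". The sketch below writes such a trigger body as an ordinary Python function so the logic can be tried with a plain dictionary; lowercase_username is a hypothetical name, and the username column is the one used in the earlier customers examples.

```python
def lowercase_username(TD):
    """Body of a hypothetical BEFORE INSERT OR UPDATE row-level trigger."""
    if TD["event"] not in ("INSERT", "UPDATE"):
        return None                 # None (or "OK") leaves the row as it is
    username = TD["new"]["username"]
    if username is None or username == username.lower():
        return None                 # nothing to change
    TD["new"]["username"] = username.lower()
    return "MODIFY"                 # tells PL/Python that TD["new"] changed


# Simulating what the server would pass in for an INSERT:
td = {"event": "INSERT", "new": {"username": "Alice", "password": None}}
result = lowercase_username(td)
```

In a real trigger function the same statements would sit directly between the $$ markers, with TD supplied by the server.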
  • 78.
    Exceptions CREATE FUNCTION test() RETURNS text LANGUAGE plpythonu AS $$ try: rv = plpy.execute("SELECT ...") except plpy.SPIError, e: plpy.notice("something went wrong") $$; The transaction is still aborted in < 9.1.
  • 79.
    New in PostgreSQL 9.1 • SPI calls wrapped in subtransactions • custom SPI exceptions: subclass per SQLSTATE, .sqlstate attribute • plpy.subtransaction() context manager • support for OUT parameters • quoting functions • validator • lots of internal improvements
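The plpy.subtransaction() context manager groups several database calls so that they succeed or fail together. The sketch below shows the shape of such code; to keep it runnable outside the server, the plpy module is passed in as a parameter, which is an assumption of this example (inside PL/Python, plpy is simply available). clear_passwords_for is a hypothetical name.

```python
def clear_passwords_for(plpy, usernames):
    """Clear several passwords atomically; return True on success."""
    plan = plpy.prepare(
        "UPDATE customers SET password = NULL WHERE username = $1", ["text"])
    try:
        # All updates in the with-block are undone together if one fails.
        with plpy.subtransaction():
            for username in usernames:
                plpy.execute(plan, [username])
    except plpy.SPIError as e:
        # The failed subtransaction has been rolled back at this point;
        # the surrounding transaction can continue.
        plpy.notice("rolled back: %s" % e)
        return False
    return True
```

The "except ... as e" spelling works in Python 2.6 and later as well as in Python 3, so the same body would be usable with plpythonu and plpython3u.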
  • 80.