Pro PostgreSQL Robert Treat omniti.com brighterlamp.org
Who Am I? (Why Listen To Me) A-0 PostgreSQL User Since 6.5.x DBA of High Traffic / Large PostgreSQL Instances Long Time Contributor to PostgreSQL Project Contribute / Maintain Several Open Source Projects Co-Author Beginning PHP & PostgreSQL 8 (Apress)
Outline A-1 Installation Upgrading Configuration Routine maintenance Replication / Availability Advanced SQL Query tuning Indexing Tablespaces Partitioning http://pgfoundry.org/projects/dbsamples/
C-0 Get Off To A Good Start
C-1 Get Off To A Good Start Use package management Consistent Standardized Simple
Different across systems Upgrades are an issue Trust your packager? C-2 Get Off To A Good Start Use package management
Different across systems Upgrades are an issue Trust your packager? C-3 Get Off To A Good Start Use package management Don't Be Afraid To Roll Your Own
C-4 Get Off To A Good Start $PGDATA/pg_log /var/log/pgsql when in doubt... (postgresql.conf) Configure Logging Logging is often overlooked, but is the first step toward troubleshooting!
C-5 Get Off To A Good Start most systems have different defaults firewalls/ selinux (FATAL) rtfm (pg_hba.conf, grant, revoke) Configure Authentication
C-6 Get Off To A Good Start TRUST md5 IDENT Authentication Methods
C-7 Get Off To A Good Start trust these more than your own code package dependent use different schemas tsearch2, pgcrypto pgstatstuple, pg_buffercache, pg_freespacemap /contrib
C-8 Get Off To A Good Start package dependent some are non-core (plruby, plr, plphp) varying functionality varying levels of trust don't be afraid, test! procedural languages
D-0 Let's Talk About Upgrades
D-1 Let's Talk About Upgrades Versioning First Digit ( 7 .4.16 -> 8 .2.0) Second Digit (8.2.4 -> 8.3.0) Third Digit (8.3.0 -> 8.3.1)
D-2 Let's Talk About Upgrades Versioning First Digit (7.4.16 -> 8.2.0) Second Digit (8. 2 .4 -> 8. 3 .0) Third Digit (8.3.0 -> 8.3.1)
D-3 Let's Talk About Upgrades Versioning First Digit (7.4.16 -> 8.2.0) Second Digit (8.2.4 -> 8.3.0) Third Digit (8.3. 0 -> 8.3. 1 )
D-4 Let's Talk About Upgrades Achtung!! Make Backups! Read the Release Notes!
D-5 Let's Talk About Upgrades pg_dump/pg_restore simple -Fc is your friend dump with new version of pg_dump pitfalls (time, hdd)
D-6 Let's Talk About Upgrades the slony method not simple create slave on new version switchover (switch back?) pitfalls (initial synch, compatibility)
D-7 Let's Talk About Upgrades pg_migrator in place upgrades rewrites system catalog info no way to go back (fs snapshots) still new, getting better 8.1 -> 8.2 only (for now)
D-8 Let's Talk About Upgrades upgrading older db <= 7.2 is no longer supported (upgrade now!) pg_dump 8.2 has issues with <= 7.2 you can upgrade to 7.3 first use adddepends on 7.3 install slony requires 7.3 (or 7.4) or newer pg_migrator (lol)
E-0 Figure Your Configure
E-1 Figure Your Configure the basics : performance effective_cache_size shared_buffers default_statistics_target sort_mem checkpoint_segments checkpoint_timeout
E-2 Figure Your Configure the basics : logging stderr/pg_log vs. syslog/eventlog log_min_error_statement (error!) log_min_duration_statement log_line_prefix (%d, %p, %t)
E-5 Figure Your Configure other stuff worth looking at maintenance_work_mem max_prepared_transactions update_process_title max_fsm_pages
P-0 Routine Maintenance
P-1 Routine Maintenance a word about vacuum reclaim usable space update table stats avoid xid wraparound
P-2 Routine Maintenance autovacuum : just do it! autovacuum stats_start_collector stats_row_level pg_autovacuum ?
P-3 Routine Maintenance other stuff worth looking at reindexing logfiles backups failover
G-0 Availability
G-1 Availability what do we mean by availability? not backups (exactly) not replication (necessarily) not clustering (even less so)
G-2 Availability what do we mean by availability? if (kablooy) then (ok) not backups (exactly) not replication (necessarily) not clustering (even less so)
G-3 Availability pg_dump traditionally used for backups send dump to another server constantly run restore process large time, i/o constraints
G-4 Availability pitr create second, standby server ship wal logs to new server less time/io than pg_dump 8.1 -> cold standby 8.2 -> warm standby 8.4 -> hot standby ?
G-5 Availability slony asynchronous, master-slave replication controlled switchover, failover low i/o, time constraints other benefits (upgrades, scaling)
G-5 Availability bucardo asynchronous, multi-master replication conflict resolution low i/o, time constraints other benefits (upgrades, scaling)
G-6 Availability shared disk one copy of PGDATA on shared storage standby takes over akin to db crash shared disk is point of failure (raid) must ensure only one postmaster running
G-7 Availability filesystem replication drbd, zfs filesystem mirrored between servers synchronized, ordered writes single disk system?
G-8 Availability pgpool dual-master, statement based little caveats (random(),now(),sequences) bigger caveats (security, password, pg_hba) pgpool becomes failure point
I-0 Beyond Simple SQL
I-1 Beyond Simple SQL generate series pagila=# select * from generate_series(1,3); generate_series ----------------- 1 2 3 (3 rows) behold the power of loops!
I-2 Beyond Simple SQL generate series extrapolate many uses pagila=# select '2007-05-22 09:00:00'::timestamp - x * '1 hour'::interval as countdown from generate_series(1,10) x; countdown --------------------- 2007-05-22 08:00:00 2007-05-22 07:00:00 2007-05-22 06:00:00 2007-05-22 05:00:00 2007-05-22 04:00:00 2007-05-22 03:00:00 2007-05-22 02:00:00 2007-05-22 01:00:00 2007-05-22 00:00:00 2007-05-21 23:00:00 (10 rows)
I-3 Beyond Simple SQL rownum()
I-4 Beyond Simple SQL row numbering
I-5 Beyond Simple SQL row numbering pagila=# select (select count(*)+1 from customer where customer_id < x.customer_id) as rownum, customer_id, first_name, last_name from customer x limit 5; rownum | customer_id | first_name | last_name --------+-------------+------------+----------- 1 | 1 | MARY | SMITH 2 | 3 | LINDA | WILLIAMS 3 | 4 | BARBARA | JONES 4 | 5 | ELIZABETH | BROWN 5 | 6 | JENNIFER | DAVIS (5 rows)
I-6 Beyond Simple SQL row numbering pagila=# select (select count(*)+1 from customer where customer_id < x.customer_id) as rownum, customer_id, first_name, last_name from customer x limit 5; rownum | customer_id | first_name | last_name --------+-------------+------------+----------- 1 | 1 | MARY | SMITH 2 | 3 | LINDA | WILLIAMS 3 | 4 | BARBARA | JONES 4 | 5 | ELIZABETH | BROWN 5 | 6 | JENNIFER | DAVIS (5 rows)
I-7 Beyond Simple SQL row numbering pagila=# select (select count(*)+1 from customer where customer_id < x.customer_id) as rownum, customer_id, first_name, last_name from customer x limit 5; rownum | customer_id | first_name | last_name --------+-------------+------------+----------- 1 | 1 | MARY | SMITH 2 | 3 | LINDA | WILLIAMS 3 | 4 | BARBARA | JONES 4 | 5 | ELIZABETH | BROWN 5 | 6 | JENNIFER | DAVIS (5 rows) fill in missing gaps
I-8 Beyond Simple SQL row numbering watch out for order by pagila=# select (select count(*)+1 from customer where last_name < x.last_name) as rownum, first_name, last_name from customer x order by last_name limit 5; rownum | first_name | last_name --------+------------+----------- 1 | RAFAEL | ABNEY 2 | NATHANIEL | ADAM 3 | KATHLEEN | ADAMS 4 | DIANA | ALEXANDER 5 | GORDON | ALLARD (5 rows)
I-9 Beyond Simple SQL rollup() pagila=# select customer_id, amount from payment limit 10; customer_id | amount -------------+-------- 267 | 7.98 267 | 0.00 269 | 3.98 269 | 0.00 274 | 0.99 279 | 4.99 282 | 0.99 284 | 5.98 284 | 0.00 287 | 0.99 (10 rows)
I-10 Beyond Simple SQL rollup() pagila=# select customer_id, amount, total from payment JOIN (select customer_id, sum(amount) as total from payment group by customer_id) x using (customer_id) limit 10; customer_id | amount | total -------------+--------+-------- 267 | 7.98 | 159.64 267 | 0.00 | 159.64 269 | 3.98 | 129.70 269 | 0.00 | 129.70 274 | 0.99 | 152.65 279 | 4.99 | 135.69 282 | 0.99 | 103.73 284 | 5.98 | 126.72 284 | 0.00 | 126.72 287 | 0.99 | 115.71 (10 rows) Include totals with given row
I-11 Beyond Simple SQL rank() SELECT * FROM (select c1.first_name, c1.last_name, c1.store_id, p1.total, (select 1 + count(*) from customer c2 join (select customer_id, sum(amount) as total from only payment group by customer_id) p2 using (customer_id) where c2.store_id = c1.store_id and p2.total > p1.total) as rank from customer c1 join (select customer_id, sum(amount) as total from only payment group by customer_id) p1 using (customer_id) ) x WHERE x.rank <= 3 ORDER BY x.store_id, x.rank; first_name | last_name | store_id | total | rank ------------+-----------+----------+--------+------ ELEANOR | HUNT | 1 | 216.54 | 1 CLARA | SHAW | 1 | 195.58 | 2 TOMMY | COLLAZO | 1 | 186.62 | 3 KARL | SEAL | 2 | 221.55 | 1 MARION | SNYDER | 2 | 194.61 | 2 RHONDA | KENNEDY | 2 | 194.61 | 2 (6 rows)
I-12 Beyond Simple SQL SELECT * FROM (SELECT c1.first_name, c1.last_name, c1.store_id, p1.total, (SELECT 1 + count(*) FROM customer c2 JOIN (SELECT customer_id, sum(amount) AS total FROM payment GROUP BY customer_id) p2 USING (customer_id) WHERE c2.store_id = c1.store_id AND p2.total > p1.total ) AS rank FROM customer c1 JOIN (SELECT customer_id, sum(amount) AS total FROM payment GROUP BY customer_id) p1 USING (customer_id) ) x WHERE x.rank <= 3 ORDER BY x.store_id, x.rank;
I-13 Beyond Simple SQL SELECT * FROM (SELECT c1.first_name, c1.last_name, c1.store_id, p1.total, (SELECT 1 + count(*) FROM customer c2 JOIN (SELECT customer_id, sum(amount) AS total FROM payment GROUP BY customer_id) p2 USING (customer_id) WHERE c2.store_id = c1.store_id AND p2.total > p1.total ) AS rank FROM customer c1 JOIN (SELECT customer_id, sum(amount) AS total FROM payment GROUP BY customer_id) p1 USING (customer_id) ) x WHERE x.rank <= 3 ORDER BY x.store_id, x.rank;
I-14 Beyond Simple SQL SELECT * FROM (SELECT c1.first_name, c1.last_name, c1.store_id, p1.total, (SELECT 1 + count(*) FROM customer c2 JOIN (SELECT customer_id, sum(amount) AS total FROM payment GROUP BY customer_id) p2 USING (customer_id) WHERE c2.store_id = c1.store_id AND p2.total > p1.total ) AS rank FROM customer c1 JOIN (SELECT customer_id, sum(amount) AS total FROM payment GROUP BY customer_id) p1 USING (customer_id) ) x WHERE x.rank <= 3 ORDER BY x.store_id, x.rank;
I-15 Beyond Simple SQL SELECT * FROM (SELECT c1.first_name, c1.last_name, c1.store_id, p1.total, (SELECT 1 + count(*) FROM customer c2 JOIN (SELECT customer_id, sum(amount) AS total FROM payment GROUP BY customer_id) p2 USING (customer_id) WHERE c2.store_id = c1.store_id AND p2.total > p1.total ) AS rank FROM customer c1 JOIN (SELECT customer_id, sum(amount) AS total FROM payment GROUP BY customer_id) p1 USING (customer_id) ) x WHERE x.rank <= 3 ORDER BY x.store_id, x.rank;
I-16 Beyond Simple SQL SELECT * FROM (SELECT c1.first_name, c1.last_name, c1.store_id, p1.total, (SELECT 1 + count(*) FROM customer c2 JOIN (SELECT customer_id, sum(amount) AS total FROM payment GROUP BY customer_id) p2 USING (customer_id) WHERE c2.store_id = c1.store_id AND p2.total > p1.total ) AS rank FROM customer c1 JOIN (SELECT customer_id, sum(amount) AS total FROM payment GROUP BY customer_id) p1 USING (customer_id) ) x WHERE x.rank <= 3 ORDER BY x.store_id, x.rank;
J-0 Query Your Queries
J-1 Query Your Queries finding slow queries: log_min_duration_statement -1, 0 , n superuser only alter user LOG: duration: 5005.273 ms statement: select pg_sleep(5);
J-2 Query Your Queries finding slow queries: pgfouine / pqa log analyzers command line, generate reports i/o load http://pgfouine.projects.postgresql.org/reports.html http://pqa.projects.postgresql.org/example.html
J-3 Query Your Queries finding slow queries: pg_stat_all_tables pagila=# \d pg_stat_all_tables View &quot;pg_catalog.pg_stat_all_tables&quot; Column | Type | ----------------+-------------------------- relid | oid | schemaname | name | relname | name | seq_scan | bigint | seq_tup_read | bigint | idx_scan | bigint | idx_tup_fetch | bigint | n_tup_ins | bigint | n_tup_upd | bigint | n_tup_del | bigint |
J-4 Query Your Queries finding slow queries: pg_stat_all_tables pagila=# \d pg_stat_all_tables View &quot;pg_catalog.pg_stat_all_tables&quot; Column | Type | ----------------+-------------------------- relid | oid | schemaname | name | relname | name | seq_scan | bigint | seq_tup_read | bigint | idx_scan | bigint | idx_tup_fetch | bigint | n_tup_ins | bigint | n_tup_upd | bigint | n_tup_del | bigint |
J-5 Query Your Queries finding slow queries: pg_stat_all_tables pagila=# \d pg_stat_all_tables View &quot;pg_catalog.pg_stat_all_tables&quot; Column | Type | ----------------+-------------------------- relid | oid | schemaname | name | relname | name | seq_scan | bigint | seq_tup_read | bigint | idx_scan | bigint | idx_tup_fetch | bigint | n_tup_ins | bigint | n_tup_upd | bigint | n_tup_del | bigint |
J-6 Query Your Queries finding slow queries: pg_stat_all_tables pagila=# \d pg_stat_all_tables View &quot;pg_catalog.pg_stat_all_tables&quot; Column | Type | ----------------+-------------------------- relid | oid | schemaname | name | relname | name | seq_scan | bigint | seq_tup_read | bigint | idx_scan | bigint | idx_tup_fetch | bigint | n_tup_ins | bigint | n_tup_upd | bigint | n_tup_del | bigint |
J-7 Query Your Queries finding slow queries: pg_stat_all_indexes pagila=# \d pg_stat_all_indexes View &quot;pg_catalog.pg_stat_all_indexes&quot; Column | Type | ---------------+--------+ relid | oid | indexrelid | oid | schemaname | name | relname | name | indexrelname | name | idx_scan | bigint | idx_tup_read | bigint | idx_tup_fetch | bigint |
J-8 Query Your Queries finding slow queries: pg_stat_all_indexes pagila=# \d pg_stat_all_indexes View &quot;pg_catalog.pg_stat_all_indexes&quot; Column | Type | ---------------+--------+ relid | oid | indexrelid | oid | schemaname | name | relname | name | indexrelname | name | idx_scan | bigint | idx_tup_read | bigint | idx_tup_fetch | bigint |
J-9 Query Your Queries finding slow queries: pg_statio_all_tables pagila=# \d pg_statio_all_tables View &quot;pg_catalog.pg_statio_all_tables&quot; Column | Type | -----------------+--------+ relid | oid | schemaname | name | relname | name | heap_blks_read | bigint | heap_blks_hit | bigint | idx_blks_read | bigint | idx_blks_hit | bigint | toast_blks_read | bigint | toast_blks_hit | bigint | tidx_blks_read | bigint | tidx_blks_hit | bigint |
J-9 Query Your Queries finding slow queries: pg_statio_all_tables pagila=# \d pg_statio_all_tables View &quot;pg_catalog.pg_statio_all_tables&quot; Column | Type | -----------------+--------+ relid | oid | schemaname | name | relname | name | heap_blks_read | bigint | heap_blks_hit | bigint | idx_blks_read | bigint | idx_blks_hit | bigint | toast_blks_read | bigint | toast_blks_hit | bigint | tidx_blks_read | bigint | tidx_blks_hit | bigint |
J-10 Query Your Queries fixing slow queries: explain analyze universal tool good for specific queries “ explain” for large queries could be it's own talk
J-11 Query Your Queries fixing slow queries: explain analyze universal tool good for specific queries “ explain” for large queries could be it's own talk http://www.postgresql.org/docs/techdocs.38
L-0 Indexing Options
L-1 Indexing Options indexing (basic) use explain to find large sequential reads use pg_stat_* tables to find numerous reads btree – (gist/gin) enable_indexscan, enable_bitmapscan dual column vs. single column
L-2 Indexing Options indexing (partial) create index address_ba_part_idx on address (district) where district = 'Buenos Aires'; restrain index to rows that matter can give significant speed improvements where clause of index should match where clause of query
L-3 Indexing Options indexing (partial) create index customer_active_part_idx on customer (customer_id) where activebool is true; restrain index to rows that matter can give significant speed improvements where clause of index should match where clause of query
L-4 Indexing Options indexing (functional) some people prefer to call these expressional indexes
L-5 Indexing Options indexing (expressional) create unique index one_true_email_xidx on customer (lower(email)); push expensive functions into your index system sees just WHERE indexedcolumn = 'constant' expression of index should match expression of queries narrow scope, but nice gains
L-6 Indexing Options indexing (expressional) create index fullname_xidx on customer ((first_name||' '||last_name)); push expensive functions into your index system sees just WHERE indexedcolumn = 'constant' expression of index should match expression of queries narrow scope, but nice gains
L-7 Indexing Options full text indexing gist vs. gin old school slower for queries faster insert / update mature new in 8.2 faster for queries slower insert / update green
N-0 PostgreSQL Tablespaces
N-1 PostgreSQL Tablespaces tablespaces? define logical locations for object placement point to locations on disk (uses symlinks) size determined by disk size (not pre-ordained) dedicate per db, split db across multiple tblspc
N-2 PostgreSQL Tablespaces tablespaces! split database over separate disks use stat, statio tables to gauge disk access create dedicated storage for workloads disk for read / write disk for read only large, slow disk for archiving disk for indexes
Q-0 PostgreSQL Partitioning
Q-1 PostgreSQL Partitioning partitioning? as table size grows, it becomes unmanageable use inheritance, rules, constraints to split data queries ignore non-relevant partitions could be it's own talk
Q-2 PostgreSQL Partitioning partitioning! as table size grows, it becomes unmanageable use inheritance, rules, constraints to split data queries ignore non-relevant partitions could be it's own talk http://www.pgcon.org/2007/schedule/events/41.en.html
Q-3 PostgreSQL Partitioning partitioning : key points determine list vs. range use triggers rather than rules partition creation vs. data population automate maintenance
K-0 Other Stuff I Should Mention
K-1 Other Stuff I Should Mention tsearch2 full text search interface /contrib module (integrated in 8.3?) very advanced capabilities (rank, headline) missing some things (wildcard) good performance for db's better performance with problem specific tools make it live in its own schema
K-2 Other Stuff I Should Mention pgcrypto cryptography type functions /contrib (export issues) md5, sha1, blowfish, many more
K-3 Other Stuff I Should Mention dblink pg -> pg connections /contrib (still under development?) can have performance issues on large queries make it live in its own schema beware security issue
K-4 Other Stuff I Should Mention autonomous logging tool persistent logging for postgresql functions built on top of dblink make it live in its own schema https://labs.omniti.com/trac/pgsoltools
K-4 Other Stuff I Should Mention snapshot pitr clones full read/write copy of pitr slave static snapshot need solaris (zfs zone mojo) https://labs.omniti.com/trac/pgsoltools
K-4 Other Stuff I Should Mention pgbouncer connection pooling application 100-1 pgb/db connection ratio small community, good results https://developer.skype.com/SkypeGarage/DbProjects/PgBouncer
K-5 Other Stuff I Should Mention dbi-link heterogeneous connections for postgresql built on plperl / dbi moving toward sql-med support similar performance issues to dblink http://pgfoundry.org/projects/dbi-link/
K-6 Other Stuff I Should Mention phppgadmin web based gui for postgresql remote administration of multiple servers implements much of postgresql functionality support back to 7.1? http://phppgadmin.sourceforge.net/
K-7 Other Stuff I Should Mention ;-) my book?
K-8 Other Stuff I Should Mention ;-) we're hiring software engineers web programmers ui/graphics http://www.omniti.com/people/jobs
X-0 El Fin

Pro PostgreSQL

  • 1.
    Pro PostgreSQL RobertTreat omniti.com brighterlamp.org
  • 2.
    Who Am I? (Why Listen To Me) A-0 PostgreSQL User Since 6.5.x DBA of High Traffic / Large PostgreSQL Instances Long Time Contributor to PostgreSQL Project Contribute / Maintain Several Open Source Projects Co-Author Beginning PHP & PostgreSQL 8 (Apress)
  • 3.
    Outline A-1 InstallationUpgrading Configuration Routine maintenance Replication / Availability Advanced SQL Query tuning Indexing Tablespaces Partitioning http://pgfoundry.org/projects/dbsamples/
  • 4.
    C-0 Get OffTo A Good Start
  • 5.
    C-1 Get OffTo A Good Start Use package management Consistent Standardized Simple
  • 6.
    Different across systemsUpgrades are an issue Trust your packager? C-2 Get Off To A Good Start Use package management
  • 7.
    Different across systemsUpgrades are an issue Trust your packager? C-3 Get Off To A Good Start Use package management Don't Be Afraid To Roll Your Own
  • 8.
    C-4 Get OffTo A Good Start $PGDATA/pg_log /var/log/pgsql when in doubt... (postgresql.conf) Configure Logging Logging is often overlooked, but is the first step toward troubleshooting!
  • 9.
    C-5 Get OffTo A Good Start most systems have different defaults firewalls/ selinux (FATAL) rtfm (pg_hba.conf, grant, revoke) Configure Authentication
  • 10.
    C-6 Get OffTo A Good Start TRUST md5 IDENT Authentication Methods
  • 11.
    C-7 Get OffTo A Good Start trust these more than your own code package dependent use different schemas tsearch2, pgcrypto pgstatstuple, pg_buffercache, pg_freespacemap /contrib
  • 12.
    C-8 Get OffTo A Good Start package dependent some are non-core (plruby, plr, plphp) varying functionality varying levels of trust don't be afraid, test! procedural languages
  • 13.
    D-0 Let's TalkAbout Upgrades
  • 14.
    D-1 Let's TalkAbout Upgrades Versioning First Digit ( 7 .4.16 -> 8 .2.0) Second Digit (8.2.4 -> 8.3.0) Third Digit (8.3.0 -> 8.3.1)
  • 15.
    D-2 Let's TalkAbout Upgrades Versioning First Digit (7.4.16 -> 8.2.0) Second Digit (8. 2 .4 -> 8. 3 .0) Third Digit (8.3.0 -> 8.3.1)
  • 16.
    D-3 Let's TalkAbout Upgrades Versioning First Digit (7.4.16 -> 8.2.0) Second Digit (8.2.4 -> 8.3.0) Third Digit (8.3. 0 -> 8.3. 1 )
  • 17.
    D-4 Let's TalkAbout Upgrades Achtung!! Make Backups! Read the Release Notes!
  • 18.
    D-5 Let's TalkAbout Upgrades pg_dump/pg_restore simple -Fc is your friend dump with new version of pg_dump pitfalls (time, hdd)
  • 19.
    D-6 Let's TalkAbout Upgrades the slony method not simple create slave on new version switchover (switch back?) pitfalls (initial synch, compatibility)
  • 20.
    D-7 Let's TalkAbout Upgrades pg_migrator in place upgrades rewrites system catalog info no way to go back (fs snapshots) still new, getting better 8.1 -> 8.2 only (for now)
  • 21.
    D-8 Let's TalkAbout Upgrades upgrading older db <= 7.2 is no longer supported (upgrade now!) pg_dump 8.2 has issues with <= 7.2 you can upgrade to 7.3 first use adddepends on 7.3 install slony requires 7.3 (or 7.4) or newer pg_migrator (lol)
  • 22.
    E-0 Figure YourConfigure
  • 23.
    E-1 Figure YourConfigure the basics : performance effective_cache_size shared_buffers default_statistics_target sort_mem checkpoint_segments checkpoint_timeout
  • 24.
    E-2 Figure YourConfigure the basics : logging stderr/pg_log vs. syslog/eventlog log_min_error_statement (error!) log_min_duration_statement log_line_prefix (%d, %p, %t)
  • 25.
    E-5 Figure YourConfigure other stuff worth looking at maintenance_work_mem max_prepared_transactions update_process_title max_fsm_pages
  • 26.
  • 27.
    P-1 Routine Maintenancea word about vacuum reclaim usable space update table stats avoid xid wraparound
  • 28.
    P-2 Routine Maintenanceautovacuum : just do it! autovacuum stats_start_collector stats_row_level pg_autovacuum ?
  • 29.
    P-3 Routine Maintenanceother stuff worth looking at reindexing logfiles backups failover
  • 30.
  • 31.
    G-1 Availability whatdo we mean by availability? not backups (exactly) not replication (necessarily) not clustering (even less so)
  • 32.
    G-2 Availability whatdo we mean by availability? if (kablooy) then (ok) not backups (exactly) not replication (necessarily) not clustering (even less so)
  • 33.
    G-3 Availability pg_dumptraditionally used for backups send dump to another server constantly run restore process large time, i/o constraints
  • 34.
    G-4 Availability pitrcreate second, standby server ship wal logs to new server less time/io than pg_dump 8.1 -> cold standby 8.2 -> warm standby 8.4 -> hot standby ?
  • 35.
    G-5 Availability slonyasynchronous, master-slave replication controlled switchover, failover low i/o, time constraints other benefits (upgrades, scaling)
  • 36.
    G-5 Availability bucardoasynchronous, multi-master replication conflict resolution low i/o, time constraints other benefits (upgrades, scaling)
  • 37.
    G-6 Availability shareddisk one copy of PGDATA on shared storage standby takes over akin to db crash shared disk is point of failure (raid) must ensure only one postmaster running
  • 38.
    G-7 Availability filesystemreplication drbd, zfs filesystem mirrored between servers synchronized, ordered writes single disk system?
  • 39.
    G-8 Availability pgpooldual-master, statement based little caveats (random(),now(),sequences) bigger caveats (security, password, pg_hba) pgpool becomes failure point
  • 40.
  • 41.
    I-1 Beyond SimpleSQL generate series pagila=# select * from generate_series(1,3); generate_series ----------------- 1 2 3 (3 rows) behold the power of loops!
  • 42.
    I-2 Beyond SimpleSQL generate series extrapolate many uses pagila=# select '2007-05-22 09:00:00'::timestamp - x * '1 hour'::interval as countdown from generate_series(1,10) x; countdown --------------------- 2007-05-22 08:00:00 2007-05-22 07:00:00 2007-05-22 06:00:00 2007-05-22 05:00:00 2007-05-22 04:00:00 2007-05-22 03:00:00 2007-05-22 02:00:00 2007-05-22 01:00:00 2007-05-22 00:00:00 2007-05-21 23:00:00 (10 rows)
  • 43.
    I-3 Beyond SimpleSQL rownum()
  • 44.
    I-4 Beyond SimpleSQL row numbering
  • 45.
    I-5 Beyond SimpleSQL row numbering pagila=# select (select count(*)+1 from customer where customer_id < x.customer_id) as rownum, customer_id, first_name, last_name from customer x limit 5; rownum | customer_id | first_name | last_name --------+-------------+------------+----------- 1 | 1 | MARY | SMITH 2 | 3 | LINDA | WILLIAMS 3 | 4 | BARBARA | JONES 4 | 5 | ELIZABETH | BROWN 5 | 6 | JENNIFER | DAVIS (5 rows)
  • 46.
    I-6 Beyond SimpleSQL row numbering pagila=# select (select count(*)+1 from customer where customer_id < x.customer_id) as rownum, customer_id, first_name, last_name from customer x limit 5; rownum | customer_id | first_name | last_name --------+-------------+------------+----------- 1 | 1 | MARY | SMITH 2 | 3 | LINDA | WILLIAMS 3 | 4 | BARBARA | JONES 4 | 5 | ELIZABETH | BROWN 5 | 6 | JENNIFER | DAVIS (5 rows)
  • 47.
    I-7 Beyond SimpleSQL row numbering pagila=# select (select count(*)+1 from customer where customer_id < x.customer_id) as rownum, customer_id, first_name, last_name from customer x limit 5; rownum | customer_id | first_name | last_name --------+-------------+------------+----------- 1 | 1 | MARY | SMITH 2 | 3 | LINDA | WILLIAMS 3 | 4 | BARBARA | JONES 4 | 5 | ELIZABETH | BROWN 5 | 6 | JENNIFER | DAVIS (5 rows) fill in missing gaps
  • 48.
    I-8 Beyond SimpleSQL row numbering watch out for order by pagila=# select (select count(*)+1 from customer where last_name < x.last_name) as rownum, first_name, last_name from customer x order by last_name limit 5; rownum | first_name | last_name --------+------------+----------- 1 | RAFAEL | ABNEY 2 | NATHANIEL | ADAM 3 | KATHLEEN | ADAMS 4 | DIANA | ALEXANDER 5 | GORDON | ALLARD (5 rows)
  • 49.
    I-9 Beyond SimpleSQL rollup() pagila=# select customer_id, amount from payment limit 10; customer_id | amount -------------+-------- 267 | 7.98 267 | 0.00 269 | 3.98 269 | 0.00 274 | 0.99 279 | 4.99 282 | 0.99 284 | 5.98 284 | 0.00 287 | 0.99 (10 rows)
  • 50.
    I-10 Beyond SimpleSQL rollup() pagila=# select customer_id, amount, total from payment JOIN (select customer_id, sum(amount) as total from payment group by customer_id) x using (customer_id) limit 10; customer_id | amount | total -------------+--------+-------- 267 | 7.98 | 159.64 267 | 0.00 | 159.64 269 | 3.98 | 129.70 269 | 0.00 | 129.70 274 | 0.99 | 152.65 279 | 4.99 | 135.69 282 | 0.99 | 103.73 284 | 5.98 | 126.72 284 | 0.00 | 126.72 287 | 0.99 | 115.71 (10 rows) Include totals with given row
  • 51.
    I-11 Beyond SimpleSQL rank() SELECT * FROM (select c1.first_name, c1.last_name, c1.store_id, p1.total, (select 1 + count(*) from customer c2 join (select customer_id, sum(amount) as total from only payment group by customer_id) p2 using (customer_id) where c2.store_id = c1.store_id and p2.total > p1.total) as rank from customer c1 join (select customer_id, sum(amount) as total from only payment group by customer_id) p1 using (customer_id) ) x WHERE x.rank <= 3 ORDER BY x.store_id, x.rank; first_name | last_name | store_id | total | rank ------------+-----------+----------+--------+------ ELEANOR | HUNT | 1 | 216.54 | 1 CLARA | SHAW | 1 | 195.58 | 2 TOMMY | COLLAZO | 1 | 186.62 | 3 KARL | SEAL | 2 | 221.55 | 1 MARION | SNYDER | 2 | 194.61 | 2 RHONDA | KENNEDY | 2 | 194.61 | 2 (6 rows)
  • 52.
    I-12 Beyond SimpleSQL SELECT * FROM (SELECT c1.first_name, c1.last_name, c1.store_id, p1.total, (SELECT 1 + count(*) FROM customer c2 JOIN (SELECT customer_id, sum(amount) AS total FROM payment GROUP BY customer_id) p2 USING (customer_id) WHERE c2.store_id = c1.store_id AND p2.total > p1.total ) AS rank FROM customer c1 JOIN (SELECT customer_id, sum(amount) AS total FROM payment GROUP BY customer_id) p1 USING (customer_id) ) x WHERE x.rank <= 3 ORDER BY x.store_id, x.rank;
  • 53.
    I-13 Beyond SimpleSQL SELECT * FROM (SELECT c1.first_name, c1.last_name, c1.store_id, p1.total, (SELECT 1 + count(*) FROM customer c2 JOIN (SELECT customer_id, sum(amount) AS total FROM payment GROUP BY customer_id) p2 USING (customer_id) WHERE c2.store_id = c1.store_id AND p2.total > p1.total ) AS rank FROM customer c1 JOIN (SELECT customer_id, sum(amount) AS total FROM payment GROUP BY customer_id) p1 USING (customer_id) ) x WHERE x.rank <= 3 ORDER BY x.store_id, x.rank;
  • 54.
    I-14 Beyond SimpleSQL SELECT * FROM (SELECT c1.first_name, c1.last_name, c1.store_id, p1.total, (SELECT 1 + count(*) FROM customer c2 JOIN (SELECT customer_id, sum(amount) AS total FROM payment GROUP BY customer_id) p2 USING (customer_id) WHERE c2.store_id = c1.store_id AND p2.total > p1.total ) AS rank FROM customer c1 JOIN (SELECT customer_id, sum(amount) AS total FROM payment GROUP BY customer_id) p1 USING (customer_id) ) x WHERE x.rank <= 3 ORDER BY x.store_id, x.rank;
  • 55.
    I-15 Beyond SimpleSQL SELECT * FROM (SELECT c1.first_name, c1.last_name, c1.store_id, p1.total, (SELECT 1 + count(*) FROM customer c2 JOIN (SELECT customer_id, sum(amount) AS total FROM payment GROUP BY customer_id) p2 USING (customer_id) WHERE c2.store_id = c1.store_id AND p2.total > p1.total ) AS rank FROM customer c1 JOIN (SELECT customer_id, sum(amount) AS total FROM payment GROUP BY customer_id) p1 USING (customer_id) ) x WHERE x.rank <= 3 ORDER BY x.store_id, x.rank;
  • 56.
    I-16 Beyond SimpleSQL SELECT * FROM (SELECT c1.first_name, c1.last_name, c1.store_id, p1.total, (SELECT 1 + count(*) FROM customer c2 JOIN (SELECT customer_id, sum(amount) AS total FROM payment GROUP BY customer_id) p2 USING (customer_id) WHERE c2.store_id = c1.store_id AND p2.total > p1.total ) AS rank FROM customer c1 JOIN (SELECT customer_id, sum(amount) AS total FROM payment GROUP BY customer_id) p1 USING (customer_id) ) x WHERE x.rank <= 3 ORDER BY x.store_id, x.rank;
  • 57.
  • 58.
    J-1 Query YourQueries finding slow queries: log_min_duration_statement -1, 0 , n superuser only alter user LOG: duration: 5005.273 ms statement: select pg_sleep(5);
  • 59.
    J-2 Query YourQueries finding slow queries: pgfouine / pqa log analyzers command line, generate reports i/o load http://pgfouine.projects.postgresql.org/reports.html http://pqa.projects.postgresql.org/example.html
  • 60.
    J-3 Query YourQueries finding slow queries: pg_stat_all_tables pagila=# \d pg_stat_all_tables View &quot;pg_catalog.pg_stat_all_tables&quot; Column | Type | ----------------+-------------------------- relid | oid | schemaname | name | relname | name | seq_scan | bigint | seq_tup_read | bigint | idx_scan | bigint | idx_tup_fetch | bigint | n_tup_ins | bigint | n_tup_upd | bigint | n_tup_del | bigint |
  • 61.
    J-4 Query YourQueries finding slow queries: pg_stat_all_tables pagila=# \d pg_stat_all_tables View &quot;pg_catalog.pg_stat_all_tables&quot; Column | Type | ----------------+-------------------------- relid | oid | schemaname | name | relname | name | seq_scan | bigint | seq_tup_read | bigint | idx_scan | bigint | idx_tup_fetch | bigint | n_tup_ins | bigint | n_tup_upd | bigint | n_tup_del | bigint |
  • 62.
    J-5 Query YourQueries finding slow queries: pg_stat_all_tables pagila=# \d pg_stat_all_tables View &quot;pg_catalog.pg_stat_all_tables&quot; Column | Type | ----------------+-------------------------- relid | oid | schemaname | name | relname | name | seq_scan | bigint | seq_tup_read | bigint | idx_scan | bigint | idx_tup_fetch | bigint | n_tup_ins | bigint | n_tup_upd | bigint | n_tup_del | bigint |
  • 63.
    J-6 Query YourQueries finding slow queries: pg_stat_all_tables pagila=# \d pg_stat_all_tables View &quot;pg_catalog.pg_stat_all_tables&quot; Column | Type | ----------------+-------------------------- relid | oid | schemaname | name | relname | name | seq_scan | bigint | seq_tup_read | bigint | idx_scan | bigint | idx_tup_fetch | bigint | n_tup_ins | bigint | n_tup_upd | bigint | n_tup_del | bigint |
  • 64.
    J-7 Query YourQueries finding slow queries: pg_stat_all_indexes pagila=# \d pg_stat_all_indexes View &quot;pg_catalog.pg_stat_all_indexes&quot; Column | Type | ---------------+--------+ relid | oid | indexrelid | oid | schemaname | name | relname | name | indexrelname | name | idx_scan | bigint | idx_tup_read | bigint | idx_tup_fetch | bigint |
  • 65.
    J-8 Query YourQueries finding slow queries: pg_stat_all_indexes pagila=# \d pg_stat_all_indexes View &quot;pg_catalog.pg_stat_all_indexes&quot; Column | Type | ---------------+--------+ relid | oid | indexrelid | oid | schemaname | name | relname | name | indexrelname | name | idx_scan | bigint | idx_tup_read | bigint | idx_tup_fetch | bigint |
  • 66.
    J-9 Query YourQueries finding slow queries: pg_statio_all_tables pagila=# \d pg_statio_all_tables View &quot;pg_catalog.pg_statio_all_tables&quot; Column | Type | -----------------+--------+ relid | oid | schemaname | name | relname | name | heap_blks_read | bigint | heap_blks_hit | bigint | idx_blks_read | bigint | idx_blks_hit | bigint | toast_blks_read | bigint | toast_blks_hit | bigint | tidx_blks_read | bigint | tidx_blks_hit | bigint |
  • 67.
    J-9 Query YourQueries finding slow queries: pg_statio_all_tables pagila=# \d pg_statio_all_tables View &quot;pg_catalog.pg_statio_all_tables&quot; Column | Type | -----------------+--------+ relid | oid | schemaname | name | relname | name | heap_blks_read | bigint | heap_blks_hit | bigint | idx_blks_read | bigint | idx_blks_hit | bigint | toast_blks_read | bigint | toast_blks_hit | bigint | tidx_blks_read | bigint | tidx_blks_hit | bigint |
  • 68.
    J-10 Query YourQueries fixing slow queries: explain analyze universal tool good for specific queries “ explain” for large queries could be it's own talk
  • 69.
    J-11 Query YourQueries fixing slow queries: explain analyze universal tool good for specific queries “ explain” for large queries could be it's own talk http://www.postgresql.org/docs/techdocs.38
  • 70.
  • 71.
    L-1 Indexing Optionsindexing (basic) use explain to find large sequential reads use pg_stat_* tables to find numerous reads btree – (gist/gin) enable_indexscan, enable_bitmapscan dual column vs. single column
  • 72.
    L-2 Indexing Optionsindexing (partial) create index address_ba_part_idx on address (district) where district = 'Buenos Aires'; restrain index to rows that matter can give significant speed improvements where clause of index should match where clause of query
  • 73.
    L-3 Indexing Optionsindexing (partial) create index customer_active_part_idx on customer (customer_id) where activebool is true; restrain index to rows that matter can give significant speed improvements where clause of index should match where clause of query
  • 74.
    L-4 Indexing Optionsindexing (functional) some people prefer to call these expressional indexes
  • 75.
    L-5 Indexing Optionsindexing (expressional) create unique index one_true_email_xidx on customer (lower(email)); push expensive functions into your index system sees just WHERE indexedcolumn = 'constant' expression of index should match expression of queries narrow scope, but nice gains
  • 76.
    L-6 Indexing Optionsindexing (expressional) create index fullname_xidx on customer ((first_name||' '||last_name)); push expensive functions into your index system sees just WHERE indexedcolumn = 'constant' expression of index should match expression of queries narrow scope, but nice gains
  • 77.
    L-7 Indexing Optionsfull text indexing gist vs. gin old school slower for queries faster insert / update mature new in 8.2 faster for queries slower insert / update green
  • 78.
  • 79.
    N-1 PostgreSQL Tablespacestablespaces? define logical locations for object placement point to locations on disk (uses symlinks) size determined by disk size (not pre-ordained) dedicate per db, split db across multiple tblspc
  • 80.
    N-2 PostgreSQL Tablespacestablespaces! split database over separate disks use stat, statio tables to gauge disk access create dedicated storage for workloads disk for read / write disk for read only large, slow disk for archiving disk for indexes
  • 81.
  • 82.
    Q-1 PostgreSQL Partitioningpartitioning? as table size grows, it becomes unmanageable use inheritance, rules, constraints to split data queries ignore non-relevant partitions could be it's own talk
  • 83.
    Q-2 PostgreSQL Partitioningpartitioning! as table size grows, it becomes unmanageable use inheritance, rules, constraints to split data queries ignore non-relevant partitions could be it's own talk http://www.pgcon.org/2007/schedule/events/41.en.html
  • 84.
    Q-3 PostgreSQL Partitioningpartitioning : key points determine list vs. range use triggers rather than rules partition creation vs. data population automate maintenance
  • 85.
    K-0 Other StuffI Should Mention
  • 86.
    K-1 Other StuffI Should Mention tsearch2 full text search interface /contrib module (integrated in 8.3?) very advanced capabilities (rank, headline) missing some things (wildcard) good performance for db's better performance with problem specific tools make it live in its own schema
  • 87.
    K-2 Other StuffI Should Mention pgcrypto cryptography type functions /contrib (export issues) md5, sha1, blowfish, many more
  • 88.
    K-3 Other StuffI Should Mention dblink pg -> pg connections /contrib (still under development?) can have performance issues on large queries make it live in its own schema beware security issue
  • 89.
    K-4 Other StuffI Should Mention autonomous logging tool persistent logging for postgresql functions built on top of dblink make it live in its own schema https://labs.omniti.com/trac/pgsoltools
  • 90.
    K-4 Other StuffI Should Mention snapshot pitr clones full read/write copy of pitr slave static snapshot need solaris (zfs zone mojo) https://labs.omniti.com/trac/pgsoltools
  • 91.
    K-4 Other StuffI Should Mention pgbouncer connection pooling application 100-1 pgb/db connection ratio small community, good results https://developer.skype.com/SkypeGarage/DbProjects/PgBouncer
  • 92.
    K-5 Other StuffI Should Mention dbi-link heterogeneous connections for postgresql built on plperl / dbi moving toward sql-med support similar performance issues to dblink http://pgfoundry.org/projects/dbi-link/
  • 93.
    K-6 Other StuffI Should Mention phppgadmin web based gui for postgresql remote administration of multiple servers implements much of postgresql functionality support back to 7.1? http://phppgadmin.sourceforge.net/
  • 94.
    K-7 Other StuffI Should Mention ;-) my book?
  • 95.
    K-8 Other StuffI Should Mention ;-) we're hiring software engineers web programmers ui/graphics http://www.omniti.com/people/jobs
  • 96.