Pro PostgreSQL

Pro PostgreSQL Robert Treat omniti.com brighterlamp.org

A-0 Who Am I? (Why Listen To Me)
  PostgreSQL User Since 6.5.x
  DBA of High Traffic / Large PostgreSQL Instances
  Long Time Contributor to PostgreSQL Project
  Contribute / Maintain Several Open Source Projects
  Co-Author Beginning PHP & PostgreSQL 8 (Apress)

A-1 Outline
  Installation
  Upgrading
  Configuration
  Routine maintenance
  Replication / Availability
  Advanced SQL
  Query tuning
  Indexing
  Tablespaces
  Partitioning
  http://pgfoundry.org/projects/dbsamples/

C-1 Get Off To A Good Start: Use package management
  Consistent
  Standardized
  Simple

C-3 Get Off To A Good Start: Use package management
  Different across systems
  Upgrades are an issue
  Trust your packager?
  Don't Be Afraid To Roll Your Own

C-4 Get Off To A Good Start: Configure Logging
  $PGDATA/pg_log
  /var/log/pgsql
  when in doubt... (postgresql.conf)
  Logging is often overlooked, but is the first step toward troubleshooting!

C-5 Get Off To A Good Start: Configure Authentication
  most systems have different defaults
  firewalls / selinux (FATAL)
  rtfm (pg_hba.conf, grant, revoke)

C-6 Get Off To A Good Start: Authentication Methods
  TRUST
  md5
  IDENT

C-7 Get Off To A Good Start: /contrib
  trust these more than your own code
  package dependent
  use different schemas
  tsearch2, pgcrypto
  pgstattuple, pg_buffercache, pg_freespacemap

C-8 Get Off To A Good Start: procedural languages
  package dependent
  some are non-core (plruby, plr, plphp)
  varying functionality
  varying levels of trust
  don't be afraid, test!

D-1 Let's Talk About Upgrades: Versioning
  First Digit (7.4.16 -> 8.2.0)
  Second Digit (8.2.4 -> 8.3.0)
  Third Digit (8.3.0 -> 8.3.1)
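
Before deciding which kind of upgrade applies, it helps to confirm exactly which release a server is running. A minimal sketch using two standard statements; the output depends on your installation, so none is shown here:

```sql
-- Full version string, including build platform and compiler.
SELECT version();

-- Just the version number, e.g. for comparing against the release notes.
SHOW server_version;
```

Running both on the old and the new server before a dump/restore is a cheap way to make sure the pg_dump you are using really belongs to the newer release.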

D-4 Let's Talk About Upgrades Achtung!! Make Backups! Read the Release Notes!

D-5 Let's Talk About Upgrades pg_dump/pg_restore simple -Fc is your friend dump with new version of pg_dump pitfalls (time, hdd)

D-6 Let's Talk About Upgrades the slony method not simple create slave on new version switchover (switch back?) pitfalls (initial synch, compatibility)

D-7 Let's Talk About Upgrades pg_migrator in place upgrades rewrites system catalog info no way to go back (fs snapshots) still new, getting better 8.1 -> 8.2 only (for now)

D-8 Let's Talk About Upgrades upgrading older db <= 7.2 is no longer supported (upgrade now!) pg_dump 8.2 has issues with <= 7.2 you can upgrade to 7.3 first use adddepends on 7.3 install slony requires 7.3 (or 7.4) or newer pg_migrator (lol)

E-1 Figure Your Configure: the basics: performance
  effective_cache_size
  shared_buffers
  default_statistics_target
  work_mem (sort_mem before 8.0)
  checkpoint_segments
  checkpoint_timeout
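
The current values of these settings can be read from a running server without opening postgresql.conf. The sketch below queries the pg_settings view; it uses work_mem, which is the name sort_mem has carried since 8.0, and a setting that does not exist in your release simply returns no row:

```sql
-- Show the performance-related settings named on the slide.
SELECT name, setting
  FROM pg_settings
 WHERE name IN ('effective_cache_size',
                'shared_buffers',
                'default_statistics_target',
                'work_mem',
                'checkpoint_segments',
                'checkpoint_timeout')
 ORDER BY name;
```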

E-2 Figure Your Configure the basics : logging stderr/pg_log vs. syslog/eventlog log_min_error_statement (error!) log_min_duration_statement log_line_prefix (%d, %p, %t)

E-5 Figure Your Configure other stuff worth looking at maintenance_work_mem max_prepared_transactions update_process_title max_fsm_pages

P-1 Routine Maintenance a word about vacuum reclaim usable space update table stats avoid xid wraparound
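
As a rough illustration of the three jobs vacuum does, the sketch below vacuums and analyzes one pagila table by hand, then checks how far each database is from transaction ID wraparound. The choice of the payment table is only an example:

```sql
-- Reclaim dead row space and refresh planner statistics for one table.
VACUUM ANALYZE payment;

-- Age (in transactions) of the oldest unfrozen XID per database;
-- the larger the number, the closer that database is to needing
-- a database-wide vacuum to avoid wraparound.
SELECT datname, age(datfrozenxid) AS xid_age
  FROM pg_database
 ORDER BY xid_age DESC;
```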

P-2 Routine Maintenance autovacuum : just do it! autovacuum stats_start_collector stats_row_level pg_autovacuum ?

P-3 Routine Maintenance other stuff worth looking at reindexing logfiles backups failover

G-2 Availability what do we mean by availability? if (kablooy) then (ok) not backups (exactly) not replication (necessarily) not clustering (even less so)

G-3 Availability pg_dump traditionally used for backups send dump to another server constantly run restore process large time, i/o constraints

G-4 Availability pitr create second, standby server ship wal logs to new server less time/io than pg_dump 8.1 -> cold standby 8.2 -> warm standby 8.4 -> hot standby ?

G-5 Availability slony asynchronous, master-slave replication controlled switchover, failover low i/o, time constraints other benefits (upgrades, scaling)

G-5 Availability bucardo asynchronous, multi-master replication conflict resolution low i/o, time constraints other benefits (upgrades, scaling)

G-6 Availability shared disk one copy of PGDATA on shared storage standby takes over akin to db crash shared disk is point of failure (raid) must ensure only one postmaster running

G-7 Availability filesystem replication drbd, zfs filesystem mirrored between servers synchronized, ordered writes single disk system?

G-8 Availability pgpool dual-master, statement based little caveats (random(),now(),sequences) bigger caveats (security, password, pg_hba) pgpool becomes failure point

I-1 Beyond Simple SQL: generate series

    pagila=# select * from generate_series(1,3);
     generate_series
    -----------------
                   1
                   2
                   3
    (3 rows)

  behold the power of loops!

I-2 Beyond Simple SQL: generate series
  extrapolate many uses

    pagila=# select '2007-05-22 09:00:00'::timestamp - x * '1 hour'::interval as countdown
             from generate_series(1,10) x;
          countdown
    ---------------------
     2007-05-22 08:00:00
     2007-05-22 07:00:00
     2007-05-22 06:00:00
     2007-05-22 05:00:00
     2007-05-22 04:00:00
     2007-05-22 03:00:00
     2007-05-22 02:00:00
     2007-05-22 01:00:00
     2007-05-22 00:00:00
     2007-05-21 23:00:00
    (10 rows)
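
One of those many uses is filling gaps in a report: generate every day in a range, then left join the real data so that days with no rows still appear with a zero. This is a sketch that assumes the pagila payment table has payment_id and payment_date columns; the date range is arbitrary:

```sql
-- One output row per day for a week, even for days with no payments.
SELECT d.day, count(p.payment_id) AS payments
  FROM (SELECT '2007-05-01'::date + x AS day
          FROM generate_series(0, 6) AS g(x)) d
  LEFT JOIN payment p ON p.payment_date::date = d.day
 GROUP BY d.day
 ORDER BY d.day;
```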

I-3 Beyond Simple SQL rownum()

I-4 Beyond Simple SQL row numbering
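
The example from this slide did not survive in the transcript. As a stand-in, and not necessarily the technique the talk showed, PostgreSQL 8.4 and later can number rows with the row_number() window function:

```sql
-- Number customers alphabetically (requires window functions, 8.4+).
SELECT row_number() OVER (ORDER BY last_name, first_name) AS rownum,
       first_name,
       last_name
  FROM customer
 ORDER BY rownum
 LIMIT 10;
```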

I-9 Beyond Simple SQL: rollup()

    pagila=# select customer_id, amount from payment limit 10;
     customer_id | amount
    -------------+--------
             267 |   7.98
             267 |   0.00
             269 |   3.98
             269 |   0.00
             274 |   0.99
             279 |   4.99
             282 |   0.99
             284 |   5.98
             284 |   0.00
             287 |   0.99
    (10 rows)

I-10 Beyond Simple SQL: rollup()
  Include totals with given row

    pagila=# select customer_id, amount, total
             from payment
             JOIN (select customer_id, sum(amount) as total
                   from payment group by customer_id) x using (customer_id)
             limit 10;
     customer_id | amount | total
    -------------+--------+--------
             267 |   7.98 | 159.64
             267 |   0.00 | 159.64
             269 |   3.98 | 129.70
             269 |   0.00 | 129.70
             274 |   0.99 | 152.65
             279 |   4.99 | 135.69
             282 |   0.99 | 103.73
             284 |   5.98 | 126.72
             284 |   0.00 | 126.72
             287 |   0.99 | 115.71
    (10 rows)
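
On PostgreSQL 8.4 or later, the same per-row total can be written without the self-join by using a window aggregate. This is an alternative to the query on the slide, not the form the talk used:

```sql
-- Each payment row together with its customer's overall total.
SELECT customer_id,
       amount,
       sum(amount) OVER (PARTITION BY customer_id) AS total
  FROM payment
 LIMIT 10;
```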

I-11 Beyond Simple SQL: rank()

    SELECT * FROM
      (select c1.first_name, c1.last_name, c1.store_id, p1.total,
              (select 1 + count(*)
                 from customer c2
                 join (select customer_id, sum(amount) as total
                         from only payment group by customer_id) p2
                      using (customer_id)
                where c2.store_id = c1.store_id
                  and p2.total > p1.total) as rank
         from customer c1
         join (select customer_id, sum(amount) as total
                 from only payment group by customer_id) p1
              using (customer_id)
      ) x
    WHERE x.rank <= 3
    ORDER BY x.store_id, x.rank;

     first_name | last_name | store_id | total  | rank
    ------------+-----------+----------+--------+------
     ELEANOR    | HUNT      |        1 | 216.54 |    1
     CLARA      | SHAW      |        1 | 195.58 |    2
     TOMMY      | COLLAZO   |        1 | 186.62 |    3
     KARL       | SEAL      |        2 | 221.55 |    1
     MARION     | SNYDER    |        2 | 194.61 |    2
     RHONDA     | KENNEDY   |        2 | 194.61 |    2
    (6 rows)

I-12 Beyond Simple SQL

    SELECT * FROM
      (SELECT c1.first_name, c1.last_name, c1.store_id, p1.total,
              (SELECT 1 + count(*)
                 FROM customer c2
                 JOIN (SELECT customer_id, sum(amount) AS total
                         FROM payment GROUP BY customer_id) p2
                      USING (customer_id)
                WHERE c2.store_id = c1.store_id
                  AND p2.total > p1.total
              ) AS rank
         FROM customer c1
         JOIN (SELECT customer_id, sum(amount) AS total
                 FROM payment GROUP BY customer_id) p1
              USING (customer_id)
      ) x
    WHERE x.rank <= 3
    ORDER BY x.store_id, x.rank;
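
The correlated subquery counts how many customers in the same store have a strictly larger total and adds one, which is exactly what the rank() window function computes. On 8.4 or later the query could be sketched like this instead (an alternative formulation, not the one from the talk):

```sql
-- Top three customers by total payments within each store.
SELECT *
  FROM (SELECT c.first_name, c.last_name, c.store_id, p.total,
               rank() OVER (PARTITION BY c.store_id
                            ORDER BY p.total DESC) AS rank
          FROM customer c
          JOIN (SELECT customer_id, sum(amount) AS total
                  FROM payment
                 GROUP BY customer_id) p USING (customer_id)) x
 WHERE x.rank <= 3
 ORDER BY x.store_id, x.rank;
```

Ties share a rank, as in the hand-written version, so a store can return more than three rows.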

J-1 Query Your Queries: finding slow queries: log_min_duration_statement
  -1, 0, n
  superuser only
  alter user

    LOG:  duration: 5005.273 ms  statement: select pg_sleep(5);
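
A short sketch of how the setting can be applied, either for the current session or per user. The value is in milliseconds; -1 disables the feature and 0 logs every statement. The role name report_user is hypothetical:

```sql
-- Log any statement in this session that runs longer than 250 ms
-- (changing this setting requires superuser).
SET log_min_duration_statement = 250;

-- Make a 1 second threshold the default for one (hypothetical) user;
-- it takes effect the next time that user connects.
ALTER USER report_user SET log_min_duration_statement = 1000;

-- Confirm the value in effect for the current session.
SHOW log_min_duration_statement;
```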

J-2 Query Your Queries finding slow queries: pgfouine / pqa log analyzers command line, generate reports i/o load http://pgfouine.projects.postgresql.org/reports.html http://pqa.projects.postgresql.org/example.html

J-3 Query Your Queries: finding slow queries: pg_stat_all_tables

    pagila=# \d pg_stat_all_tables
    View "pg_catalog.pg_stat_all_tables"
        Column     |  Type
    ---------------+--------
     relid         | oid
     schemaname    | name
     relname       | name
     seq_scan      | bigint
     seq_tup_read  | bigint
     idx_scan      | bigint
     idx_tup_fetch | bigint
     n_tup_ins     | bigint
     n_tup_upd     | bigint
     n_tup_del     | bigint
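
One way to put this view to work is to list the tables that have had the most rows read by sequential scans, since those are the first candidates for a missing index. A sketch using only the columns shown above:

```sql
-- Tables with the heaviest sequential-scan traffic, system schemas excluded.
SELECT schemaname, relname, seq_scan, seq_tup_read, idx_scan
  FROM pg_stat_all_tables
 WHERE schemaname NOT IN ('pg_catalog', 'information_schema', 'pg_toast')
 ORDER BY seq_tup_read DESC
 LIMIT 10;
```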

J-7 Query Your Queries: finding slow queries: pg_stat_all_indexes

    pagila=# \d pg_stat_all_indexes
    View "pg_catalog.pg_stat_all_indexes"
        Column     |  Type
    ---------------+--------
     relid         | oid
     indexrelid    | oid
     schemaname    | name
     relname       | name
     indexrelname  | name
     idx_scan      | bigint
     idx_tup_read  | bigint
     idx_tup_fetch | bigint
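
The index statistics are just as useful in the other direction: an index that is never scanned still costs time on every insert and update. A sketch that lists indexes with no recorded scans:

```sql
-- Indexes that have not been used by any scan since stats were last reset.
SELECT schemaname, relname, indexrelname, idx_scan
  FROM pg_stat_all_indexes
 WHERE idx_scan = 0
   AND schemaname NOT IN ('pg_catalog', 'information_schema', 'pg_toast')
 ORDER BY relname, indexrelname;
```

Treat the result as a list of candidates rather than a drop list: primary key and unique indexes can show zero scans while still enforcing constraints.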

J-9 Query Your Queries: finding slow queries: pg_statio_all_tables

    pagila=# \d pg_statio_all_tables
    View "pg_catalog.pg_statio_all_tables"
         Column      |  Type
    -----------------+--------
     relid           | oid
     schemaname      | name
     relname         | name
     heap_blks_read  | bigint
     heap_blks_hit   | bigint
     idx_blks_read   | bigint
     idx_blks_hit    | bigint
     toast_blks_read | bigint
     toast_blks_hit  | bigint
     tidx_blks_read  | bigint
     tidx_blks_hit   | bigint
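
The read and hit counters can be combined into a rough buffer hit ratio per table. This is a sketch; a "read" here means the block was not in shared_buffers, and it may still have been served from the operating system cache rather than from disk:

```sql
-- Tables doing the most block reads, with the share served from shared_buffers.
SELECT relname,
       heap_blks_read,
       heap_blks_hit,
       round(heap_blks_hit::numeric
             / nullif(heap_blks_hit + heap_blks_read, 0), 3) AS heap_hit_ratio
  FROM pg_statio_all_tables
 ORDER BY heap_blks_read DESC
 LIMIT 10;
```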

J-11 Query Your Queries: fixing slow queries: explain analyze
  universal tool
  good for specific queries
  "explain" for large queries
  could be its own talk
  http://www.postgresql.org/docs/techdocs.38
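
A minimal sketch of the two commands against pagila. The query itself is only an example, and the plans are not shown because they depend on your data, indexes and version:

```sql
-- Plan only: cheap, safe for large or long-running queries.
EXPLAIN
SELECT customer_id, sum(amount)
  FROM payment
 WHERE customer_id = 267
 GROUP BY customer_id;

-- Plan plus actual row counts and timings: the query really runs.
EXPLAIN ANALYZE
SELECT customer_id, sum(amount)
  FROM payment
 WHERE customer_id = 267
 GROUP BY customer_id;
```

Because EXPLAIN ANALYZE executes the statement, wrap data-modifying statements in BEGIN ... ROLLBACK when you only want the timings.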

L-1 Indexing Options indexing (basic) use explain to find large sequential reads use pg_stat_* tables to find numerous reads btree – (gist/gin) enable_indexscan, enable_bitmapscan dual column vs. single column

L-2 Indexing Options indexing (partial) create index address_ba_part_idx on address (district) where district = 'Buenos Aires'; restrain index to rows that matter can give significant speed improvements where clause of index should match where clause of query

L-3 Indexing Options indexing (partial) create index customer_active_part_idx on customer (customer_id) where activebool is true; restrain index to rows that matter can give significant speed improvements where clause of index should match where clause of query
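
To see whether the planner will consider a partial index, run EXPLAIN on a query whose WHERE clause repeats the index predicate. A sketch assuming the two indexes from these slides exist and that pagila's address table has address_id and address columns:

```sql
-- Predicate matches address_ba_part_idx.
EXPLAIN
SELECT address_id, address
  FROM address
 WHERE district = 'Buenos Aires';

-- Predicate matches customer_active_part_idx; the extra condition
-- uses the indexed column itself.
EXPLAIN
SELECT customer_id
  FROM customer
 WHERE activebool IS TRUE
   AND customer_id = 267;
```

On a table as small as pagila's the planner may still prefer a sequential scan; the partial index pays off as the table grows and the indexed subset stays small.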

L-4 Indexing Options indexing (functional) some people prefer to call these expressional indexes

L-5 Indexing Options indexing (expressional) create unique index one_true_email_xidx on customer (lower(email)); push expensive functions into your index system sees just WHERE indexedcolumn = 'constant' expression of index should match expression of queries narrow scope, but nice gains

L-6 Indexing Options indexing (expressional) create index fullname_xidx on customer ((first_name||' '||last_name)); push expensive functions into your index system sees just WHERE indexedcolumn = 'constant' expression of index should match expression of queries narrow scope, but nice gains
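
The matching rule is easiest to see in a query. In the sketch below each WHERE clause repeats the indexed expression exactly, so the planner can compare it against a constant; the email address and name are made-up values:

```sql
-- Can use one_true_email_xidx: the expression lower(email) matches the index.
EXPLAIN
SELECT customer_id, email
  FROM customer
 WHERE lower(email) = 'mary.smith@example.com';

-- Can use fullname_xidx: same concatenation as in the index definition.
EXPLAIN
SELECT customer_id
  FROM customer
 WHERE (first_name||' '||last_name) = 'MARY SMITH';
```

Writing the condition as email = '...' or concatenating the columns in a different way would not match either index.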

L-7 Indexing Options: full text indexing, gist vs. gin

  gist                      gin
  ----------------------    ----------------------
  old school                new in 8.2
  slower for queries        faster for queries
  faster insert / update    slower insert / update
  mature                    green
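
As a sketch of what a gin full text index looks like, the example below uses the text search functions that are built into PostgreSQL 8.3 and later; on 8.2 the equivalent comes from the tsearch2 contrib module. The index name and search terms are made up, and pagila's film table is assumed to have a text description column:

```sql
-- Expression index over the parsed description.
CREATE INDEX film_description_fts_idx
    ON film USING gin (to_tsvector('english', description));

-- The query repeats the indexed expression so the index can be used.
SELECT title
  FROM film
 WHERE to_tsvector('english', description)
       @@ to_tsquery('english', 'robot & shark');
```

Swapping gin for gist in the CREATE INDEX statement gives the other side of the trade-off described above.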

N-1 PostgreSQL Tablespaces tablespaces? define logical locations for object placement point to locations on disk (uses symlinks) size determined by disk size (not pre-ordained) dedicate per db, split db across multiple tblspc

N-2 PostgreSQL Tablespaces tablespaces! split database over separate disks use stat, statio tables to gauge disk access create dedicated storage for workloads disk for read / write disk for read only large, slow disk for archiving disk for indexes
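
A minimal sketch of dedicating disks to workloads. The tablespace names, index name and directory paths are hypothetical; each directory must already exist, be empty, and be owned by the operating system user that runs PostgreSQL, and creating a tablespace requires superuser:

```sql
-- One tablespace on a fast disk for indexes, one on a large slow disk.
CREATE TABLESPACE index_disk   LOCATION '/mnt/fast1/pg_indexes';
CREATE TABLESPACE archive_disk LOCATION '/mnt/slow1/pg_archive';

-- Place a new index on the fast disk.
CREATE INDEX payment_customer_ts_idx
    ON payment (customer_id) TABLESPACE index_disk;

-- Move an existing table to the archive disk
-- (the table is locked while its files are copied).
ALTER TABLE rental SET TABLESPACE archive_disk;
```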

Q-2 PostgreSQL Partitioning: partitioning!
  as table size grows, it becomes unmanageable
  use inheritance, rules, constraints to split data
  queries ignore non-relevant partitions
  could be its own talk
  http://www.pgcon.org/2007/schedule/events/41.en.html

Q-3 PostgreSQL Partitioning partitioning : key points determine list vs. range use triggers rather than rules partition creation vs. data population automate maintenance
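
The key points can be sketched as a small range-partitioned table built from inheritance, CHECK constraints and an insert trigger (rather than rules). The events table, its columns and the partition names are all hypothetical, and plpgsql must be installed in the database:

```sql
-- Parent table: holds no rows itself, only defines the structure.
CREATE TABLE events (
    event_id   bigint    NOT NULL,
    created_at timestamp NOT NULL,
    payload    text
);

-- One child per month; the CHECK constraint describes its range.
CREATE TABLE events_2007_05 (
    CHECK (created_at >= '2007-05-01' AND created_at < '2007-06-01')
) INHERITS (events);

CREATE TABLE events_2007_06 (
    CHECK (created_at >= '2007-06-01' AND created_at < '2007-07-01')
) INHERITS (events);

-- Route inserts on the parent to the matching child.
CREATE OR REPLACE FUNCTION events_insert_router() RETURNS trigger AS $$
BEGIN
    IF NEW.created_at >= '2007-05-01' AND NEW.created_at < '2007-06-01' THEN
        INSERT INTO events_2007_05 VALUES (NEW.*);
    ELSIF NEW.created_at >= '2007-06-01' AND NEW.created_at < '2007-07-01' THEN
        INSERT INTO events_2007_06 VALUES (NEW.*);
    ELSE
        RAISE EXCEPTION 'no partition for created_at = %', NEW.created_at;
    END IF;
    RETURN NULL;  -- the row is already stored in a child table
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER events_insert_trigger
    BEFORE INSERT ON events
    FOR EACH ROW EXECUTE PROCEDURE events_insert_router();

-- Let the planner skip children whose CHECK constraint rules them out.
SET constraint_exclusion = on;

EXPLAIN
SELECT * FROM events
 WHERE created_at >= '2007-06-10' AND created_at < '2007-06-11';
```

The ELSE branch is where "partition creation vs. data population" bites: a row for a month with no child table fails loudly, so new partitions and the trigger function need to be created ahead of the data, which is the part worth automating.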

K-0 Other Stuff I Should Mention

K-1 Other Stuff I Should Mention tsearch2 full text search interface /contrib module (integrated in 8.3?) very advanced capabilities (rank, headline) missing some things (wildcard) good performance for db's better performance with problem specific tools make it live in its own schema

K-2 Other Stuff I Should Mention pgcrypto cryptography type functions /contrib (export issues) md5, sha1, blowfish, many more
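
A short sketch of the kind of functions pgcrypto provides, assuming the module is already installed in the database (via the contrib SQL script on 8.x, or CREATE EXTENSION pgcrypto on 9.1 and later). The input strings are made up:

```sql
-- SHA-1 digest of a string, rendered as hex.
SELECT encode(digest('hello world', 'sha1'), 'hex') AS sha1_hex;

-- Salted blowfish password hash; the result differs on every call.
SELECT crypt('s3cret', gen_salt('bf')) AS bf_hash;

-- Verifying a password: re-hash the candidate using the stored hash as salt.
SELECT crypt('s3cret', h) = h AS password_ok
  FROM (SELECT crypt('s3cret', gen_salt('bf')) AS h) s;
```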

K-3 Other Stuff I Should Mention dblink pg -> pg connections /contrib (still under development?) can have performance issues on large queries make it live in its own schema beware security issue
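
A minimal sketch of a dblink call, assuming the module is installed and that the connection string points at a reachable pagila database. Because dblink returns generic records, the caller has to name and type the result columns:

```sql
-- Run a query in another database and use the result as a table.
SELECT *
  FROM dblink('dbname=pagila',
              'select customer_id, email from customer where activebool')
       AS remote(customer_id integer, email text);
```

The whole remote result set is fetched before the local query sees any of it, which is one source of the performance issues on large queries.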

K-4 Other Stuff I Should Mention autonomous logging tool persistent logging for postgresql functions built on top of dblink make it live in its own schema https://labs.omniti.com/trac/pgsoltools

K-4 Other Stuff I Should Mention snapshot pitr clones full read/write copy of pitr slave static snapshot need solaris (zfs zone mojo) https://labs.omniti.com/trac/pgsoltools

K-4 Other Stuff I Should Mention pgbouncer connection pooling application 100-1 pgb/db connection ratio small community, good results https://developer.skype.com/SkypeGarage/DbProjects/PgBouncer

K-5 Other Stuff I Should Mention dbi-link heterogeneous connections for postgresql built on plperl / dbi moving toward sql-med support similar performance issues to dblink http://pgfoundry.org/projects/dbi-link/

K-6 Other Stuff I Should Mention phppgadmin web based gui for postgresql remote administration of multiple servers implements much of postgresql functionality support back to 7.1? http://phppgadmin.sourceforge.net/

K-7 Other Stuff I Should Mention ;-) my book?

K-8 Other Stuff I Should Mention ;-) we're hiring software engineers web programmers ui/graphics http://www.omniti.com/people/jobs
