A PostgreSQL DBAs Toolbelt for 2018 Kaarel Moppel / PGConf APAC 2018 24.03.2018 Kaarel Moppel / PGConf APAC 2018 24.03.2018
Disclaimer ▶ Slide contents freely usable/publishable under Creative Commons licence ▶ All views are my personal subjective views Kaarel Moppel / PGConf APAC 2018 24.03.2018
Fields of interest for DBAs ▶ Setting up Postgres ▶ Monitoring ▶ Daily DBA tasks ▶ Navigating the DB-land ▶ Data integration ▶ Developing ▶ HA ▶ Scaling Kaarel Moppel / PGConf APAC 2018 24.03.2018
Setting up Postgres Kaarel Moppel / PGConf APAC 2018 24.03.2018
First stop before any tools Tools are great but understanding the concepts is still more important. Means provided by the Postgres project should be quite sufficient in half of the cases. ▶ Official documentation ▶ Mailing lists ▶ wiki.postgresql.org Kaarel Moppel / PGConf APAC 2018 24.03.2018
pgtune Reads postgresql.conf file and outputs a new, recommended configuration based on your hardware and workload type (Web, OLTP, DW, mixed). ▶ Online version available - http://pgtune.leopard.in.ua/ ▶ New kid on the block - http://pgconfigurator.cybertec.at/ More hints about tuning: ▶ https://wiki.postgresql.org/wiki/Performance_Optimization Kaarel Moppel / PGConf APAC 2018 24.03.2018
pgbench “Standard” tool for quick hardware capability benchmarking. Bundeled with server binaries. ▶ Measures throughput in terms of TPS ▶ TPC-B like simple banking schema ▶ Custom workloads possible ▶ Foreign keys for more realistic numbers ▶ Play with fillfactor More: ▶ https://wiki.postgresql.org/wiki/Category:Benchmarking Kaarel Moppel / PGConf APAC 2018 24.03.2018
Backups ▶ pg_dump in parallel mode ▶ wal-e - cloud oriented ▶ wal-r - wal-e improved ▶ pgbarman - incremental backup, continous WAL streaming ▶ pgbackrest - incremental / differential backups, geared for speed and >TB DBs ▶ pitrery - simple PITR Kaarel Moppel / PGConf APAC 2018 24.03.2018
Monitoring (ad hoc) Kaarel Moppel / PGConf APAC 2018 24.03.2018
Unix/Linux tooling Get acquainted with some as not all are available on all platforms https://wiki.postgresql.org/wiki/Performance_Analysis_Tools ▶ top / htop ▶ iotop / dstat Kaarel Moppel / PGConf APAC 2018 24.03.2018
pg_activity A top like application for PostgreSQL server activity monitoring. Works both locally and remotely. ▶ Blocked/blocking queries view ▶ Terminating backends possible ▶ Some resource usage info available Kaarel Moppel / PGConf APAC 2018 24.03.2018
pg_view Activity view utility with very detailed system information. ▶ Mainly for “on the server” usage ▶ Linux only, but very lightweight ▶ Out-of-disk estimate alert ▶ Different output formats ▶ Detailed resource usage info available Kaarel Moppel / PGConf APAC 2018 24.03.2018
pgcenter Command line monitoring/management tool that does almost everything. ▶ “top” features ▶ pg_stat_* monitoring ▶ session termination ▶ server config editing ▶ logfile access Kaarel Moppel / PGConf APAC 2018 24.03.2018
pg_stat_statements A “must have” for quick problem detection. Enables ordering of often executed queries by avg. runtime, total time, blocks written/read, IO time. ▶ Contrib extension ▶ Minor performance penalty ▶ Anonymized query texts (mostly) Kaarel Moppel / PGConf APAC 2018 24.03.2018
Monitoring frameworks Kaarel Moppel / PGConf APAC 2018 24.03.2018
Generic monitoring frameworks Plugins to generic monitoring frameworks and APM providers - Nagios, Icinga, Munin, AppDynamics, New Relic, Datadog, etc… ▶ Some monitoring comes built in with cloud providers like AWS RDS and Heroku ▶ Most commercial providers have black-box approach Kaarel Moppel / PGConf APAC 2018 24.03.2018
check_postgres A Perl script with around 50 checks (actions) to answer most important questions around Postgres - is DB answering, how many sessions, longest/idle tx time, last checkpoint, bloat percentage, etc. ▶ Integrates with Nagios and some other general systems monitoring frameworks ▶ Knows quite a bit about Postgres ecosystem (PgBouncer, PgAgent, Slony) ▶ Can be well used from command line for ad hoc stuff Kaarel Moppel / PGConf APAC 2018 24.03.2018
check_pgactivity Nagios plugin. Similar to check_postgres but with less checks (services). ▶ Actively maintained Kaarel Moppel / PGConf APAC 2018 24.03.2018
PoWa Integrated data collection in server + external webapp ▶ pg_stat_statements + some custom data gatherers ▶ index suggestions! ▶ Not so light / dev. friendly - needs some extensions and setup Kaarel Moppel / PGConf APAC 2018 24.03.2018
pgwatch2 A new tool from Cybertec with minimalistic but flexible approach, relies heavily on state-of-the-art components like Grafana and InfluxDB. ▶ Almost all of pg_stat_* + some custom data gatherers ▶ User extensible metrics collection with SQL ▶ Developer friendly - no extensions or special rights, Docker first Kaarel Moppel / PGConf APAC 2018 24.03.2018
pghero Health checks, suggested indexes, and more ▶ Seems to be the most popular Postgres monitoring project on Github ▶ Thresholds and limits hardcoded ▶ Docker for quick testing Kaarel Moppel / PGConf APAC 2018 24.03.2018
Tools for common tasks Kaarel Moppel / PGConf APAC 2018 24.03.2018
Snippets There’s a ton of snippets from Postgresql Wiki to help with daily work. Might not always be 100% up to date but mostly nicely commented. ▶ Deleting duplicates ▶ Fixing sequences ▶ Finding / deleting dependencies ▶ Invalid / unused / duplicate indexes ▶ Unindexed foreign keys ▶ Bloat Kaarel Moppel / PGConf APAC 2018 24.03.2018
Other script-sets Activity, biggest tables, indexing problems, bloat, locking, foreign keys, … ▶ https://github.com/pgexperts/pgx_scripts ▶ https://github.com/dataegret/pg-utils Kaarel Moppel / PGConf APAC 2018 24.03.2018
Log analyzing Kaarel Moppel / PGConf APAC 2018 24.03.2018
DIY with standard Linux tools ▶ grep ▶ awk ▶ rsync Kaarel Moppel / PGConf APAC 2018 24.03.2018
pgBadger “de facto” log analyzer for PostgreSQL producing nice HTML pages with graphs. Dozens and dozens of options. ▶ Slowest/most frequent/most time consuming queries ▶ Sessions / locking / temp files ▶ TPS / query type distribution ▶ parallel/incremental processing ▶ PgBouncer logfiles support Kaarel Moppel / PGConf APAC 2018 24.03.2018
Other approaches ▶ Cloud provider plugins - Loggly, Splunk ▶ redislog extension for the ELK stack ▶ Importing directly into Postgres tables with COPY (easiest with CVS format) ▶ Exposing logfiles via file_fdw Kaarel Moppel / PGConf APAC 2018 24.03.2018
Indexing & bloat Kaarel Moppel / PGConf APAC 2018 24.03.2018
Indexing ▶ Snippetes from wiki.postgresql.org ▶ hypopg - extension for building “hypothetical” indexes ▶ pghero - index suggestions. Uses query parser from https://github.com/lfittl/pg_query ▶ pg_qualstats Kaarel Moppel / PGConf APAC 2018 24.03.2018
Bloat Kaarel Moppel / PGConf APAC 2018 24.03.2018
Spotting bloat ▶ “pgstattuple” contrib extension ▶ SQL Snippets from wiki.postgresql.org ▶ check_pgactivity - https://github.com/OPMDG/check_pgactivity ▶ Somekind of indicators from most monitoring tools Kaarel Moppel / PGConf APAC 2018 24.03.2018
Removing bloat ▶ VACUUM FULL / CLUSTER ▶ pg_repack ▶ pg_squeeze ▶ pgcompacttable Kaarel Moppel / PGConf APAC 2018 24.03.2018
Working with data Kaarel Moppel / PGConf APAC 2018 24.03.2018
Navigating Postgres DBs ▶ psql ▶ .psqlrc (per version / DB) ▶ variable interpolation ▶ crosstab ▶ if/else (PG10) ▶ https://github.com/okbob/pspg ▶ libpq service files (~/.pg_service.conf) ▶ psql_switch ▶ pgcli ▶ widely popular and actively developed ▶ code completes also functions! ▶ chokes noticably on big datasets Kaarel Moppel / PGConf APAC 2018 24.03.2018
Navigating Postgres DBs graphically ▶ pgadmin3/4 ▶ DataGrip ▶ pgModeler Kaarel Moppel / PGConf APAC 2018 24.03.2018
Making data available ▶ redash - visualize and share your data ▶ SqlPad ▶ save and share queries + basic charting ▶ RESTful Postgres ▶ https://github.com/begriffs/postgrest Kaarel Moppel / PGConf APAC 2018 24.03.2018
Queries over many Postgres DBs ▶ dblink ▶ postgres_fdw ▶ PrestoDB Kaarel Moppel / PGConf APAC 2018 24.03.2018
Data integrations ▶ COPY ▶ file_fdw (“program” option in PG10) ▶ Foreign Data Wrappers ▶ Your proprietary database / NoSQL store is most probably already supported ▶ https://wiki.postgresql.org/wiki/Foreign_data_wrappers ▶ pgloader - 1-click MySQL migrations, CSV Kaarel Moppel / PGConf APAC 2018 24.03.2018
Helper tools for developers ▶ https://explain.depesz.com/ ▶ sqlparse - A non-validating SQL parser module for Python. Supports plpgsql, tokenizing, extracting single statements, formatting with customizations. ▶ pgFormatter - https://github.com/darold/pgFormatter ▶ audit-trigger ▶ pgaudit ▶ pg_partman* Kaarel Moppel / PGConf APAC 2018 24.03.2018
Helper tools for developers ▶ plpgsql_check - extra validations for your sprocs ▶ plprofiler - per line code coverage and performance info ▶ pldebugger - step through code visually in PgAdmin ▶ pgTap - unit testing for Postgres ▶ Docker - easiest way to try out various versions Kaarel Moppel / PGConf APAC 2018 24.03.2018
Rollout support ▶ sqitch - app. integrated migration management ▶ pgdiff - generate diffs from 2 DBS ▶ Pyrseas - generate diffs based on YAML/JSON Kaarel Moppel / PGConf APAC 2018 24.03.2018
Architecture Kaarel Moppel / PGConf APAC 2018 24.03.2018
Scaling ▶ PgBouncer ▶ pgpool-II ▶ Partitioning (builtin in PG10) ▶ Logical Replication / pglogical ▶ Foreign table inheritance Kaarel Moppel / PGConf APAC 2018 24.03.2018
Scaling ▶ cstore_fdw ▶ Citus ▶ Postgres-XL ▶ PL/Proxy ▶ BDR Kaarel Moppel / PGConf APAC 2018 24.03.2018
High Availability ▶ PAF ▶ repmgr ▶ Patroni Kaarel Moppel / PGConf APAC 2018 24.03.2018
Thank you! Kaarel Moppel / PGConf APAC 2018 24.03.2018
Contact us Austria / Switzerland / Estonia www.cybertec-postgresql.com Github: github.com/cybertec-postgresql Twitter: @PostgresSupport Kaarel Moppel / PGConf APAC 2018 24.03.2018

PGConf APAC 2018 - A PostgreSQL DBAs Toolbelt for 2018

  • 1.
    A PostgreSQL DBAsToolbelt for 2018 Kaarel Moppel / PGConf APAC 2018 24.03.2018 Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 2.
    Disclaimer ▶ Slide contentsfreely usable/publishable under Creative Commons licence ▶ All views are my personal subjective views Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 3.
    Fields of interestfor DBAs ▶ Setting up Postgres ▶ Monitoring ▶ Daily DBA tasks ▶ Navigating the DB-land ▶ Data integration ▶ Developing ▶ HA ▶ Scaling Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 4.
    Setting up Postgres KaarelMoppel / PGConf APAC 2018 24.03.2018
  • 5.
    First stop beforeany tools Tools are great but understanding the concepts is still more important. Means provided by the Postgres project should be quite sufficient in half of the cases. ▶ Official documentation ▶ Mailing lists ▶ wiki.postgresql.org Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 6.
    pgtune Reads postgresql.conf fileand outputs a new, recommended configuration based on your hardware and workload type (Web, OLTP, DW, mixed). ▶ Online version available - http://pgtune.leopard.in.ua/ ▶ New kid on the block - http://pgconfigurator.cybertec.at/ More hints about tuning: ▶ https://wiki.postgresql.org/wiki/Performance_Optimization Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 7.
    pgbench “Standard” tool forquick hardware capability benchmarking. Bundeled with server binaries. ▶ Measures throughput in terms of TPS ▶ TPC-B like simple banking schema ▶ Custom workloads possible ▶ Foreign keys for more realistic numbers ▶ Play with fillfactor More: ▶ https://wiki.postgresql.org/wiki/Category:Benchmarking Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 8.
    Backups ▶ pg_dump inparallel mode ▶ wal-e - cloud oriented ▶ wal-r - wal-e improved ▶ pgbarman - incremental backup, continous WAL streaming ▶ pgbackrest - incremental / differential backups, geared for speed and >TB DBs ▶ pitrery - simple PITR Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 9.
    Monitoring (ad hoc) KaarelMoppel / PGConf APAC 2018 24.03.2018
  • 10.
    Unix/Linux tooling Get acquaintedwith some as not all are available on all platforms https://wiki.postgresql.org/wiki/Performance_Analysis_Tools ▶ top / htop ▶ iotop / dstat Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 11.
    pg_activity A top likeapplication for PostgreSQL server activity monitoring. Works both locally and remotely. ▶ Blocked/blocking queries view ▶ Terminating backends possible ▶ Some resource usage info available Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 12.
    pg_view Activity view utilitywith very detailed system information. ▶ Mainly for “on the server” usage ▶ Linux only, but very lightweight ▶ Out-of-disk estimate alert ▶ Different output formats ▶ Detailed resource usage info available Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 13.
    pgcenter Command line monitoring/managementtool that does almost everything. ▶ “top” features ▶ pg_stat_* monitoring ▶ session termination ▶ server config editing ▶ logfile access Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 14.
    pg_stat_statements A “must have”for quick problem detection. Enables ordering of often executed queries by avg. runtime, total time, blocks written/read, IO time. ▶ Contrib extension ▶ Minor performance penalty ▶ Anonymized query texts (mostly) Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 15.
    Monitoring frameworks Kaarel Moppel/ PGConf APAC 2018 24.03.2018
  • 16.
    Generic monitoring frameworks Pluginsto generic monitoring frameworks and APM providers - Nagios, Icinga, Munin, AppDynamics, New Relic, Datadog, etc… ▶ Some monitoring comes built in with cloud providers like AWS RDS and Heroku ▶ Most commercial providers have black-box approach Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 17.
    check_postgres A Perl scriptwith around 50 checks (actions) to answer most important questions around Postgres - is DB answering, how many sessions, longest/idle tx time, last checkpoint, bloat percentage, etc. ▶ Integrates with Nagios and some other general systems monitoring frameworks ▶ Knows quite a bit about Postgres ecosystem (PgBouncer, PgAgent, Slony) ▶ Can be well used from command line for ad hoc stuff Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 18.
    check_pgactivity Nagios plugin. Similarto check_postgres but with less checks (services). ▶ Actively maintained Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 19.
    PoWa Integrated data collectionin server + external webapp ▶ pg_stat_statements + some custom data gatherers ▶ index suggestions! ▶ Not so light / dev. friendly - needs some extensions and setup Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 20.
    pgwatch2 A new toolfrom Cybertec with minimalistic but flexible approach, relies heavily on state-of-the-art components like Grafana and InfluxDB. ▶ Almost all of pg_stat_* + some custom data gatherers ▶ User extensible metrics collection with SQL ▶ Developer friendly - no extensions or special rights, Docker first Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 21.
    pghero Health checks, suggestedindexes, and more ▶ Seems to be the most popular Postgres monitoring project on Github ▶ Thresholds and limits hardcoded ▶ Docker for quick testing Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 22.
    Tools for commontasks Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 23.
    Snippets There’s a tonof snippets from Postgresql Wiki to help with daily work. Might not always be 100% up to date but mostly nicely commented. ▶ Deleting duplicates ▶ Fixing sequences ▶ Finding / deleting dependencies ▶ Invalid / unused / duplicate indexes ▶ Unindexed foreign keys ▶ Bloat Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 24.
    Other script-sets Activity, biggesttables, indexing problems, bloat, locking, foreign keys, … ▶ https://github.com/pgexperts/pgx_scripts ▶ https://github.com/dataegret/pg-utils Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 25.
    Log analyzing Kaarel Moppel/ PGConf APAC 2018 24.03.2018
  • 26.
    DIY with standardLinux tools ▶ grep ▶ awk ▶ rsync Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 27.
    pgBadger “de facto” loganalyzer for PostgreSQL producing nice HTML pages with graphs. Dozens and dozens of options. ▶ Slowest/most frequent/most time consuming queries ▶ Sessions / locking / temp files ▶ TPS / query type distribution ▶ parallel/incremental processing ▶ PgBouncer logfiles support Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 28.
    Other approaches ▶ Cloudprovider plugins - Loggly, Splunk ▶ redislog extension for the ELK stack ▶ Importing directly into Postgres tables with COPY (easiest with CVS format) ▶ Exposing logfiles via file_fdw Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 29.
    Indexing & bloat KaarelMoppel / PGConf APAC 2018 24.03.2018
  • 30.
    Indexing ▶ Snippetes fromwiki.postgresql.org ▶ hypopg - extension for building “hypothetical” indexes ▶ pghero - index suggestions. Uses query parser from https://github.com/lfittl/pg_query ▶ pg_qualstats Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 31.
    Bloat Kaarel Moppel /PGConf APAC 2018 24.03.2018
  • 32.
    Spotting bloat ▶ “pgstattuple”contrib extension ▶ SQL Snippets from wiki.postgresql.org ▶ check_pgactivity - https://github.com/OPMDG/check_pgactivity ▶ Somekind of indicators from most monitoring tools Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 33.
    Removing bloat ▶ VACUUMFULL / CLUSTER ▶ pg_repack ▶ pg_squeeze ▶ pgcompacttable Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 34.
    Working with data KaarelMoppel / PGConf APAC 2018 24.03.2018
  • 35.
    Navigating Postgres DBs ▶psql ▶ .psqlrc (per version / DB) ▶ variable interpolation ▶ crosstab ▶ if/else (PG10) ▶ https://github.com/okbob/pspg ▶ libpq service files (~/.pg_service.conf) ▶ psql_switch ▶ pgcli ▶ widely popular and actively developed ▶ code completes also functions! ▶ chokes noticably on big datasets Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 36.
    Navigating Postgres DBsgraphically ▶ pgadmin3/4 ▶ DataGrip ▶ pgModeler Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 37.
    Making data available ▶redash - visualize and share your data ▶ SqlPad ▶ save and share queries + basic charting ▶ RESTful Postgres ▶ https://github.com/begriffs/postgrest Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 38.
    Queries over manyPostgres DBs ▶ dblink ▶ postgres_fdw ▶ PrestoDB Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 39.
    Data integrations ▶ COPY ▶file_fdw (“program” option in PG10) ▶ Foreign Data Wrappers ▶ Your proprietary database / NoSQL store is most probably already supported ▶ https://wiki.postgresql.org/wiki/Foreign_data_wrappers ▶ pgloader - 1-click MySQL migrations, CSV Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 40.
    Helper tools fordevelopers ▶ https://explain.depesz.com/ ▶ sqlparse - A non-validating SQL parser module for Python. Supports plpgsql, tokenizing, extracting single statements, formatting with customizations. ▶ pgFormatter - https://github.com/darold/pgFormatter ▶ audit-trigger ▶ pgaudit ▶ pg_partman* Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 41.
    Helper tools fordevelopers ▶ plpgsql_check - extra validations for your sprocs ▶ plprofiler - per line code coverage and performance info ▶ pldebugger - step through code visually in PgAdmin ▶ pgTap - unit testing for Postgres ▶ Docker - easiest way to try out various versions Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 42.
    Rollout support ▶ sqitch- app. integrated migration management ▶ pgdiff - generate diffs from 2 DBS ▶ Pyrseas - generate diffs based on YAML/JSON Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 43.
    Architecture Kaarel Moppel /PGConf APAC 2018 24.03.2018
  • 44.
    Scaling ▶ PgBouncer ▶ pgpool-II ▶Partitioning (builtin in PG10) ▶ Logical Replication / pglogical ▶ Foreign table inheritance Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 45.
    Scaling ▶ cstore_fdw ▶ Citus ▶Postgres-XL ▶ PL/Proxy ▶ BDR Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 46.
    High Availability ▶ PAF ▶repmgr ▶ Patroni Kaarel Moppel / PGConf APAC 2018 24.03.2018
  • 47.
    Thank you! Kaarel Moppel/ PGConf APAC 2018 24.03.2018
  • 48.
    Contact us Austria /Switzerland / Estonia www.cybertec-postgresql.com Github: github.com/cybertec-postgresql Twitter: @PostgresSupport Kaarel Moppel / PGConf APAC 2018 24.03.2018