10 things, an Oracle DBA should care about when moving to PostgreSQL Ilya Kosmodemiansky ik@postgresql-consulting.com
Well, working with Oracle... • You definitely have some ideas how a database should looks like • OK, PostgreSQL (unintended) follows that ideas quit good • ...at least much better than DB2 or SQL Server • But in spite of all these
Moving to Postgres You can have at least two issues • How to transfer your Oracle best practices to PostgrSQL • What kind of workload can Postgres manage in comparison with Oracle • Those issues are quite tightly interconnected • Let’s start from simple things
1. Oracle recommends SGA as a 40% of RAM • PostgreSQL is always in dedicated mode • Closest to SGA PostgreSQL setting is shared_buffers • But Postgres uses buffered IO, so 40% is a worst case in terms of double caching • Rule of thumb is 25% - together with kernel buffer it will be 50% • But if your database definitely fits into memory - it can be up to 75%. That means you should not worry about double caching trade off
2. I need to secure my listener • PostgreSQL’s listener is pgbouncer - a connection pooler • It is not in the box, you need to install it in 99,9% of cases • It is pretty well secured (Postgres as well) • But you need to configure it • Most likely you will need pool_mode=transaction with pool size about 10-80 and max_client_conn up to 10000 • With such setting you will definitely need to increase ulimit on Linux/Unix
3. Is there some RMAN analogue? • There is no RMAN from the box, you simply use some command-line tools. In Postgres we call all of this PITR • You can call pg_basebackup or perform the same thing manually using for example rsync • Effectively, you perform level 0 backup and can use archive logs to recover database to some incarnation • Everything is very simple: small recovery.conf in datadir commands Postgres to perform recovery automatically while starting up • There is no crosscheck. Take in mind your oracle best practice: test recovery is the only kind of guarantee
4. What should I know about Redo? • Redo logs we call WAL, they are in pg_xlog directory and the are redo-logs (do not contain undo information) • No groups, no multiplexing: different mechanism of writing and archiving log • You can move them to another device, just use simlink
5. What should I know about Undo? • There is no Undo segments, undo data located in datafiles • It is convenient: no ORA-01555 • But you need garbage collection, we call it autovacuum. • Do not oversee autovacuum! Never turn it off • You need to tune autovacuum a bit, but it is a separate talk • One by me myself http://goo.gl/Tq0YzT • A good one by Jim Nasby http://goo.gl/K0PDtq
6. What about RAC analogue? • RAC is a very Oracle thing • The idea of multinode ”cluster” with single bottleneck (storage) possibly will be never implemented in PostgreSQL (and for the reason) • There are several true-2PC clusters in Postgres and counting (take a look at least on BDR, Postgres-XC/XL if you are sure you need it)
7. What about Active DataGuard analogue? • Actually my favorite high availability approach • Streaming replication, HotStandby. Basically it is a physical standby in Oracle terms. • Logical is on the way (as a DataGuard Logical mode analogue), but you can always use Slony • It is much more simple to manage in PostgreSQL!
8. Partitioning • OK, there is no Oracle-like partitioning right now • You can use inheritance or do-it-yourself partitioning • Take in mind: Keep the amount of partitions small - archive old partitions Keep PostgreSQL version fresh
9. Main IO performance problem • Usually not redo log writing or archiving • May be number one - checkpoint IO issue • Buffered IO + single checkpointer process = all dirty pages from shared_buffers will hit your disks • Be prepared: Read carefully all docs about checkpoints You need to understand Linux operations as weel Again - checklist talk http://goo.gl/b7wjDC
10. What kind of workload PostgreSQL can handle? Well, a nuclear submarine is the modern engeneering marwell. But a bit expensive and complicated to operate to use it for weekend-sailing or even as a cargo ship • Virtually one and every if you are prepared - well enough tool • May be not a 200-300K random updates per second on a single machine It is not an easy task in Oracle too Price for various sorts of clustering is much lower than in Oracle There are examples of handling 10-50K random updates on single server - are you sure you need more? • PostgreSQL is really easy to use database
Questions? ik@postgresql-consulting.com

10 things, an Oracle DBA should care about when moving to PostgreSQL

  • 1.
    10 things, anOracle DBA should care about when moving to PostgreSQL Ilya Kosmodemiansky ik@postgresql-consulting.com
  • 2.
    Well, working withOracle... • You definitely have some ideas how a database should looks like • OK, PostgreSQL (unintended) follows that ideas quit good • ...at least much better than DB2 or SQL Server • But in spite of all these
  • 3.
    Moving to Postgres Youcan have at least two issues • How to transfer your Oracle best practices to PostgrSQL • What kind of workload can Postgres manage in comparison with Oracle • Those issues are quite tightly interconnected • Let’s start from simple things
  • 4.
    1. Oracle recommendsSGA as a 40% of RAM • PostgreSQL is always in dedicated mode • Closest to SGA PostgreSQL setting is shared_buffers • But Postgres uses buffered IO, so 40% is a worst case in terms of double caching • Rule of thumb is 25% - together with kernel buffer it will be 50% • But if your database definitely fits into memory - it can be up to 75%. That means you should not worry about double caching trade off
  • 5.
    2. I needto secure my listener • PostgreSQL’s listener is pgbouncer - a connection pooler • It is not in the box, you need to install it in 99,9% of cases • It is pretty well secured (Postgres as well) • But you need to configure it • Most likely you will need pool_mode=transaction with pool size about 10-80 and max_client_conn up to 10000 • With such setting you will definitely need to increase ulimit on Linux/Unix
  • 6.
    3. Is theresome RMAN analogue? • There is no RMAN from the box, you simply use some command-line tools. In Postgres we call all of this PITR • You can call pg_basebackup or perform the same thing manually using for example rsync • Effectively, you perform level 0 backup and can use archive logs to recover database to some incarnation • Everything is very simple: small recovery.conf in datadir commands Postgres to perform recovery automatically while starting up • There is no crosscheck. Take in mind your oracle best practice: test recovery is the only kind of guarantee
  • 7.
    4. What shouldI know about Redo? • Redo logs we call WAL, they are in pg_xlog directory and the are redo-logs (do not contain undo information) • No groups, no multiplexing: different mechanism of writing and archiving log • You can move them to another device, just use simlink
  • 8.
    5. What shouldI know about Undo? • There is no Undo segments, undo data located in datafiles • It is convenient: no ORA-01555 • But you need garbage collection, we call it autovacuum. • Do not oversee autovacuum! Never turn it off • You need to tune autovacuum a bit, but it is a separate talk • One by me myself http://goo.gl/Tq0YzT • A good one by Jim Nasby http://goo.gl/K0PDtq
  • 9.
    6. What aboutRAC analogue? • RAC is a very Oracle thing • The idea of multinode ”cluster” with single bottleneck (storage) possibly will be never implemented in PostgreSQL (and for the reason) • There are several true-2PC clusters in Postgres and counting (take a look at least on BDR, Postgres-XC/XL if you are sure you need it)
  • 10.
    7. What aboutActive DataGuard analogue? • Actually my favorite high availability approach • Streaming replication, HotStandby. Basically it is a physical standby in Oracle terms. • Logical is on the way (as a DataGuard Logical mode analogue), but you can always use Slony • It is much more simple to manage in PostgreSQL!
  • 11.
    8. Partitioning • OK,there is no Oracle-like partitioning right now • You can use inheritance or do-it-yourself partitioning • Take in mind: Keep the amount of partitions small - archive old partitions Keep PostgreSQL version fresh
  • 12.
    9. Main IOperformance problem • Usually not redo log writing or archiving • May be number one - checkpoint IO issue • Buffered IO + single checkpointer process = all dirty pages from shared_buffers will hit your disks • Be prepared: Read carefully all docs about checkpoints You need to understand Linux operations as weel Again - checklist talk http://goo.gl/b7wjDC
  • 13.
    10. What kindof workload PostgreSQL can handle? Well, a nuclear submarine is the modern engeneering marwell. But a bit expensive and complicated to operate to use it for weekend-sailing or even as a cargo ship • Virtually one and every if you are prepared - well enough tool • May be not a 200-300K random updates per second on a single machine It is not an easy task in Oracle too Price for various sorts of clustering is much lower than in Oracle There are examples of handling 10-50K random updates on single server - are you sure you need more? • PostgreSQL is really easy to use database
  • 14.