© 2015 EnterpriseDB Corporation. All rights reserved. 1 The Central View of your Data with Postgres To listen to the recording of this presentation visit EnterpriseDB > Resources > Webcasts > On demand Webcasts
© 2015 EnterpriseDB Corporation. All rights reserved. 2 •  Different applications generate different types of data •  And various database technologies have evolved as efficient ways to store that data Data Is Everywhere Web Logs Sensor Data Customer Interactions Purchase History Transaction Logs MongoDB Hadoop Cassandra RedisPostgreSQL CouchDB Social Media THERE IS NO CENTRAL VIEW!
© 2015 EnterpriseDB Corporation. All rights reserved. 3 •  Data silos are inevitable. •  IT needs to be more agile in response to business requests for new views of data. •  Eventually, silos need to be joined with information from the system of record to provide value. Sample Use Cases •  Customer 360 •  Identification of regulated information •  Risk analysis •  Management of contract data •  Sensor collected data Why Do We Need A Central View?
© 2015 EnterpriseDB Corporation. All rights reserved. 4 In an Ideal World, You Would Have a Centralized View Customer Interactions OLTP Live Data Purchase History The Logical Data Warehouse Web LogsTransaction Logs •  We call this the Logical Data Warehouse
© 2015 EnterpriseDB Corporation. All rights reserved. 5 •  Core PostgeSQL features to support more diverse workloads −  Flexible Datatypes – JSON / JSONB and Key Value Store −  Unlogged tables to improve performance •  Foreign Data Wrappers (FDW) −  Use PostgreSQL as a central interface to connect to other systems to gather data and issue queries or joins −  Push-down for where and columns improve performance •  Postgres Plus features −  Resource Management to more effectively run mixed workloads −  EDB*Loader to load data from various sources EnterpriseDB’s Strategy Around Centralizing Data Focuses on 3 Areas
© 2015 EnterpriseDB Corporation. All rights reserved. 6 •  HSTORE −  Key-value pair −  Simple, fast and easy −  Postgres v 8.2 – pre-dates many NoSQL-only solutions −  Ideal for flat data structures that are sparsely populated •  JSON −  Hierarchical document model −  Introduced in Postgres 9.2, perfected in 9.3 •  JSONB −  Binary version of JSON −  Faster, more operators and even more robust −  Postgres 9.4 Postgres’ Provides Flexible Data Types
© 2015 EnterpriseDB Corporation. All rights reserved. 7 •  Creating a table with a JSONB field CREATE TABLE json_data (data JSONB);! •  Simple JSON data element: {"name": "Apple Phone", "type": "phone", "brand": "ACME", "price": 200, "available": true, "warranty_years": 1}! •  Inserting this data element into the table json_data INSERT INTO json_data (data) VALUES ! !(’ { !"name": "Apple Phone", ! ! !"type": "phone", ! ! !"brand": "ACME", ! ! !"price": 200, ! ! !"available": true, ! ! !"warranty_years": 1 ! !! !} ')! JSON Examples
© 2015 EnterpriseDB Corporation. All rights reserved. 8 SELECT DISTINCT ! !data->>'name' as products ! FROM json_data;
 ! products ! ------------------------------! Cable TV Basic Service Package! AC3 Case Black! Phone Service Basic Plan! AC3 Phone! AC3 Case Green! Phone Service Family Plan! AC3 Case Red! AC7 Phone! A simple query for JSON data This query does not return JSON data – it returns text values associated with the key ‘name’
© 2015 EnterpriseDB Corporation. All rights reserved. 9 SELECT DISTINCT product_type, data->>'brand' as Brand, data->>'available' as Availability FROM json_data JOIN products ON (products.product_type=json_data.data->>'name') WHERE json_data.data->>'available'=true; product_type | brand | availability ---------------------------+-----------+-------------- AC3 Phone | ACME | true JSON and ANSI SQL Example ANSI SQL JSON No need for programmatic logic to combine SQL and NoSQL in the application – Postgres does it all
© 2015 EnterpriseDB Corporation. All rights reserved. 10 •  Every write in PostgreSQL is essentially two writes due to Write Ahead Log (WAL) −  WAL guarantees durability and support replication •  Unlogged tables are freed from this constraint −  But tables are no longer crash safe! •  Can see good performance gain (~13-17%) Unlogged Tables Improve Performance
© 2015 EnterpriseDB Corporation. All rights reserved. 11 •  Make external data sources look like local tables •  Use SQL −  SELECT syntax; including useful clauses like DISTINCT, ORDER BY, GROUP BY and more. −  JOIN external data with internal tables −  FUNCTIONS for comparison, math, string, pattern matching, date/time, etc −  Starting in 9.3 - INSERT / UPDATE / DELETE too •  Predicate pushdown – Filter data on remote sources first! −  SELECT and WHERE clauses today −  Roadmap - Join, Group/Aggregate, Sort and Limit Some Background on Foreign Data Wrappers
© 2015 EnterpriseDB Corporation. All rights reserved. 12 https://www.github.com/EnterpriseDB •  We have contributed new Mongo FDW, MySQL FDW and HDFS (Hadoop) FDW •  We focused on adding Enterprise capabilities −  Connecting Pooling −  Predicate pushdowns −  Where clause pushdown −  Column pushdown −  Writeablity (DML support) EnterpriseDB is investing in Foreign Data Wrappers
© 2015 EnterpriseDB Corporation. All rights reserved. 13 The HDFS_FDW Postgres Plus Postgres HDFS_FDW
© 2015 EnterpriseDB Corporation. All rights reserved. 14 Example usage from HDFS_FDW
© 2015 EnterpriseDB Corporation. All rights reserved. 15 Example usage from HDFS_FDW
© 2015 EnterpriseDB Corporation. All rights reserved. 16 Current Release •  Alpha Release – February 2015 •  Support for HiveServer 1 & 2 •  Complete Select support •  Pushdown features (Where clause and query columns) •  Connection pooling •  Use of map reduce jobs for aggregates and complex query operations. Roadmap •  Writeability via Hbase •  Support for Flum/Impala server •  More pushdown features (joins, aggregates, sort etc) •  Custom Map Reduce Jobs •  Authentication support •  Installers to be provided via EnterpriseDB’s GUI Installer, RPM and DEB packages EnterpriseDB’s HDFS_FDW is Available and Improvements Planned
© 2015 EnterpriseDB Corporation. All rights reserved. 17 Postgres Plus Advanced Server Resource Manager (CPU & I/O) Reporting Transactions 80% 20% Run Mixed Workloads More Efficiently with PPAS 9.4 Resource Management •  DBA assigns CPU & I/O to job groups •  Allocates and prioritizes consumption of resources •  Low priority jobs don’t hurt high priority jobs
© 2015 EnterpriseDB Corporation. All rights reserved. 18 •  Create Resource Groups and assign the CPU Rate Limit •  Use these resource groups during a psql session Run Mixed Workloads More Efficiently with PPAS 9.4 Resource Management - Statements executed will be limited in CPU or writing to shared_buffers per resource group. - Individual limits are computed based on recent CPU / shared_buffer usage. - Processes sleep when necessary & avoid sleep when holding critical locks. - The limits are adjusted regularly based on current usage. - If multiple processes in the same group are being executed, aggregate usage will be limited CREATE RESOURCE GROUP resgrp_a; CREATE RESOURCE GROUP resgrp_b; ALTER RESOURCE GROUP resgrp_a SET cpu_rate_limit = .25; ALTER RESOURCE GROUP resgrp_a SET dirty_rate_limit = 12288; SET edb_resource_group TO res_grp_a;
© 2015 EnterpriseDB Corporation. All rights reserved. 19 •  Conventional path, direct path, and parallel direct load loading methods •  Data loading from standard input and remote loading, particularly useful for large data sources on remote hosts •  Input data with delimiter-separated or fixed-width fields •  Bad file for collecting rejected records •  Discard file for collecting records that do not meet the selection criteria of any target table •  Log file for recording the EDB*Loader session and any error messages Efficiently load data using EDB*Loader
© 2015 EnterpriseDB Corporation. All rights reserved. 20 •  Core PostgeSQL features to support more diverse workloads −  Flexible Datatypes – JSON / JSONB and Key Value Store −  Unlogged tables to improve performance •  Foreign Data Wrappers (FDW) −  Use PostgreSQL as a central interface to connect to other systems to gather data and issue queries or joins −  Push-down for where and columns improve performance •  Postgres Plus features −  Resource Management to more effectively run mixed workloads −  EDB*Loader to load data from various sources Recap: EnterpriseDB’s Strategy Around Centralizing Data Focuses on 3 Areas
© 2015 EnterpriseDB Corporation. All rights reserved. 21

The Central View of your Data with Postgres

  • 1.
    © 2015 EnterpriseDBCorporation. All rights reserved. 1 The Central View of your Data with Postgres To listen to the recording of this presentation visit EnterpriseDB > Resources > Webcasts > On demand Webcasts
  • 2.
    © 2015 EnterpriseDBCorporation. All rights reserved. 2 •  Different applications generate different types of data •  And various database technologies have evolved as efficient ways to store that data Data Is Everywhere Web Logs Sensor Data Customer Interactions Purchase History Transaction Logs MongoDB Hadoop Cassandra RedisPostgreSQL CouchDB Social Media THERE IS NO CENTRAL VIEW!
  • 3.
    © 2015 EnterpriseDBCorporation. All rights reserved. 3 •  Data silos are inevitable. •  IT needs to be more agile in response to business requests for new views of data. •  Eventually, silos need to be joined with information from the system of record to provide value. Sample Use Cases •  Customer 360 •  Identification of regulated information •  Risk analysis •  Management of contract data •  Sensor collected data Why Do We Need A Central View?
  • 4.
    © 2015 EnterpriseDBCorporation. All rights reserved. 4 In an Ideal World, You Would Have a Centralized View Customer Interactions OLTP Live Data Purchase History The Logical Data Warehouse Web LogsTransaction Logs •  We call this the Logical Data Warehouse
  • 5.
    © 2015 EnterpriseDBCorporation. All rights reserved. 5 •  Core PostgeSQL features to support more diverse workloads −  Flexible Datatypes – JSON / JSONB and Key Value Store −  Unlogged tables to improve performance •  Foreign Data Wrappers (FDW) −  Use PostgreSQL as a central interface to connect to other systems to gather data and issue queries or joins −  Push-down for where and columns improve performance •  Postgres Plus features −  Resource Management to more effectively run mixed workloads −  EDB*Loader to load data from various sources EnterpriseDB’s Strategy Around Centralizing Data Focuses on 3 Areas
  • 6.
    © 2015 EnterpriseDBCorporation. All rights reserved. 6 •  HSTORE −  Key-value pair −  Simple, fast and easy −  Postgres v 8.2 – pre-dates many NoSQL-only solutions −  Ideal for flat data structures that are sparsely populated •  JSON −  Hierarchical document model −  Introduced in Postgres 9.2, perfected in 9.3 •  JSONB −  Binary version of JSON −  Faster, more operators and even more robust −  Postgres 9.4 Postgres’ Provides Flexible Data Types
  • 7.
    © 2015 EnterpriseDBCorporation. All rights reserved. 7 •  Creating a table with a JSONB field CREATE TABLE json_data (data JSONB);! •  Simple JSON data element: {"name": "Apple Phone", "type": "phone", "brand": "ACME", "price": 200, "available": true, "warranty_years": 1}! •  Inserting this data element into the table json_data INSERT INTO json_data (data) VALUES ! !(’ { !"name": "Apple Phone", ! ! !"type": "phone", ! ! !"brand": "ACME", ! ! !"price": 200, ! ! !"available": true, ! ! !"warranty_years": 1 ! !! !} ')! JSON Examples
  • 8.
    © 2015 EnterpriseDBCorporation. All rights reserved. 8 SELECT DISTINCT ! !data->>'name' as products ! FROM json_data;
 ! products ! ------------------------------! Cable TV Basic Service Package! AC3 Case Black! Phone Service Basic Plan! AC3 Phone! AC3 Case Green! Phone Service Family Plan! AC3 Case Red! AC7 Phone! A simple query for JSON data This query does not return JSON data – it returns text values associated with the key ‘name’
  • 9.
    © 2015 EnterpriseDBCorporation. All rights reserved. 9 SELECT DISTINCT product_type, data->>'brand' as Brand, data->>'available' as Availability FROM json_data JOIN products ON (products.product_type=json_data.data->>'name') WHERE json_data.data->>'available'=true; product_type | brand | availability ---------------------------+-----------+-------------- AC3 Phone | ACME | true JSON and ANSI SQL Example ANSI SQL JSON No need for programmatic logic to combine SQL and NoSQL in the application – Postgres does it all
  • 10.
    © 2015 EnterpriseDBCorporation. All rights reserved. 10 •  Every write in PostgreSQL is essentially two writes due to Write Ahead Log (WAL) −  WAL guarantees durability and support replication •  Unlogged tables are freed from this constraint −  But tables are no longer crash safe! •  Can see good performance gain (~13-17%) Unlogged Tables Improve Performance
  • 11.
    © 2015 EnterpriseDBCorporation. All rights reserved. 11 •  Make external data sources look like local tables •  Use SQL −  SELECT syntax; including useful clauses like DISTINCT, ORDER BY, GROUP BY and more. −  JOIN external data with internal tables −  FUNCTIONS for comparison, math, string, pattern matching, date/time, etc −  Starting in 9.3 - INSERT / UPDATE / DELETE too •  Predicate pushdown – Filter data on remote sources first! −  SELECT and WHERE clauses today −  Roadmap - Join, Group/Aggregate, Sort and Limit Some Background on Foreign Data Wrappers
  • 12.
    © 2015 EnterpriseDBCorporation. All rights reserved. 12 https://www.github.com/EnterpriseDB •  We have contributed new Mongo FDW, MySQL FDW and HDFS (Hadoop) FDW •  We focused on adding Enterprise capabilities −  Connecting Pooling −  Predicate pushdowns −  Where clause pushdown −  Column pushdown −  Writeablity (DML support) EnterpriseDB is investing in Foreign Data Wrappers
  • 13.
    © 2015 EnterpriseDBCorporation. All rights reserved. 13 The HDFS_FDW Postgres Plus Postgres HDFS_FDW
  • 14.
    © 2015 EnterpriseDBCorporation. All rights reserved. 14 Example usage from HDFS_FDW
  • 15.
    © 2015 EnterpriseDBCorporation. All rights reserved. 15 Example usage from HDFS_FDW
  • 16.
    © 2015 EnterpriseDBCorporation. All rights reserved. 16 Current Release •  Alpha Release – February 2015 •  Support for HiveServer 1 & 2 •  Complete Select support •  Pushdown features (Where clause and query columns) •  Connection pooling •  Use of map reduce jobs for aggregates and complex query operations. Roadmap •  Writeability via Hbase •  Support for Flum/Impala server •  More pushdown features (joins, aggregates, sort etc) •  Custom Map Reduce Jobs •  Authentication support •  Installers to be provided via EnterpriseDB’s GUI Installer, RPM and DEB packages EnterpriseDB’s HDFS_FDW is Available and Improvements Planned
  • 17.
    © 2015 EnterpriseDBCorporation. All rights reserved. 17 Postgres Plus Advanced Server Resource Manager (CPU & I/O) Reporting Transactions 80% 20% Run Mixed Workloads More Efficiently with PPAS 9.4 Resource Management •  DBA assigns CPU & I/O to job groups •  Allocates and prioritizes consumption of resources •  Low priority jobs don’t hurt high priority jobs
  • 18.
    © 2015 EnterpriseDBCorporation. All rights reserved. 18 •  Create Resource Groups and assign the CPU Rate Limit •  Use these resource groups during a psql session Run Mixed Workloads More Efficiently with PPAS 9.4 Resource Management - Statements executed will be limited in CPU or writing to shared_buffers per resource group. - Individual limits are computed based on recent CPU / shared_buffer usage. - Processes sleep when necessary & avoid sleep when holding critical locks. - The limits are adjusted regularly based on current usage. - If multiple processes in the same group are being executed, aggregate usage will be limited CREATE RESOURCE GROUP resgrp_a; CREATE RESOURCE GROUP resgrp_b; ALTER RESOURCE GROUP resgrp_a SET cpu_rate_limit = .25; ALTER RESOURCE GROUP resgrp_a SET dirty_rate_limit = 12288; SET edb_resource_group TO res_grp_a;
  • 19.
    © 2015 EnterpriseDBCorporation. All rights reserved. 19 •  Conventional path, direct path, and parallel direct load loading methods •  Data loading from standard input and remote loading, particularly useful for large data sources on remote hosts •  Input data with delimiter-separated or fixed-width fields •  Bad file for collecting rejected records •  Discard file for collecting records that do not meet the selection criteria of any target table •  Log file for recording the EDB*Loader session and any error messages Efficiently load data using EDB*Loader
  • 20.
    © 2015 EnterpriseDBCorporation. All rights reserved. 20 •  Core PostgeSQL features to support more diverse workloads −  Flexible Datatypes – JSON / JSONB and Key Value Store −  Unlogged tables to improve performance •  Foreign Data Wrappers (FDW) −  Use PostgreSQL as a central interface to connect to other systems to gather data and issue queries or joins −  Push-down for where and columns improve performance •  Postgres Plus features −  Resource Management to more effectively run mixed workloads −  EDB*Loader to load data from various sources Recap: EnterpriseDB’s Strategy Around Centralizing Data Focuses on 3 Areas
  • 21.
    © 2015 EnterpriseDBCorporation. All rights reserved. 21