Demystifying Benchmarks: How to Use Them to Better Evaluate Databases
Peter Friedenbach, Performance Architect | Clustrix
Flexible transactional scale for the connected world.
Outline
o A Brief History of Database Benchmarks
o A Brief Introduction to Open Source Database Benchmark Tools
o How to Evaluate Database Performance – Best Practices and Lessons Learned
First, a little history… In the beginning, there were the “TPS Wars.”
Debit/Credit: The 1st Attempt
o Published in 1985 by Anon. et al. (Jim Gray)
o The first publication of a database performance benchmark
o Novelties of DebitCredit:
  – ACID property transactions
  – Price/performance metric
  – Response time constraints
  – Database scaling rules
The Call for an Industry Standard
o Transaction Processing Performance Council
  – Founded in 1988, the TPC was chartered to establish industry standards out of the madness.
TPC Timeline:
  1988: TPC established
  1989: TPC-A approved (formalizes Debit/Credit)
  1990: TPC-B approved (database-only version of TPC-A)
  1992: TPC-C approved (replaces TPC-A OLTP workload)
  1995: TPC-D approved (1st decision support workload); TPC-A & TPC-B retired
  1999: TPC-H approved (replaces TPC-D workload)
  2000: TPC-C v5 approved (major revision to TPC-C)
  2006: TPC-E approved (next-generation OLTP workload)
  2009: First TPC Technology Conference on Performance Evaluation & Benchmarking (held in conjunction with VLDB)
  2012: TPC-VMS approved (1st virtualized database benchmark)
  2014: TPCx-HS approved (1st Hadoop-based benchmark)
  2015: TPC-DS approved (next-generation decision support benchmark)
  2016: TPCx-V approved (virtual DB server benchmark)
  2016: TPCx-BB published (Hadoop big data benchmark)
Transaction Processing Performance Council
o The Good:
  – Established the rules of the game
  – For the first time, competitive performance claims could be compared
  – Audited results
  – Standard workloads focused the industry and drove performance improvements
o The Bad:
  – Expensive to play
  – “Benchmarketing” and gamesmanship
  – Dominated by vendors
    • Hardware vendors
    • Database vendors
  – Slow to evolve with a changing marketplace
The Impact of the TPC
What happened to the TPC?
Open Source Database Benchmarking Tools
o Sysbench – open source toolkit
  • Maintained by Alexey Kopytov at Percona
  – Implements multiple workloads designed to test the CPU, disk, memory, and database capabilities of a system
  – The database workload allows a mixture of reads (singleton selects and range queries) and writes (inserts, updates, and deletes)
  – Sysbench is popular in the MySQL marketplace
“sysbench is a modular, cross-platform and multi-threaded benchmark tool for evaluating OS parameters that are important for a system running a database under intensive load.”
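As a minimal sketch of driving a sysbench database run, assuming sysbench 1.0+ is installed, a MySQL-compatible server is reachable, and the sbtest schema exists and is empty; the host, credentials, and table sizing below are placeholder assumptions:

```python
import re
import subprocess

# Placeholder connection settings: adjust for your environment.
MYSQL_ARGS = [
    "--mysql-host=127.0.0.1", "--mysql-user=sbtest",
    "--mysql-password=sbtest", "--mysql-db=sbtest",
]

def run_sysbench(workload, threads, seconds=60):
    """Prepare, run, and clean up one sysbench OLTP workload; return TPS."""
    base = ["sysbench", workload, "--tables=8", "--table-size=100000"] + MYSQL_ARGS
    subprocess.run(base + ["prepare"], check=True, capture_output=True)
    out = subprocess.run(
        base + [f"--threads={threads}", f"--time={seconds}", "run"],
        check=True, capture_output=True, text=True,
    ).stdout
    subprocess.run(base + ["cleanup"], check=True, capture_output=True)
    # sysbench reports a line like: "transactions: 12345 (205.75 per sec.)"
    match = re.search(r"transactions:\s+\d+\s+\(([\d.]+) per sec", out)
    return float(match.group(1)) if match else None

if __name__ == "__main__":
    tps = run_sysbench("oltp_read_write", threads=16)
    print(f"oltp_read_write @ 16 threads: {tps} TPS")
```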
Open Source Database Benchmarking Tools
o YCSB (Yahoo! Cloud Serving Benchmark) – open source toolkit
  • Maintained by Brian Cooper at Google
  – Multi-threaded driver exercising get and put operations against an object store
  – YCSB is popular in the NoSQL marketplace
“YCSB is a framework and common set of workloads for evaluating the performance of different ‘key-value’ and ‘cloud’ serving stores.”
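The real YCSB driver is Java; as a toy Python sketch of the driver model it implements, here are N client threads issuing a configurable get/put mix against a store interface. The InMemoryStore class is a stand-in assumption for a real NoSQL client:

```python
import random
import threading

class InMemoryStore(dict):
    """Stand-in for a real key-value store client (e.g., a NoSQL driver)."""
    def get(self, key):
        return dict.get(self, key)
    def put(self, key, value):
        self[key] = value

def client(store, ops, read_fraction, counts, lock):
    """One driver thread: issue a get/put mix, tally completed operations."""
    done = 0
    for _ in range(ops):
        key = f"user{random.randint(0, 99999)}"
        if random.random() < read_fraction:
            store.get(key)               # read path
        else:
            store.put(key, b"x" * 100)   # write path (100-byte payload)
        done += 1
    with lock:
        counts.append(done)

if __name__ == "__main__":
    store, counts, lock = InMemoryStore(), [], threading.Lock()
    # workloada-style mix: 50% reads, 50% writes, 8 concurrent clients
    threads = [threading.Thread(target=client,
                                args=(store, 10000, 0.5, counts, lock))
               for _ in range(8)]
    for t in threads: t.start()
    for t in threads: t.join()
    print(f"completed {sum(counts)} operations")
```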
Open Source Database Benchmarking Tools
o Others – “TPC-like” workloads live on
  • DBT2 (MySQL benchmark toolkit)
  • DebitCredit (TPC-A / TPC-B like)
  • OrderEntry (TPC-C like)
o OLTPBench
  • https://github.com/oltpbenchmark/oltpbench
o Others?
How to Evaluate Database Performance
o #1 – Know your objective.
  – What are you trying to measure or test?
    • OLTP performance: capacity and throughput
    • Data analytics: query performance
    • Do you need full ACID property transactions?
    • … and other questions
How to Evaluate Database Performance
o #2 – Choose your approach.
  Option 1: Rely on published results
    • Words of advice: trust but verify!
    • Be wary of competitive benchmark claims
    • Without the TPC, there is no standard for comparison
  Option 2: Leverage open source benchmarks
    • Use Sysbench and/or YCSB mixed workloads
    • Create your own custom mix as appropriate
  Option 3: Model your own workload (“proof of concept”)
    • Particularly useful if you have existing data and existing query profiles
How to Evaluate Database Performance
o #3 – A data point is not a curve. (Common mistake.)
[Figure: performance curves plotting latency (response time) against throughput]
Performance is a tradeoff of throughput versus latency. Design your tests to sweep a variable, such as concurrency, and trace the curve rather than capture a single point.
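A minimal sketch of such a sweep, assuming a hypothetical run_query() callable that executes one transaction against the system under test (the sleep below is a placeholder for real work):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def run_query():
    """Placeholder for one transaction against the system under test."""
    time.sleep(0.005)  # stand-in for real work

def measure(concurrency, duration=10.0):
    """Drive the workload at fixed concurrency; return (tps, mean latency)."""
    latencies = []
    deadline = time.time() + duration

    def worker():
        while time.time() < deadline:
            start = time.time()
            run_query()
            latencies.append(time.time() - start)

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        for _ in range(concurrency):
            pool.submit(worker)
    return len(latencies) / duration, sum(latencies) / len(latencies)

# Sweep concurrency to trace the curve instead of reporting one point.
for n in (1, 2, 4, 8, 16, 32):
    tps, lat = measure(n)
    print(f"{n:3d} threads: {tps:8.1f} tps, {lat * 1000:6.2f} ms mean latency")
```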
How to Evaluate Database Performance
o #4 – Understand where there’s a bottleneck. (Common mistake.)
  – Where systems can bottleneck:
    • Hardware (CPU, disk, network)
    • Database (internal locks/latches, buffer managers, transaction managers, …)
    • Application (data locks)
    • Driver systems
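A hedged sketch of watching the hardware side while a test runs, using the third-party psutil package; note it only sees CPU, disk, and network, so database-internal and application-level contention still need the database's own instrumentation:

```python
import time
import psutil  # third-party: pip install psutil

def snapshot(interval=5):
    """Sample CPU, disk, and network utilization over one interval."""
    disk0, net0 = psutil.disk_io_counters(), psutil.net_io_counters()
    cpu = psutil.cpu_percent(interval=interval)  # blocks for `interval` seconds
    disk1, net1 = psutil.disk_io_counters(), psutil.net_io_counters()
    mb = 2 ** 20
    print(f"cpu: {cpu:.0f}%")
    print(f"disk: {(disk1.read_bytes - disk0.read_bytes) / mb / interval:.1f} MB/s read, "
          f"{(disk1.write_bytes - disk0.write_bytes) / mb / interval:.1f} MB/s write")
    print(f"net: {(net1.bytes_recv - net0.bytes_recv) / mb / interval:.1f} MB/s in, "
          f"{(net1.bytes_sent - net0.bytes_sent) / mb / interval:.1f} MB/s out")

while True:  # run alongside the benchmark; Ctrl-C to stop
    snapshot()
```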
How to Evaluate Database Performance
o #5 – Limit the number of variables. In any test, there are:
  – Three fundamental system variables
    • Hardware, operating system, and database system
  – Driver mode
    • On-box versus external client
  – Database design variables
    • Connectivity (ODBC, JDBC, session pools, proprietary techniques, …)
    • Execution model (session-less, prepare/exec, stored procedures, …)
    • Number of tables, number of columns, types of columns
  – Multiple test variables
    • Database scale size
    • Concurrent sessions/streams
    • Query complexity
Real performance work is an exercise in control.
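One way to impose that control is to pin every variable of a run in a single record stored next to the results, so any two runs can be compared knowing exactly what differed. A sketch with illustrative, assumed field names:

```python
from dataclasses import dataclass, asdict
import json

@dataclass(frozen=True)
class RunConfig:
    """Everything that can change between runs, pinned in one record."""
    # System variables
    hardware: str          # e.g., "c5.4xlarge"
    os: str                # e.g., "Ubuntu 22.04"
    database: str          # e.g., "MySQL 8.0.36"
    # Driver mode
    driver_on_box: bool    # on-box versus external client
    # Design variables
    connectivity: str      # "jdbc", "odbc", ...
    execution_model: str   # "prepare/exec", "stored procedures", ...
    # Test variables
    scale: int             # database scale size
    sessions: int          # concurrent sessions/streams
    workload: str          # e.g., "oltp_read_write 90:10"

cfg = RunConfig("c5.4xlarge", "Ubuntu 22.04", "MySQL 8.0.36",
                False, "jdbc", "prepare/exec", 100, 32, "oltp_read_write 90:10")
# Store the config alongside the results so runs stay comparable.
print(json.dumps(asdict(cfg), indent=2))
```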
How to Evaluate Database Performance
o #6 – Scalability testing: the quest for “linear scalability.”
A workload will scale only if sized appropriately.
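A small sketch of the arithmetic behind the quest: scaling efficiency compares measured throughput at N nodes against N times the single-node result. The sample numbers below are made up for illustration:

```python
def scaling_efficiency(throughputs):
    """throughputs: {node_count: measured_tps}; returns {node_count: efficiency}."""
    base = throughputs[1]
    return {n: tps / (n * base) for n, tps in throughputs.items()}

# Hypothetical measurements: efficiency erodes as the cluster grows,
# often because the workload (data size, sessions) wasn't scaled with it.
measured = {1: 10_000, 2: 19_000, 4: 35_000, 8: 60_000}
for n, eff in scaling_efficiency(measured).items():
    print(f"{n} nodes: {eff:.0%} of linear")
```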
How to Evaluate Database Performance
o #7 – The myth of “representative workloads.” Benchmarks are not representative workloads.
  – The complexity of the benchmark does not determine the goodness of the benchmark.
Good benchmark performance is “necessary but not sufficient” for good application performance.
How I Use the Tools Available
o To assess “system” health, I use:
  – CPU/processor: sysbench cpu
  – Disk subsystem: sysbench fileio
  – Network subsystem: iperf
o To assess “database” health, I use:
  – Sysbench for ACID transactions: point selects, point updates, and simple mixes (90:10 or 80:20)
  – YCSB for non-ACID transactions: workloadc (read-only), workloada and workloadb (read/write mixes)
o To assess “database” transaction capabilities, I use:
  – DebitCredit and OrderEntry (“TPC-like,” database-only workloads)
o To model application-specific problems, I will sometimes leverage any or all of the above.
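A hedged sketch of the system-health pass, wrapping the tools named above in Python; the invocations assume sysbench 1.0+ and iperf3 syntax, and the server address is a placeholder:

```python
import subprocess

IPERF_SERVER = "10.0.0.2"  # placeholder: run `iperf3 -s` on that host first

CHECKS = [
    # CPU: prime-number computation across threads
    ["sysbench", "cpu", "--threads=8", "run"],
    # Disk: prepare test files, random read/write, then clean up
    ["sysbench", "fileio", "--file-total-size=4G", "prepare"],
    ["sysbench", "fileio", "--file-total-size=4G",
     "--file-test-mode=rndrw", "--time=60", "run"],
    ["sysbench", "fileio", "--file-total-size=4G", "cleanup"],
    # Network: client side of an iperf3 pair
    ["iperf3", "-c", IPERF_SERVER],
]

for cmd in CHECKS:
    print("$", " ".join(cmd))
    subprocess.run(cmd, check=True)
```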