MariaDB Clustrix Overview
Robbie Mihalyi, VP Engineering – Clustrix
Matthew White, Director of Engineering
Peter Friedenbach, Performance Architect
Scaling
Application Scaling
 Scales capacity
 Provides fault tolerance
 Dynamic and flexible
(Diagram: a load balancer distributes traffic across application servers that share a single DBMS server)
Clustrix Scaling
 Scales reads & writes
 Provides fault tolerance
 Dynamic and flexible
(Diagram: the MaxScale load balancer distributes traffic across the nodes of a Clustrix cluster)
SCALE-OUT RDBMS: CLUSTRIX
● Built from the ground up on a shared-nothing architecture
● All nodes are equal – any node can accept connections and can read/write
● Fine-grained data distribution with data protection – built-in fault tolerance
● Distributed parallel query execution
SCALE-OUT RDBMS: CLUSTRIX
● To the application, it looks like one database system
● DDL and DML SQL statements are MySQL/MariaDB compatible
  ○ MySQL/MariaDB data types (including JSON support)
  ○ Referential integrity
  ○ Transactions – ACID compliant
  ○ Triggers and stored procedures
  ○ Complex joins
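As a minimal sketch of that compatibility (table and column names are invented for the example), this is the kind of standard MySQL/MariaDB DDL and DML the deck says ClustrixDB accepts unchanged:

    CREATE TABLE customers (
        id      INT PRIMARY KEY AUTO_INCREMENT,
        name    VARCHAR(100) NOT NULL,
        profile JSON                                          -- JSON data type support
    );

    CREATE TABLE orders (
        id          INT PRIMARY KEY AUTO_INCREMENT,
        customer_id INT NOT NULL,
        amount      DECIMAL(10,2),
        FOREIGN KEY (customer_id) REFERENCES customers (id)   -- referential integrity
    );

    BEGIN;                                                    -- ACID transaction
    INSERT INTO customers (name) VALUES ('Ada');
    INSERT INTO orders (customer_id, amount) VALUES (LAST_INSERT_ID(), 19.99);
    COMMIT;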
SCALE-OUT RDBMS: CLUSTRIX
● Administration features and capabilities
  ○ Online schema changes
  ○ Replication
  ○ Online backup/restore
● Multi-zone scale and availability
MariaDB Clustrix Technology
Matthew White, Director of Engineering – Clustrix, MariaDB Corporation
Clustrix Technology: Data Distribution
Terminology - REPRESENTATION

Base representation (primary key: ID)

    ID  col1  col2  col3
    1   16    36    JANUARY
    2   17    35    FEBRUARY
    3   18    34    MARCH
    4   19    33    APRIL
    5   20    32    MAY

K1 representation – index (col2)

    col2  ID
    32    5
    33    4
    34    3
    35    2
    36    1

K2 representation – index (col3, col1)

    col3      col1  ID
    APRIL     19    4
    FEBRUARY  17    2
    JANUARY   16    1
    MARCH     18    3
    MAY       20    5
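A sketch of the schema behind those representations (table and index names are illustrative): a base table keyed on ID plus two secondary indexes. In ClustrixDB, each index becomes its own independently distributed representation.

    CREATE TABLE months (
        id   INT PRIMARY KEY,    -- base representation, keyed on id
        col1 INT,
        col2 INT,
        col3 VARCHAR(16)
    );
    CREATE INDEX k1 ON months (col2);        -- K1 representation: (col2) -> id
    CREATE INDEX k2 ON months (col3, col1);  -- K2 representation: (col3, col1) -> id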
Terminology - SLICES
Slicing
● Each representation is split into slices
● Keys are hashed, and the hash values are mapped to slices
● Representations are distributed independently
Independent Key Distribution
● Adapts to diverse access patterns
● Allows a broader range of query-evaluation strategies
● Query plans scale with node count
Terminology - REPLICAS
Fault Tolerance
● Each slice has one or more copies (replicas)
  ○ A node will never hold two replicas of the same slice
● Replicas are created online, without blocking writes
● The default number of replicas is 2
● A slice can have as many replicas as there are nodes
  ○ Great for read-heavy tables
● Support for availability zones
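Slice and replica counts are per-representation properties. As an assumption based on ClustrixDB's documented table options (verify against the docs for your release), they can be set in DDL along these lines:

    CREATE TABLE events (
        id      INT PRIMARY KEY,
        payload VARCHAR(255)
    ) SLICES = 8 REPLICAS = 2;               -- assumed syntax: 8 slices, 2 copies of each

    ALTER TABLE events REPLICAS = ALLNODES;  -- assumed syntax: one replica per node,
                                             -- useful for read-heavy tables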
ClustrixDB: Dynamic Data Distribution
● Tables are auto-split into slices
● Every slice has a replica on another node
  ○ Slices are auto-distributed and auto-protected
(Diagram: database tables holding billions of rows are split into slices S1–S5, each slice replicated on a different node)
ClustrixDB: Adding Nodes – Flex Up
● Easy and simple Flex Up & Flex Down
  ○ A single, minimal interruption of service
● Data is automatically rebalanced across the cluster
  ○ Tables stay online for reads & writes
● All servers handle writes + reads
  ○ The workload is spread across more servers after Flex Up
(Diagram: slices S1–S5 and their replicas are rebalanced onto the newly added node)
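Flex Up and Flex Down are administrative operations. As an assumption based on ClustrixDB's admin interface (check the documentation for your release before relying on this), adding and evacuating nodes looks roughly like:

    ALTER CLUSTER ADD '10.0.0.7';   -- assumed syntax: join a new node; the
                                    -- Rebalancer then moves slices onto it
    ALTER CLUSTER SOFTFAIL 4;       -- assumed syntax: drain node 4's slices
                                    -- before removing it (Flex Down)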
ClustrixDB: Loss of a Node – Automatic Fault Tolerance
● ClustrixDB detects the loss of a node
  ○ The system automatically re-protects
  ○ Data is automatically redistributed
● Slices lost on the failed node are rapidly re-protected
  ○ Re-protection runs while tables remain available for reads & writes
● Automated self-healing
  ○ After re-protection, the cluster is fully protected and operational
(Diagram: replicas of the failed node's slices, e.g. S2 and S5, are re-created on the surviving nodes)
Complexity Simplified
Q: How do you ensure data stays well distributed in a clustered environment?
A: You let the REBALANCER handle it!
● Initial Data – distributes the data into even slices across nodes
● Failed Nodes – re-protects slices to ensure the proper replicas exist
● Data Growth – splits large slices into smaller slices
● Flex-Up/Flex-Down – moves slices to leverage new nodes and/or evacuate nodes
● Skewed Data – redistributes the data to even it out across nodes
● Hot Slice Balancing – finds hot slices and balances them across nodes
● Availability Zone Support – the Rebalancer distributes replicas across zones
…all while the database stays open for reads & writes (Patents 8,543,538 and 8,554,726)
Clustrix Technology: Query Processing
Query Processing Model
● The load balancer (HW or SW) spreads DB connections from SQL-based applications across all nodes
● A session is established on any node
● The session controls query execution: parse → plan → compile
Example: UPDATE users SET online = 1 WHERE id = 8797;
(Diagram: the session compiles the statement into a fragment VM and sends it to the node holding the row ID: 8797 |…| ONLINE:0)
Query Processing Model
● The load balancer spreads DB connections across all nodes
● A session is established on any node
● The session controls query execution
  ○ Parse SQL
  ○ Generate the execution plan
  ○ Compile into fragments
  ○ Start the execution of the initial fragment
  ○ Send fragments
  ○ Coordinate transaction completion via the global transaction manager (GTM)
(Diagram: the commit is coordinated across the nodes holding the row's replicas, and both copies now read ID: 8797 |…| ONLINE:1)
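What "coordinate transaction completion" buys you, as a minimal sketch (row values are illustrative): a multi-row transaction whose rows live on different nodes still commits atomically, because the session's GTM coordinates the commit across every node holding affected replicas.

    BEGIN;
    UPDATE users SET online = 1 WHERE id = 8797;   -- row on one node
    UPDATE users SET online = 0 WHERE id = 1234;   -- row (likely) on another node
    COMMIT;   -- the GTM makes the commit atomic across all participating nodes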
Parallel Query Processing
Example: SELECT SUM(amount) FROM donations;
● The load balancer spreads DB connections across all nodes
● A session is established on any node
● The session controls query execution
  ○ Parse SQL
  ○ Generate the execution plan
  ○ Compile into fragments
  ○ Look up record locations
  ○ Send fragments
  ○ Coordinate transaction completion
● Aggregation runs in parallel on all nodes
(Diagram: the session parses, plans, and compiles the query, then sends a fragment to every node)
Parallel Query Processing (continued)
● Each node aggregates locally over its slices of donations
● The session node then aggregates the per-node partial results into the final answer
(Diagram: AGGREGATE LOCALLY on every node, then AGGREGATE RESULTS on the session node)
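Conceptually (this is not actual ClustrixDB syntax, and the per-node table names are hypothetical), the distributed plan behaves as if each node summed its local slices and the session node summed the partials:

    SELECT SUM(partial_sum) AS total        -- final aggregate on the session node
    FROM (
        SELECT SUM(amount) AS partial_sum FROM donations_node1   -- hypothetical
        UNION ALL
        SELECT SUM(amount) AS partial_sum FROM donations_node2   -- hypothetical
        UNION ALL
        SELECT SUM(amount) AS partial_sum FROM donations_node3   -- hypothetical
    ) AS per_node_partials;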
Clustrix Technology: Administration
Table Online Schema Change
● Allows reads & writes during ALTER TABLE operations
Example: ALTER TABLE mytable ADD (foo int);
 A queue is created to track changes
 The table (MYTABLE) is copied, with the new column, into __building_MYTABLE
 Queued changes are applied to the new copy until the queue is empty
 An atomic flip swaps in the new table
(Diagram: reads & writes continue against MYTABLE while the queues feed __building_MYTABLE)
Table Online Schema Change (continued)
After the atomic flip, reads & writes go to the new MYTABLE:
 The original table (now __retiring_MYTABLE) is deleted
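A minimal sketch of what "online" means in practice (table and values are illustrative): while one session runs the ALTER, concurrent sessions keep reading and writing the same table, and their changes flow through the queue into the new copy.

    -- Session A:
    ALTER TABLE mytable ADD (foo INT);

    -- Session B, at the same time, is not blocked:
    INSERT INTO mytable (id, col1) VALUES (1001, 42);
    SELECT COUNT(*) FROM mytable;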
Backup, Replication & Disaster Recovery
● Asynchronous multi-point replication
● Parallel backup
● Replicate to any cloud, any datacenter, anywhere
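As the editor's notes below mention, Clustrix supports MySQL replication both as master and as slave. A sketch of the slave side on a stock MySQL/MariaDB replica, using the standard replication commands (host, credentials, and binlog coordinates are placeholders):

    CHANGE MASTER TO
        MASTER_HOST     = 'clustrix.example.com',   -- placeholder
        MASTER_USER     = 'repl',                   -- placeholder
        MASTER_PASSWORD = '********',               -- placeholder
        MASTER_LOG_FILE = 'log.000001',             -- placeholder coordinates
        MASTER_LOG_POS  = 4;
    START SLAVE;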
ClustrixGUI: Performance Monitoring
MariaDB Clustrix Live Demo
Peter Friedenbach, Performance Architect, MariaDB Corporation
THANK YOU!


Editor's Notes

  • Query processing: simple queries are fielded by any node and routed to the node that owns the data; complex queries are split into query fragments that are processed in parallel.
  • Backup, Replication & Disaster Recovery: Clustrix supports MySQL replication both as master and as slave, so you can replicate in both directions. Within a cluster, as shown earlier, all data has multiple copies. For disaster recovery (when a whole region loses power), Clustrix has two options: fast parallel backup (in addition to the slower mysqldump backup) and fast parallel replication (asynchronous, across two Clustrix clusters).