MariaDB Clustrix Overview
Robbie Mihalyi, VP Engineering – Clustrix
Matthew White, Director of Engineering
Peter Friedenbach, Performance Architect
Scaling
Application Scaling
 Scales capacity
 Provides fault tolerance
 Dynamic and flexible
(Diagram: a load balancer distributes traffic across application servers that share a single DBMS server)
Clustrix Scaling
 Scales reads & writes
 Provides fault tolerance
 Dynamic and flexible
(Diagram: the MaxScale load balancer distributes traffic across the nodes of a Clustrix cluster)
SCALE-OUT RDBMS: CLUSTRIX
● Built from the ground up on a shared-nothing architecture
● All nodes are equal – any node can accept connections and can read/write
● Fine-grained data distribution with data protection – built-in fault tolerance
● Distributed parallel query execution
SCALE-OUT RDBMS: CLUSTRIX
● To the application, it looks like one database system
● DDL and DML SQL statements are MySQL/MariaDB compatible
  ○ MySQL/MariaDB data types (including JSON support)
  ○ Referential integrity
  ○ Transactions – ACID compliant
  ○ Triggers and stored procedures
  ○ Complex joins
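As a minimal sketch of that compatibility (table and column names are invented for the example), this is the kind of standard MySQL/MariaDB DDL and DML the deck says ClustrixDB accepts unchanged:

    CREATE TABLE customers (
        id      INT PRIMARY KEY AUTO_INCREMENT,
        name    VARCHAR(100) NOT NULL,
        profile JSON                                          -- JSON data type support
    );

    CREATE TABLE orders (
        id          INT PRIMARY KEY AUTO_INCREMENT,
        customer_id INT NOT NULL,
        amount      DECIMAL(10,2),
        FOREIGN KEY (customer_id) REFERENCES customers (id)   -- referential integrity
    );

    BEGIN;                                                    -- ACID transaction
    INSERT INTO customers (name) VALUES ('Ada');
    INSERT INTO orders (customer_id, amount) VALUES (LAST_INSERT_ID(), 19.99);
    COMMIT;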
SCALE-OUT RDBMS: CLUSTRIX
● Administration features and capabilities
  ○ Online schema changes
  ○ Replication
  ○ Online backup/restore
● Multi-zone scale and availability
MariaDB Clustrix Technology
Matthew White, Director of Engineering – Clustrix, MariaDB Corporation
Clustrix Technology: Data Distribution
Terminology - REPRESENTATION

Base representation (primary key: ID)

    ID  col1  col2  col3
    1   16    36    JANUARY
    2   17    35    FEBRUARY
    3   18    34    MARCH
    4   19    33    APRIL
    5   20    32    MAY

K1 representation – index (col2)

    col2  ID
    32    5
    33    4
    34    3
    35    2
    36    1

K2 representation – index (col3, col1)

    col3      col1  ID
    APRIL     19    4
    FEBRUARY  17    2
    JANUARY   16    1
    MARCH     18    3
    MAY       20    5
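A sketch of the schema behind those representations (table and index names are illustrative): a base table keyed on ID plus two secondary indexes. In ClustrixDB, each index becomes its own independently distributed representation.

    CREATE TABLE months (
        id   INT PRIMARY KEY,    -- base representation, keyed on id
        col1 INT,
        col2 INT,
        col3 VARCHAR(16)
    );
    CREATE INDEX k1 ON months (col2);        -- K1 representation: (col2) -> id
    CREATE INDEX k2 ON months (col3, col1);  -- K2 representation: (col3, col1) -> id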
Terminology - SLICES
Slicing
● Each representation is split into slices
● Keys are hashed, and the hash values are mapped to slices
● Representations are distributed independently
Independent Key Distribution
● Adapts to diverse access patterns
● Allows a broader range of query-evaluation strategies
● Query plans scale with node count
Terminology - REPLICAS
Fault Tolerance
● Each slice has one or more copies (replicas)
  ○ A node will never hold two replicas of the same slice
● Replicas are created online, without blocking writes
● The default number of replicas is 2
● A slice can have as many replicas as there are nodes
  ○ Great for read-heavy tables
● Support for availability zones
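Slice and replica counts are per-representation properties. As an assumption based on ClustrixDB's documented table options (verify against the docs for your release), they can be set in DDL along these lines:

    CREATE TABLE events (
        id      INT PRIMARY KEY,
        payload VARCHAR(255)
    ) SLICES = 8 REPLICAS = 2;               -- assumed syntax: 8 slices, 2 copies of each

    ALTER TABLE events REPLICAS = ALLNODES;  -- assumed syntax: one replica per node,
                                             -- useful for read-heavy tables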
ClustrixDB: Dynamic Data Distribution
● Tables are auto-split into slices
● Every slice has a replica on another node
  ○ Slices are auto-distributed and auto-protected
(Diagram: database tables holding billions of rows are split into slices S1–S5, each slice replicated on a different node)
ClustrixDB: Adding Nodes – Flex Up
● Easy and simple Flex Up & Flex Down
  ○ A single, minimal interruption of service
● Data is automatically rebalanced across the cluster
  ○ Tables stay online for reads & writes
● All servers handle writes + reads
  ○ The workload is spread across more servers after Flex Up
(Diagram: slices S1–S5 and their replicas are rebalanced onto the newly added node)
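Flex Up and Flex Down are administrative operations. As an assumption based on ClustrixDB's admin interface (check the documentation for your release before relying on this), adding and evacuating nodes looks roughly like:

    ALTER CLUSTER ADD '10.0.0.7';   -- assumed syntax: join a new node; the
                                    -- Rebalancer then moves slices onto it
    ALTER CLUSTER SOFTFAIL 4;       -- assumed syntax: drain node 4's slices
                                    -- before removing it (Flex Down)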
ClustrixDB: Loss of a Node – Automatic Fault Tolerance
● ClustrixDB detects the loss of a node
  ○ The system automatically re-protects
  ○ Data is automatically redistributed
● Slices lost on the failed node are rapidly re-protected
  ○ Re-protection runs while tables remain available for reads & writes
● Automated self-healing
  ○ After re-protection, the cluster is fully protected and operational
(Diagram: replicas of the failed node's slices, e.g. S2 and S5, are re-created on the surviving nodes)
Complexity Simplified
Q: How do you ensure data stays well distributed in a clustered environment?
A: You let the REBALANCER handle it!
● Initial Data – distributes the data into even slices across nodes
● Failed Nodes – re-protects slices to ensure the proper replicas exist
● Data Growth – splits large slices into smaller slices
● Flex-Up/Flex-Down – moves slices to leverage new nodes and/or evacuate nodes
● Skewed Data – redistributes the data to even it out across nodes
● Hot Slice Balancing – finds hot slices and balances them across nodes
● Availability Zone Support – the Rebalancer distributes replicas across zones
…all while the database stays open for reads & writes (Patents 8,543,538 and 8,554,726)
Clustrix Technology: Query Processing
Query Processing Model
● The load balancer (HW or SW) spreads DB connections from SQL-based applications across all nodes
● A session is established on any node
● The session controls query execution: parse → plan → compile
Example: UPDATE users SET online = 1 WHERE id = 8797;
(Diagram: the session compiles the statement into a fragment VM and sends it to the node holding the row ID: 8797 |…| ONLINE:0)
Query Processing Model
● The load balancer spreads DB connections across all nodes
● A session is established on any node
● The session controls query execution
  ○ Parse SQL
  ○ Generate the execution plan
  ○ Compile into fragments
  ○ Start the execution of the initial fragment
  ○ Send fragments
  ○ Coordinate transaction completion via the global transaction manager (GTM)
(Diagram: the commit is coordinated across the nodes holding the row's replicas, and both copies now read ID: 8797 |…| ONLINE:1)
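What "coordinate transaction completion" buys you, as a minimal sketch (row values are illustrative): a multi-row transaction whose rows live on different nodes still commits atomically, because the session's GTM coordinates the commit across every node holding affected replicas.

    BEGIN;
    UPDATE users SET online = 1 WHERE id = 8797;   -- row on one node
    UPDATE users SET online = 0 WHERE id = 1234;   -- row (likely) on another node
    COMMIT;   -- the GTM makes the commit atomic across all participating nodes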
Parallel Query Processing
Example: SELECT SUM(amount) FROM donations;
● The load balancer spreads DB connections across all nodes
● A session is established on any node
● The session controls query execution
  ○ Parse SQL
  ○ Generate the execution plan
  ○ Compile into fragments
  ○ Look up record locations
  ○ Send fragments
  ○ Coordinate transaction completion
● Aggregation runs in parallel on all nodes
(Diagram: the session parses, plans, and compiles the query, then sends a fragment to every node)
Parallel Query Processing (continued)
● Each node aggregates locally over its slices of donations
● The session node then aggregates the per-node partial results into the final answer
(Diagram: AGGREGATE LOCALLY on every node, then AGGREGATE RESULTS on the session node)
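Conceptually (this is not actual ClustrixDB syntax, and the per-node table names are hypothetical), the distributed plan behaves as if each node summed its local slices and the session node summed the partials:

    SELECT SUM(partial_sum) AS total        -- final aggregate on the session node
    FROM (
        SELECT SUM(amount) AS partial_sum FROM donations_node1   -- hypothetical
        UNION ALL
        SELECT SUM(amount) AS partial_sum FROM donations_node2   -- hypothetical
        UNION ALL
        SELECT SUM(amount) AS partial_sum FROM donations_node3   -- hypothetical
    ) AS per_node_partials;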
Clustrix Technology: Administration
Table Online Schema Change
● Allows reads & writes during ALTER TABLE operations
Example: ALTER TABLE mytable ADD (foo int);
 A queue is created to track changes
 The table (MYTABLE) is copied, with the new column, into __building_MYTABLE
 Queued changes are applied to the new copy until the queue is empty
 An atomic flip swaps in the new table
(Diagram: reads & writes continue against MYTABLE while the queues feed __building_MYTABLE)
Table Online Schema Change (continued)
After the atomic flip, reads & writes go to the new MYTABLE:
 The original table (now __retiring_MYTABLE) is deleted
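A minimal sketch of what "online" means in practice (table and values are illustrative): while one session runs the ALTER, concurrent sessions keep reading and writing the same table, and their changes flow through the queue into the new copy.

    -- Session A:
    ALTER TABLE mytable ADD (foo INT);

    -- Session B, at the same time, is not blocked:
    INSERT INTO mytable (id, col1) VALUES (1001, 42);
    SELECT COUNT(*) FROM mytable;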
Backup, Replication & Disaster Recovery
● Asynchronous multi-point replication
● Parallel backup
● Replicate to any cloud, any datacenter, anywhere
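As the editor's notes below mention, Clustrix supports MySQL replication both as master and as slave. A sketch of the slave side on a stock MySQL/MariaDB replica, using the standard replication commands (host, credentials, and binlog coordinates are placeholders):

    CHANGE MASTER TO
        MASTER_HOST     = 'clustrix.example.com',   -- placeholder
        MASTER_USER     = 'repl',                   -- placeholder
        MASTER_PASSWORD = '********',               -- placeholder
        MASTER_LOG_FILE = 'log.000001',             -- placeholder coordinates
        MASTER_LOG_POS  = 4;
    START SLAVE;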
ClustrixGUI: Performance Monitoring
MariaDB Clustrix Live Demo
Peter Friedenbach, Performance Architect, MariaDB Corporation
THANK YOU!


Editor's Notes

  • Query processing: simple queries are fielded by any node and routed to the node that owns the data; complex queries are split into query fragments that are processed in parallel.
  • Backup, Replication & Disaster Recovery: Clustrix supports MySQL replication both as master and as slave, so you can replicate in both directions. Within a cluster, as shown earlier, all data has multiple copies. For disaster recovery (when a whole region loses power), Clustrix has two options: fast parallel backup (in addition to the slower mysqldump backup) and fast parallel replication (asynchronous, across two Clustrix clusters).