MariaDB/MySQL_: Developing Scalable Applications

MariaDB / MySQL: Developing Scalable Applications

$ whoami Hi, I’m Federico Razzoli, founder of Vettabase Database consultant, open source supporter, long time MariaDB and MySQL user vettabase.com federico.razzoli@vettabase.com

● What is High Availability ● Read-Write splitting ● Transactions and locks in a HA environment ● Deadlocks and how to avoid them ● Primary keys and HA All from a perspective of a Backend Developer Agenda

High Availability for databases means that: ● The same database exists on multiple servers ● And any of them can process SQL queries High Availability

This is essential because: ● If a server crashes there is no outage ● The workload can be distributed over multiple servers High Availability

This is essential because: ● If a server crashes there is no outage ● The workload can be distributed over multiple servers This is database scalability High Availability

● RDS replication is asynchronous ● RDS multi-AZ is synchronous ● Ec2 multi-AZ is synchronous ● Aurora has synchronous storage, asynchronous cache High Availability

● Writes must be directed to the master ● Reads, in general, should be directed to a replica. Problems: ○ Some reads can tolerate some lag, others can't ○ Some reads need to "see" a new row immediately after writing it Read/Write Splitting

For reads, ask yourself two questions: ● Can some lag (perhaps < 1 second) be tolerated? ● If so, how much lag can you tolerate? Only developers can take these decisions, and the answer depends on the use case. Examples: ● Ecommerce: read the list of buyable products. No lag tolerated. ● Show how many users are online now. A some minutes lag is ok. ● Show sales tends from 2020 to today. Lag doesn't matter. Read/Write Splitting

Generally you have different hostnames or credentials to send different types of SQL queries: ● Writes and regular reads: a load balancer will split them ● OPTIONAL: Reads that can tolerate a lag ● OPTIONAL: Reads that can tolerate a longer lag Read/Write Splitting

In the simplest case, the load balancer will send: ● INSERT, UPDATE, DELETE to a Master ● SELECT queries to a Replica Read/Write Splitting

Some queries need be sent to the same server and, because the next query might be a write, they're sent to the master: ● Transactions ● Stored procedures / functions ● Queries after CREATE TEMPORARY TABLE ● Queries after a user-variable is created To scale better, try to avoid these things when not really needed Read/Write Splitting

Some balancers (ProxySQL) allow to add a rule to force the query to a Master or a Replica if they contain a comment: /*READ*/ CALL my_procedure(); /*WRITE*/ SELECT x, y, z FROM tbl; Read/Write Splitting

● Use transactions to make sure that a series of queries is sent to the master in the same connection: START TRANSACTION; INSERT … ; SELECT … ; COMMIT; ● Or to the same replica, in the same connection: START TRANSACTION READ ONLY; SELECT … ; SELECT … ; COMMIT; But sometimes, you can do even better: use one SQL statement instead of several of them. Transactions

● Insert row if it doesn't exist: INSERT IGNORE … ; if (mysql_affected_rows() > 0) … ● Delete row if it doesn't exist: DELETE IGNORE … ; if (mysql_affected_rows() > 0) … ● Insert or update: INSERT (email, phone) VALUES ('abc@hey.com', '12345') ON DUPLICATE KEY UPDATE email = 'abc@hey.com', phone = '12345'; Transactions

● Insert + select (many ORMs do this): INSERT INTO user ... RETURNING id, email; if (mysql_affected_rows() > 0) … ● Delete + select: DELETE FROM user WHERE id IN (1, 2, 3) RETURNING id, email; if (mysql_affected_rows() > 0) … Transactions

● Insert multiple rows with 1 INSERT INSERT INTO tbl (a, b, c) VALUES (10, 20, 30), (11, 22, 33), … ; ● Delete or update from multiple tables at once: DELETE t1, t2 FROM t1 INNER JOIN t2 WHERE t1.id = t2.id AND t1.x = 1; Transactions

● If you don't use multi-statement transactions, set autocommit=1 ● Some statements (especially SET) can confuse the balancer, or even be rejected. If it happens, you can put them, and the following queries, into stored procedures Transactions

Remember to keep transactions short (when possible): ● In terms of rows read ● In terms of time elapsed Though, a mass insert/update/delete is faster if it happens in one transaction Transactions

● Data in a transactional database are versioned ● They are updated asynchronously in the data files ● Transaction logs contain the information to update data or to restore reasonably old versions (ROLLBACK) Locks

● Writes acquire locks on modified rows ● Reads acquire a shared lock and a view on data Locks

To reduce deadlocks: ● Reduce foreign keys ● Keep transactions short ● Transactions that read the same table should do it always in the same order ● Use isolation levels wisely Extreme measure to make deadlocks impossible: ● Set innodb_lock_wait_timeout=0 Locks

A transaction uses a transaction isolation level: ● REPEATABLE READ All queries in the transaction use the same view, acquired at transaction start ● READ COMMITTED Each query acquires a new data view ● READ UNCOMMITTED Similar to READ COMMITTED, but queries can see uncommitted data Locks

Example: SET TRANSACTION ISOLATION LEVEL READ COMMITTED; START TRANSACTION; … COMMIT; Locks

AUTO_INCREMENT primary keys are common but have some drawbacks: ● Shouldn't be used to sort chronologically (failover) ● Numbers have a limit, even with BIGINT UNSIGNED the limit can be reached ● An AUTO_INCREMENT lock governs the primary keys ● Easy to guess Locks

● Use UUID instead: CREATE OR REPLACE TABLE employee ( uuid UUID DEFAULT UUID() PRIMARY KEY, ... ); Locks

MariaDB/MySQL_: Developing Scalable Applications

More Related Content

Similar to MariaDB/MySQL_: Developing Scalable Applications

More from Federico Razzoli

Recently uploaded

MariaDB/MySQL_: Developing Scalable Applications