0

I have simple master-slave replication setup. Everything worked with no problems for more than a year.

Few days ago replication lag started to rise and it's still rising with no obvious reason.

Seconds_Behind_Master is now more than 4 days and 6 hours.

Both SQL and IO slave threads are working, there is no problem with communication.

SHOW SLAVE STATUS\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: s1.my.pl Master_User: root Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.001674 Read_Master_Log_Pos: 10948252 Relay_Log_File: mysqld-relay-bin.000509 Relay_Log_Pos: 63370742 Relay_Master_Log_File: mysql-bin.001638 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 63370596 Relay_Log_Space: 3523067396 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 371329 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 1 1 row in set (0.00 sec) 

Show full processlist produces strange output in one row "exit mysqld_lock_tables()", I'm not sure what it means

 SHOW FULL PROCESSLIST; +-----+-------------+-----------+------+---------+--------+----------------------------------+-----------------------+-----------+---------------+-----------+ | Id | User | Host | db | Command | Time | State | Info | Rows_sent | Rows_examined | Rows_read | +-----+-------------+-----------+------+---------+--------+----------------------------------+-----------------------+-----------+---------------+-----------+ | 808 | system user | | NULL | Connect | 892 | Waiting for master to send event | NULL | 0 | 0 | 1 | | 809 | system user | | NULL | Connect | 371477 | exit mysqld_lock_tables() | NULL | 0 | 0 | 1 | | 810 | root | localhost | NULL | Query | 0 | NULL | SHOW FULL PROCESSLIST | 0 | 0 | 401 | +-----+-------------+-----------+------+---------+--------+----------------------------------+-----------------------+-----------+---------------+-----------+ 

SHOW_OPEN_TABLES command shows that no table is locked

Any hints what can I do to fix this or how to get more detailed diagnosis?

2
  • Any ideas how to show exact query that causes lag? Commented Jan 15, 2013 at 18:39
  • This might be due to UPDATE/DELETE queries which are causing delay due to not using index or taking too much time to update. You need to enable slow query log (dev.mysql.com/doc/refman/5.1/en/slow-query-log.html). BTW, do you see any long running query in processlist? Commented Jan 16, 2013 at 10:01

1 Answer 1

2

One of the problems with MySQL replication is the serialization of I/O in the slave. While the master performs several I/O tasks at once, they get written to a single relay log, which is then executed by the slave, hence serializing the I/O. Therefore your slave should always be capable of more I/O throughput than your master.

To see if you've actually got a I/O problem, install ioprompt (homepage).

In your case, there could be another problem, you might have run into a bug. Have you tried

stop slave; start slave; 

to see if that fixes the problem? Next thing try stopping and starting your MySQL slave process.

4
  • Yes, I tried restarting both replication slave and mysqld itself with no success. iotop shows: Total DISK READ: 13.95 M/s | Total DISK WRITE: 2022.81 B/s Disk array is hardware SAS RAID, it can handle more than 10x of such load Commented Jan 15, 2013 at 18:36
  • @Matthias Are there any other processes showing up in iotop who seem to be waiting for i/o? You have to watch iotop for a while to get a feeling for it. Also use the "o" key to only show i/o-active processes. Commented Jan 15, 2013 at 18:41
  • It's spare/backup server. There is nothing more running there, even no SQL SELECT queries. MySQL replication slave is only process making disk IO Commented Jan 15, 2013 at 20:32
  • I didn't see ioprompt in the RHEL repo, but I was able to get similar numbers from atop. It was nice how it highlighted the disk utilization in red when it was a problem. Commented Nov 4, 2013 at 14:35

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.