Sometimes the server needs a lot of time for a request. There's also an error that there are too many connections to the mysql server.
I've set the max_connections up to 800 but this doesn't help at all. I've also optimized the database.
Here's a top output when the problem is present.
top - 20:07:48 up 164 days, 13:21, 1 user, load average: 10.70, 12.08, 11.45 Tasks: 186 total, 8 running, 178 sleeping, 0 stopped, 0 zombie Cpu(s): 34.6%us, 6.7%sy, 0.0%ni, 2.0%id, 56.7%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 6291456k total, 1200580k used, 5090876k free, 0k buffers Swap: 0k total, 0k used, 0k free, 0k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 13780 mysql 15 0 122m 70m 4876 S 6 1.2 53:54.73 mysqld 3269 www-data 16 0 49792 26m 3988 S 4 0.4 0:00.36 apache2 3386 www-data 15 0 48016 24m 4032 S 4 0.4 0:00.35 apache2 3385 www-data 16 0 48020 24m 3920 S 3 0.4 0:00.21 apache2 3394 www-data 15 0 48048 24m 3924 S 3 0.4 0:00.20 apache2 3417 www-data 16 0 47984 24m 3960 S 3 0.4 0:00.30 apache2 3226 www-data 15 0 49012 26m 4200 S 3 0.4 0:00.36 apache2 3094 www-data 15 0 48040 24m 4100 S 3 0.4 0:01.33 apache2 3170 www-data 16 0 48028 25m 4180 S 3 0.4 0:00.45 apache2 3192 www-data 15 0 48252 25m 4036 S 3 0.4 0:00.48 apache2 3342 www-data 15 0 47980 24m 4000 S 3 0.4 0:00.36 apache2 3435 www-data 16 0 49008 25m 3844 S 3 0.4 0:00.20 apache2 1242 www-data 16 0 48048 24m 4132 S 2 0.4 0:00.72 apache2 1562 www-data 15 0 50136 27m 4620 S 2 0.4 0:01.66 apache2 3166 www-data 15 0 47980 24m 4012 S 2 0.4 0:00.52 apache2 3169 www-data 16 0 45596 23m 4256 R 2 0.4 0:01.29 apache2 3393 www-data 16 0 45580 22m 4012 R 2 0.4 0:00.21 apache2
What data would also help to identify the problem?
Here's the output from mysql:
mysql> show variables like '%cache%'; +------------------------------+------------+ | Variable_name | Value | +------------------------------+------------+ | binlog_cache_size | 32768 | | have_query_cache | YES | | key_cache_age_threshold | 300 | | key_cache_block_size | 1024 | | key_cache_division_limit | 100 | | max_binlog_cache_size | 4294967295 | | ndb_cache_check_time | 0 | | query_cache_limit | 1048576 | | query_cache_min_res_unit | 4096 | | query_cache_size | 16777216 | | query_cache_type | ON | | query_cache_wlock_invalidate | OFF | | table_cache | 64 | | thread_cache_size | 8 | +------------------------------+------------+
I'll post an output of free and sar when the problem appears again.
According to mysql tuning primer all config vars are OK now, however the wait rate is still quite high. Do you have any hints how to find out what causes the high I/O traffic?
Here's the output of top again:
top - 16:26:32 up 172 days, 9:40, 1 user, load average: 5.29, 4.52, 3.45 Tasks: 35 total, 1 running, 34 sleeping, 0 stopped, 0 zombie Cpu(s): 0.5%us, 2.0%sy, 0.0%ni, 0.0%id, 97.5%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 6291456k total, 122712k used, 6168744k free, 0k buffers Swap: 0k total, 0k used, 0k free, 0k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 14118 mysql 15 0 109m 20m 4520 S 3 0.3 0:01.32 mysqld 1244 root 16 0 35224 19m 3856 D 2 0.3 204:41.52 apache2 1 root 15 0 1960 688 584 S 0 0.0 0:11.83 init 1041 root 15 0 5216 996 640 S 0 0.0 6:13.44 sshd 1064 root 15 0 2328 824 660 S 0 0.0 0:35.44 xinetd 1132 root 15 0 31292 28m 2344 S 0 0.5 0:55.09 spamd 1166 qmails 16 0 1668 508 408 S 0 0.0 2:18.86 qmail-send 1168 qmaill 18 0 1624 476 404 S 0 0.0 0:22.15 splogger 1169 root 15 0 1652 388 292 S 0 0.0 0:02.71 qmail-lspawn 1170 qmailr 15 0 1648 392 296 S 0 0.0 0:11.59 qmail-rspawn 1171 qmailq 18 0 1620 356 288 S 0 0.0 0:12.25 qmail-clean 1292 sw-cp-se 15 0 6448 4208 1460 S 0 0.1 7:39.56 sw-cp-serverd 1456 root 15 0 2008 832 656 S 0 0.0 0:16.97 cron 7561 root 16 0 10180 2892 2324 S 0 0.0 0:00.05 sshd 7563 scirocco 15 0 10180 1648 1064 S 0 0.0 0:00.40 sshd 7564 scirocco 20 0 4508 1972 1364 S 0 0.0 0:00.00 bash 7567 root 18 0 7168 1900 1480 S 0 0.0 0:00.00 su
show processlist;
.free
(andsar
while you are at it) as well?