6

Greetings experts,

On my dedicated CentOS 5.4 server, I configure apache with about a dozen virtual hosts. I test a few of 'em, each loads within about a second; fairly quick. Load average is less than 1. No problems. I'm running static HTML sites, one WordPress blog with MySQL 5.0... these are not high-bandwidth sites; nothing that would stress this server.

Next morning, I get in to work, load up the main site, and it takes 10 to 20 seconds to load. I check the load average on the server and it's hovering around 3, sometimes up to 5, once saw it at 8, never below 2. At this point I gracefully bounce apache:

# apachectl -k graceful 

Takes about half a minute, then all is well again. All virtual hosts load fast, less than a second. Load average quickly sinks below 1.

When checking /server-status, not a lot is going on; when checking net traffic (vnstat -l or vnstat -h), not a lot of bandwidth is being used. Both are comparable at the beginning and the end of the day. Yet when I check in the morning, apache is much, much slower than it is for the rest of the day. What is happening overnight to make apache slow down so much and consume so many more system resources?

# httpd -V
Server version: Apache/2.2.3
# uname -a
Linux myserver.com 2.6.18-92.el5 #1 SMP Tue Jun 10 18:51:06 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
# free
             total       used       free     shared    buffers     cached
Mem:       1025576    1017292       8284          0       8208      43160
-/+ buffers/cache:     965924      59652
Swap:      2096472     361012    1735460

I suppose I could set up a cronjob which gracefully bounced apache daily, but that seems like a quick-and-dirty solution. I'd rather find the cause and fix that.

UPDATE 2009-10-28 14:38; samples taken every 10 seconds over five minutes with average:

$ sar -W 10 30 && date
Linux 2.6.18-92.el5 (myserver.com) 10/28/2009
02:32:36 PM  pswpin/s pswpout/s
02:32:46 PM     10.31     30.43
02:32:56 PM      2.30     32.93
02:33:06 PM     21.56      0.00
02:33:16 PM      1.80      0.00
02:33:26 PM      5.69     26.67
02:33:36 PM      0.10      0.00
02:33:46 PM     25.70      7.60
02:33:56 PM     10.61      7.11
02:34:06 PM      4.10      2.60
02:34:16 PM      0.70      0.00
02:34:26 PM      0.00      0.00
02:34:36 PM      0.00      0.00
02:34:46 PM      3.80      0.00
02:34:56 PM      0.00      0.00
02:35:06 PM      0.00     11.01
02:35:16 PM      7.70     30.30
02:35:26 PM     20.32      0.00
02:35:36 PM      1.60      0.00
02:35:46 PM     11.60      0.00
02:35:56 PM      2.50      0.00
02:36:06 PM      0.00      0.00
02:36:16 PM      3.60      0.00
02:36:26 PM      0.00      0.00
02:36:36 PM      0.00      0.00
02:36:46 PM      0.00      0.00
02:36:56 PM    445.20     56.60
02:37:06 PM      0.00      0.00
02:37:16 PM      0.00      0.00
02:37:26 PM      0.00      0.00
02:37:36 PM      0.00      0.00
Average:        19.31      6.84
Wed Oct 28 14:37:36 PDT 2009

Curiously, apache is not slow this morning. I made some tweaks to the number of servers started, num spare servers, max number of servers, etc, yesterday. Let me get the old values and compare...

Original values from /etc/httpd/conf/httpd.conf:

StartServers          20
MinSpareServers       20
MaxSpareServers      120
ServerLimit          256
MaxClients           256
MaxRequestsPerChild 4000

New values which, from all appearances, seem to work just fine:

StartServers          30
MinSpareServers       30
MaxSpareServers       40
ServerLimit           50
MaxClients            50
MaxRequestsPerChild 4000

I'll probably continue to tweak these settings a little, but they do seem to work well now.

Sar command again this morning:

$ sar -W 10 30 && date
Linux 2.6.18-92.el5 (myserver.com) 10/29/2009
09:31:09 AM  pswpin/s pswpout/s
09:31:19 AM      5.80     54.40
09:31:29 AM     62.10      0.00
09:31:39 AM      0.00      0.00
09:31:49 AM      0.00      0.00
09:31:59 AM      0.00      0.00
09:32:09 AM      3.30      0.00
09:32:19 AM      2.70      0.00
09:32:29 AM      0.00      0.00
09:32:39 AM      0.00      0.00
09:32:49 AM      0.00      0.00
09:32:59 AM      3.10      0.00
09:33:09 AM      5.80      0.00
09:33:19 AM      0.00      0.00
09:33:29 AM      0.00      0.00
09:33:39 AM      0.00      0.00
09:33:49 AM      0.00      0.00
09:33:59 AM      0.00      0.00
09:34:09 AM      0.00      0.00
09:34:19 AM      0.00      0.00
09:34:29 AM      0.00      0.00
09:34:39 AM      4.00      0.00
09:34:49 AM      0.10      0.00
09:34:59 AM      0.00      0.00
09:35:09 AM      4.80      0.00
09:35:19 AM      0.00      0.00
09:35:29 AM    291.29      0.00
09:35:39 AM      0.00      0.00
09:35:49 AM      0.80      0.00
09:35:59 AM      0.00      0.00
09:36:09 AM      0.00      0.00
Average:        12.78      1.81
Thu Oct 29 09:36:09 PDT 2009

The average is actually lower! And the server got more traffic than yesterday. Womble, it seems you were right! And now all is well in the universe again.
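Since one large burst can dominate a five-minute average (each run above contains a single sample far bigger than the rest), it may help to look at how many intervals saw any swapping at all, and at the peaks. The following is only a sketch: the function name summarize_swap is made up for illustration, and it assumes sar's plain-text layout as pasted above (timestamp first, the two rate columns last).

```shell
#!/bin/sh
# Summarise `sar -W` text output read from stdin.
# Assumption: sample rows start with an HH:MM:SS timestamp and end with
# the pswpin/s and pswpout/s columns, as in the output pasted above.
summarize_swap() {
    awk '
        # Matches sample rows only; the banner, the column header and
        # the "Average:" line fail one of the two tests.
        $1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ && $NF ~ /^[0-9.]+$/ {
            in_rate  = $(NF - 1) + 0    # pages swapped in per second
            out_rate = $NF + 0          # pages swapped out per second
            n++
            if (in_rate > 0 || out_rate > 0) busy++
            if (in_rate > max_in)   max_in = in_rate
            if (out_rate > max_out) max_out = out_rate
        }
        END {
            printf "samples=%d busy=%d max_in=%.2f max_out=%.2f\n", n, busy, max_in, max_out
        }
    '
}

# Usage (not run here):
#   sar -W 10 30 | summarize_swap
```

Here "busy" counts the intervals with any swap-in or swap-out activity, which is arguably a better indicator of sustained swapping than the average alone.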

John Gardeniers, good idea! It's got the -o [filename] switch just for that. Thanks for the tip!

Jeremy Visser, dstat is a really sweet tool! Thanks for the tip! It was not installed, had to yum install dstat.

2
  • 3
    +1 for finding the cause instead of working around. Excellent practice. Commented Oct 28, 2009 at 19:02
  • I have the same problem. Did you solve the problem? Commented Oct 30, 2014 at 22:03

2 Answers

7

Based on your free output, I strongly suspect that your Apache processes are heavily buried in swap. The output of sar -W 1 0 will confirm (or refute) this hypothesis (run it when the machine is running slow).

If the Apache processes aren't all actually serving requests (as shown by mod_status) you should tune the number of "spare" children (with MaxSpareServers) so that they get reaped quicker (and hence don't lay around consuming RAM). If you really do need the number of children you're running to service the request load, you'll need more RAM (I'd go with another 1GB straight up; RAM is cheap, diagnosis time isn't).
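To put a rough number on "how many children fit in RAM", one common approach is to divide the memory you are willing to give Apache by the average resident size of a child. The sketch below assumes the prefork MPM (which the directives in the question suggest); the function name estimate_max_clients and the 600000 KB budget in the usage line are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Estimate a MaxClients ceiling from per-child memory use.
# stdin: one resident-set size in KB per line (e.g. from ps).
# $1:    RAM budget for Apache in KB (a figure you pick yourself).
estimate_max_clients() {
    awk -v budget_kb="$1" '
        $1 ~ /^[0-9]+$/ { total += $1; n++ }    # ignore non-numeric lines
        END {
            if (n == 0) { print "no samples"; exit 1 }
            avg = total / n
            printf "children=%d avg_rss_kb=%d max_clients=%d\n", n, avg, int(budget_kb / avg)
        }
    '
}

# Usage (not run here; the budget is an assumed value):
#   ps -C httpd -o rss= | estimate_max_clients 600000
```

RSS counts pages shared between children once per child, so the average is an overestimate and the resulting figure errs on the low side. Treat it as a starting point for tuning, not an exact limit.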

4
  • Indeed, I suspected swapping. I've been playing around with the number of children started, max spare, etc. I'll run that command tomorrow and post results. Commented Oct 28, 2009 at 20:19
  • 2
    Why not periodically run the command via cron, with output to a file, so you can also see trends throughout the day? Commented Oct 28, 2009 at 20:56
  • 2
    To confirm this, have a look at dstat. dstat can tell you cool things like live paging information (i.e. how much was retrieved from swap in the last second). Try running dstat in the morning, then fetching a page from your server and seeing what goes haywire (e.g. disk, CPU, net, paging). Commented Oct 29, 2009 at 1:47
  • @Jeremy: sar -W does the same thing. @John: If you want to get professional about it and start collecting comprehensive system performance data, sure... (grin) Commented Oct 29, 2009 at 1:58
3

Which process (or processes) is eating up all the memory? Try running iostat/vmstat before any apache restart; it could be an I/O problem.
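One quick way to answer the first question is to total resident memory per command name. This is a minimal sketch; rss_by_command is a made-up helper name, and it only looks at the first word of each command.

```shell
#!/bin/sh
# Sum resident memory (KB) per command name, largest first.
# stdin: "rss_kb command" pairs, one per line.
rss_by_command() {
    awk '
        NF >= 2 && $1 ~ /^[0-9]+$/ { rss[$2] += $1 }
        END { for (name in rss) printf "%d %s\n", rss[name], name }
    ' | sort -rn
}

# Usage (not run here):
#   ps -e -o rss= -o comm= | rss_by_command | head
```

As with any RSS-based total, shared pages are counted once per process, so the sum for a multi-process server such as apache will overstate its real footprint.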

For trend monitoring, I advise using munin/collectd (both even have very useful apache plugins, which should help in your case especially).

2
  • Yes Raven007, I do have munin installed and running on the server. I just don't have any modules/plugins for apache yet. Do you know where I can get some? Commented Oct 29, 2009 at 17:54
  • 1
    check it out : muninexchange.projects.linpro.no/?about Commented Oct 29, 2009 at 18:54
