Last Friday (10/7/2011) we started having httpd.worker processes grow from the typical 10-15M of reserved memory to 10G+ (gigabytes) in a matter of 1-2 minutes. This brings the server to a halt as it starts swapping, and we have to reboot the server to get it running again. If we catch it in time, we can kill the offending httpd.worker and all is well for the time being.
System
- RHEL 5.5
- Apache httpd-2.2.3-45.el5_6.2.x86_64.rpm (Patched to prevent the recent byte range filter vulnerability)
- Using Apache MPM worker (not prefork)
- mod_jk 1.2.28
- mod_rewrite
- OpenSSL (latest Red Hat version)
- Tomcat / JBoss Web 2.1 (JBoss 5.1.0)
- Dedicated servers (unshared), 12 GB of RAM on each
Symptoms
- Under normal load, one httpd.worker process will all of a sudden grow from ~10M to several gigabytes of reserved memory. We have to kill -9 the process or the server grinds to a halt (see the command sketch after this list)
- Will occasionally happen to multiple httpd.worker processes around the same time
- Once the offending process(es) have been killed, all is normal again (within a matter of minutes).
- Has been happening approx. every 8 - 12 hours since last Friday, no clear pattern.
- No spikes in request traffic leading up to it
- No odd traffic / errors in access_log and error_log
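For what it's worth, this is roughly how we spot and kill the runaway worker by hand. It's only a sketch: it assumes the worker MPM binary shows up in ps as httpd.worker (as it does with RHEL's httpd.worker package), and the /tmp path is just where we happen to drop the snapshot.

    # List workers sorted by resident memory, fattest first
    ps -C httpd.worker -o pid,rss,vsz,nlwp,cmd --sort=-rss | head -5

    # Grab the biggest one, snapshot its memory map, then kill it hard;
    # a graceful stop doesn't release the memory fast enough
    PID=$(ps -C httpd.worker -o pid= --sort=-rss | awk 'NR==1 {print $1}')
    pmap -x "$PID" > /tmp/pmap.$PID.txt
    kill -9 "$PID"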
Additional notes
- Our normal load is ~5-10 requests / sec on each server, not crazy.
- We set MaxRequestsPerChild to 250 (after this started) and workers are being cycled properly, which implies the issue comes from a single request or a small set of requests.
- We've made no application / system configuration changes in the last 2 weeks.
- As it's not a sustained issue (goes away in a matter of minutes) it doesn't feel like a
- It sounds exactly like the byte range filter vulnerability, but we've patched and tested for that (https://issues.apache.org/bugzilla/show_bug.cgi?id=51714); a rough re-check is sketched after this list.
- I've read several posts on Server Fault (and elsewhere) but haven't found any that describe a single worker process going out of control with memory.
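For completeness, the kind of quick check we ran for the Range-header issue looks roughly like this. It's only a smoke test, not proof either way, and /index.html is just some static file on the box.

    # Build a long list of overlapping byte ranges and watch how the server responds.
    # A patched httpd should not balloon in memory while answering this.
    RANGES="0-"
    for i in $(seq 1 200); do RANGES="$RANGES,5-$i"; done
    curl -s -o /dev/null -w "%{http_code}\n" \
         -H "Range: bytes=$RANGES" http://localhost/index.html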
Questions
- What can cause an individual httpd.worker process's memory to grow out of control like this? Or even to anything beyond the typical amount (10-15M for our config)?
- Any suggestions for troubleshooting this? We're watching top, server-status, and jkstatus, monitoring with Cacti, have Monit installed, and are getting mod_jk logging going.
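For reference, the kind of watcher we're putting together looks roughly like this. It's a rough sketch, not battle-tested: it assumes mod_status is reachable at http://localhost/server-status, and the 500 MB threshold and /tmp paths are arbitrary.

    #!/bin/bash
    # Every 10s, log each httpd.worker's RSS; if one crosses the threshold,
    # snapshot server-status and its memory map before we kill it.
    THRESHOLD_KB=512000                      # ~500 MB, far above the normal 10-15 MB
    LOG=/var/log/httpd/worker-mem.log

    while true; do
        ps -C httpd.worker -o pid=,rss= | while read pid rss; do
            echo "$(date '+%F %T') pid=$pid rss_kb=$rss" >> "$LOG"
            if [ "$rss" -gt "$THRESHOLD_KB" ]; then
                curl -s http://localhost/server-status > /tmp/server-status.$pid.$(date +%s).html
                pmap -x "$pid" > /tmp/pmap.$pid.$(date +%s).txt
                # gcore "$pid"   # optional: full core for offline digging (needs gdb)
            fi
        done
        sleep 10
    done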
Apache / mod_jk / Tomcat (JBoss Web) Configuration
From httpd.conf...
    <IfModule worker.c>
        StartServers          2
        MaxClients          500
        MinSpareThreads      25
        MaxSpareThreads     150
        ThreadsPerChild      50
        MaxRequestsPerChild 250
    </IfModule>

From mod_jk's worker.properties...
    # Define Node1
    worker.node1.port=8009
    worker.node1.host=127.0.0.1
    worker.node1.type=ajp13
    worker.node1.lbfactor=1
    worker.node1.connection_pool_timeout=60
    worker.node1.connection_pool_size=35
    worker.node1.connect_timeout=5000
    worker.node1.prepost_timeout=5000

From tomcat's server.xml...
    <Connector protocol="AJP/1.3" port="8009" address="${jboss.bind.address}"
               redirectPort="8443" maxThreads="350" connectionTimeout="60000"
               enableLookups="false"/>

Would appreciate any input!