10

We've recently been hitting our maxclients (300) on our server and are noticing a lot of connections are stuck in the R reading state. Here's an excerpt from our server-status dump:

Current Time: Monday, 29-Apr-2013 11:46:00 PDT Restart Time: Monday, 29-Apr-2013 11:03:48 PDT Parent Server Generation: 0 Server uptime: 42 minutes 12 seconds Total accesses: 144170 - Total Traffic: 29.8 MB CPU Usage: u188.25 s345.65 cu2601.11 cs0 - 124% CPU load 56.9 requests/sec - 12.0 kB/second - 216 B/request 244 requests currently being processed, 56 idle workers KRR_RRRRRRRRRRRR_RRRKRRRR....................................... K_RRK_RK_RKR_RRR_RRRR_R_R....................................... RRRRRRRRRRRRRC_CR___RCRR_....................................... R_RRRCRRRRRRRRCRRRCRRKRCR....................................... R_K_RKR_CRRRRRRKC_RR_KR_R....................................... RRRRKRR__RRRRKRRCR_RRRRRR....................................... _RCRRRRRRRRRKR__RR_RRCRRR....................................... RRRRRRRRCRRRRRRKRRRR_RCRR....................................... R___RRCR__RR__RRRKC___RK_....................................... _R_KCR__RR__R_R_RRRRRR_RK....................................... RRR_RR___RCRRK_R_RWRR_RRC....................................... RRKRCRRRRRRRRKRRRRRRRRCRR....................................... ................................................................ ................................................................ ................................................................ ................................................................ Scoreboard Key: "_" Waiting for Connection, "S" Starting up, "R" Reading Request, "W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup, "C" Closing connection, "L" Logging, "G" Gracefully finishing, "I" Idle cleanup of worker, "." Open slot with no current process Srv PID Acc M CPU SS Req Conn Child Slot Client VHost Request 0-0 10320 1/718/718 K 399.14 0 42 0.0 0.21 0.21 69.171.178.24 mysite.com GET /home.php 0-0 10320 0/434/434 R 398.21 8 40 0.0 0.07 0.07 ? ? ..reading.. 0-0 10320 0/548/548 R 331.52 558 45 0.0 0.12 0.12 ? ? ..reading.. 0-0 10320 0/460/460 _ 399.05 0 42 0.0 0.08 0.08 98.129.101.123 mysite.com GET /home.php 0-0 10320 0/473/473 R 364.89 301 48 0.0 0.09 0.09 ? ? ..reading.. 0-0 10320 0/616/616 R 396.18 26 46 0.0 0.11 0.11 ? ? ..reading.. 0-0 10320 0/494/494 R 378.84 127 48 0.0 0.10 0.10 ? ? ..reading.. 0-0 10320 0/231/231 R 311.62 687 43 0.0 0.04 0.04 ? ? ..reading.. 0-0 10320 0/376/376 R 391.17 76 53 0.0 0.08 0.08 ? ? ..reading.. 0-0 10320 0/377/377 R 380.72 176 53 0.0 0.07 0.07 ? ? ..reading.. 0-0 10320 0/589/589 R 355.07 382 47 0.0 0.10 0.10 ? ? ..reading.. 0-0 10320 0/587/587 R 393.16 56 50 0.0 0.13 0.13 ? ? ..reading.. 0-0 10320 0/561/561 R 393.89 49 40 0.0 0.10 0.10 ? ? ..reading.. 0-0 10320 0/616/616 R 389.88 81 41 0.0 0.22 0.22 ? ? ..reading.. 0-0 10320 0/354/354 R 392.12 59 42 0.0 0.06 0.06 ? ? ..reading.. 0-0 10320 0/346/346 R 377.51 192 44 0.0 0.06 0.06 ? ? ..reading.. 

Why are there so many in that R state? And does the SS column really mean that they have been reading for 558, 301, etc seconds?

Here's some of my server info and settings:

Server version: Apache/2.2.24 (Unix) Server built: Mar 22 2013 10:57:57 Cpanel::Easy::Apache v3.18.9 rev9999 Server's Module Magic Number: 20051115:31 Server loaded: APR 1.4.6, APR-Util 1.4.1 Compiled using: APR 1.4.6, APR-Util 1.4.1 Architecture: 64-bit Server MPM: Worker threaded: yes (fixed thread count) forked: yes (variable process count) Server compiled with.... -D APACHE_MPM_DIR="server/mpm/worker" -D APR_HAS_SENDFILE -D APR_HAS_MMAP -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled) -D APR_USE_SYSVSEM_SERIALIZE -D APR_USE_PTHREAD_SERIALIZE -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT -D APR_HAS_OTHER_CHILD -D AP_HAVE_RELIABLE_PIPED_LOGS -D DYNAMIC_MODULE_LIMIT=128 -D HTTPD_ROOT="/usr/local/apache" -D SUEXEC_BIN="/usr/local/apache/bin/suexec" -D DEFAULT_SCOREBOARD="logs/apache_runtime_status" -D DEFAULT_ERRORLOG="logs/error_log" -D AP_TYPES_CONFIG_FILE="conf/mime.types" -D SERVER_CONFIG_FILE="conf/httpd.conf" StartServers 5 <IfModule prefork.c> MinSpareServers 5 MaxSpareServers 10 </IfModule> ServerLimit 300 MaxClients 300 MaxRequestsPerChild 10000 KeepAlive On KeepAliveTimeout 1 MaxKeepAliveRequests 100 Here are compiled in modules: core.c mod_authn_file.c mod_authn_dbm.c mod_authn_dbd.c mod_authn_default.c mod_authn_alias.c mod_authz_host.c mod_authz_groupfile.c mod_authz_user.c mod_authz_dbm.c mod_authz_default.c mod_auth_basic.c mod_auth_digest.c mod_file_cache.c mod_cache.c mod_disk_cache.c mod_include.c mod_filter.c mod_deflate.c mod_log_config.c mod_log_forensic.c mod_logio.c mod_env.c mod_mime_magic.c mod_expires.c mod_headers.c mod_unique_id.c mod_setenvif.c mod_version.c mod_proxy.c mod_proxy_connect.c mod_proxy_ftp.c mod_proxy_http.c mod_proxy_scgi.c mod_proxy_ajp.c mod_proxy_balancer.c mod_ssl.c worker.c http_core.c mod_mime.c mod_dav.c mod_status.c mod_autoindex.c mod_asis.c mod_info.c mod_suexec.c mod_cgid.c mod_dav_fs.c mod_dav_lock.c mod_negotiation.c mod_dir.c mod_actions.c mod_speling.c mod_userdir.c mod_alias.c mod_rewrite.c mod_so.c 

When we restart the reading requests slowly start building back up every time. Thanks for your help in advance!

UPDATE: here's netstat -pn:

tcp 0 0 198.57.162.52:80 54.235.31.205:41209 TIME_WAIT - tcp 0 0 198.57.162.52:80 166.147.121.170:63246 TIME_WAIT - tcp 0 0 198.57.162.52:80 50.128.171.238:50751 FIN_WAIT2 - tcp 0 0 198.57.162.52:80 89.241.135.84:24825 TIME_WAIT - tcp 0 0 198.57.162.52:80 205.142.167.225:33256 TIME_WAIT - tcp 0 0 198.57.162.52:80 76.127.87.244:56493 TIME_WAIT - tcp 0 0 198.57.162.52:80 205.153.156.222:21571 TIME_WAIT - tcp 0 0 198.57.162.52:80 108.12.147.14:61135 TIME_WAIT - tcp 0 0 198.57.162.52:80 166.217.157.32:1630 TIME_WAIT - tcp 0 0 198.57.162.52:443 71.22.102.71:4252 TIME_WAIT - 

Sample lsof output:

httpd 4092 nobody 25u IPv4 711278095 0t0 TCP 198-57-162-52.unifiedlayer.com:http->4.sub-70-193-66.myvzw.com:12471 (ESTABLISHED) httpd 4092 nobody 26u IPv4 711350284 0t0 TCP 198-57-162-52.unifiedlayer.com:http->c75-111-15-253.amrlcmta01.tx.dh.suddenlink.net:51298 (ESTABLISHED) httpd 4092 nobody 27u IPv4 711400163 0t0 TCP 198-57-162-52.unifiedlayer.com:http->108-235-145-129.lightspeed.irvnca.sbcglobal.net:56037 (FIN_WAIT2) httpd 4092 nobody 28u IPv4 711390092 0t0 TCP 198-57-162-52.unifiedlayer.com:http->c-98-239-167-37.hsd1.pa.comcast.net:57988 (ESTABLISHED) httpd 4092 nobody 29u IPv4 711206563 0t0 TCP 198-57-162-52.unifiedlayer.com:http->252.sub-70-199-129.myvzw.com:6179 (ESTABLISHED) httpd 4092 nobody 30u IPv4 711271036 0t0 TCP 198-57-162-52.unifiedlayer.com:https->5.71.20.78:62114 (ESTABLISHED) httpd 4092 nobody 31u IPv4 711385050 0t0 TCP 198-57-162-52.unifiedlayer.com:http->64.128.135.115:17547 (ESTABLISHED) httpd 4092 nobody 32u IPv4 711400374 0t0 TCP 198-57-162-52.unifiedlayer.com:https->mbb5336d0.tmodns.net:36277 (ESTABLISHED) httpd 4092 nobody 33u IPv4 711378965 0t0 TCP 198-57-162-52.unifiedlayer.com:http->us-chi-exchange.brightstarcorp.com:47707 (ESTABLISHED) httpd 4092 nobody 34u IPv4 711394997 0t0 TCP 198-57-162-52.unifiedlayer.com:https->209-152-137-170.dsl.mcmo.net:64370 (ESTABLISHED) httpd 4092 nobody 35u IPv4 711391122 0t0 TCP 198-57-162-52.unifiedlayer.com:http->c-76-125-136-230.hsd1.oh.comcast.net:57093 (ESTABLISHED) httpd 4092 nobody 36u IPv4 711398994 0t0 TCP 198-57-162-52.unifiedlayer.com:http->lookup.eatright.org:60021 (ESTABLISHED) httpd 4092 nobody 37u IPv4 711400972 0t0 TCP 198-57-162-52.unifiedlayer.com:http->modemcable126.224-130-66.mc.videotron.ca:55970 (ESTABLISHED) httpd 4092 nobody 38u IPv4 711379727 0t0 TCP 198-57-162-52.unifiedlayer.com:http->c-75-65-146-83.hsd1.ms.comcast.net:50309 (ESTABLISHED) httpd 4092 nobody 39u IPv4 711390427 0t0 TCP 198-57-162-52.unifiedlayer.com:https->66-87-130-177.pools.spcsdns.net:41127 (ESTABLISHED) httpd 4092 nobody 40u IPv4 711401179 0t0 TCP 198-57-162-52.unifiedlayer.com:http->8.225.196.132:20608 (ESTABLISHED) httpd 4092 nobody 41u IPv4 711142701 0t0 TCP 198-57-162-52.unifiedlayer.com:http->c-24-15-33-77.hsd1.il.comcast.net:54631 (ESTABLISHED) 
20
  • 1
    What does netstat -pn show? Commented Apr 29, 2013 at 19:09
  • It shows a bunch of requests in the TIME_WAIT state - I've updated with excerpt above. Commented Apr 29, 2013 at 19:22
  • The high values for SS indicate that the connections in TIME_WAIT have nothing to do with the workers in 'R' state. Commented Apr 29, 2013 at 19:37
  • What do the apache (error)-logs show? Commented Apr 29, 2013 at 19:38
  • In the process table of this server, do you see as much (php)-processes as 'R'-state workers? "ps auxww | grep php" Commented Apr 29, 2013 at 19:39

2 Answers 2

17
+50

@ShaneN Your server is reaching choke of resources point-of-failure. Maybe you're getting without ephemeral port range, or without system file-descriptors, which limits the number of sockets your server can use to answer requests. A reading state request means that the socket was opened, but the request line never reached the server, it's a known form of DoS called SlowLoris. To mitigate this kind of DoS, I'd first:

Add to /etc/sysctl.conf, the following entries:

net.ipv4.tcp_fin_timeout = 10 which sets the timeout of FIN sockets to 10seconds.

net.ipv4.ip_local_port_range = "15000 61000" which sets the ephemeral port range to about 46000 possible open ports per IP.

net.core.somaxconn = 1024 - default 128 - The net.core.somaxconn value has an important role. It limits the maximum number of requests queued to a listen socket.

net.core.netdev_max_backlog = 2000 - default 1000 - which increases the TCP queue of incoming requests.

After adding those lines to the file, issue sysctl -p to load the kernel parameters at the current runtime and check your rush hour to see if your problem's still there.

3
  • 1
    Wow, great instructions Marcel - will try this and let you know how it works... Commented May 6, 2013 at 22:19
  • 4
    Marcel - thanks for your answer. We tried the steps above, and it did not solve the problem. But the SlowLoris info led us to an apache plugin Mod ReqTimeout (httpd.apache.org/docs/2.4/mod/mod_reqtimeout.html) that allowed us to set timeouts on the requests. Problem appears to be solved! Thank you! Commented May 8, 2013 at 21:55
  • Nice that you've sort things out! But hey, you don't necessarily need this mod if you tweak the timeouts variables that are already available for you. IMO just tweaking your timeouts in KeepAlive settings (which is used by SlowLoris to thrash servers) and global timeout vars should do the trick! ;) Commented May 9, 2013 at 0:13
0

This may be a denial-of-service attack meant to occupy listening sockets and tie up your server. If you can determine that to be the case, I'd suggest trying a frontend proxy such as nginx to take the stress off of Apache. Proxies like this can handle a massive amount of connections and will at least get you some breathing room to figure out what's going on.

2
  • We pointed our domain name to another server and the same behaviour is happening. Do most DOS attacks happen via IP or domain? Commented May 2, 2013 at 17:53
  • DOS attacks on L7 are more effective today - 50/50, Scans - IP Commented May 5, 2013 at 3:18

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.