2

Environment: Ubuntu 10.04 LTS, Passenger, Nginx 1.0.6, MySQL, Ruby 1.9.2, Rails 3.1

After the server has been running for a while, it accumulates a gradually increasing number of processes stuck at 100% CPU:

 PID USER   PR NI VIRT RES  SHR  S %CPU %MEM TIME+    COMMAND
2393 avitus 20 0  496m 381m 1392 R 100  9.4  25:10.74 Rack: /home/web ...

Running strace on any of the stuck PIDs gives the following:

Process 2393 attached with 3 threads - interrupt to quit
[pid 2396] futex(0x8ca80e4, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2394] restart_syscall(<... resuming interrupted call ...>) = -1 ETIMEDOUT (Connection timed out)
[pid 2394] gettimeofday({1322590778, 346573}, NULL) = 0
[pid 2394] futex(0x821db60, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 2394] clock_gettime(CLOCK_REALTIME, {1322590778, 346885177}) = 0
[pid 2394] futex(0x821db84, FUTEX_WAIT_PRIVATE, 33872659, {0, 9687823}) = -1 ETIMEDOUT (Connection timed out)
[pid 2394] gettimeofday({1322590778, 356921}, NULL) = 0
[pid 2394] futex(0x821db60, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 2394] clock_gettime(CLOCK_REALTIME, {1322590778, 357196244}) = 0
[pid 2394] futex(0x821db84, FUTEX_WAIT_PRIVATE, 33872661, {0, 9724756}) = -1 ETIMEDOUT (Connection timed out)
[pid 2394] gettimeofday({1322590778, 367240}, NULL) = 0
[pid 2394] futex(0x821db60, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 2394] clock_gettime(CLOCK_REALTIME, {1322590778, 367459723}) = 0
[pid 2394] futex(0x821db84, FUTEX_WAIT_PRIVATE, 33872663, {0, 9780277}) = -1 ETIMEDOUT (Connection timed out)
[pid 2394] gettimeofday({1322590778, 377586}, NULL) = 0
[pid 2394] futex(0x821db60, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 2394] clock_gettime(CLOCK_REALTIME, {1322590778, 377807840}) = 0
[pid 2394] futex(0x821db84, FUTEX_WAIT_PRIVATE, 33872665, {0, 9778160}) = -1 ETIMEDOUT (Connection timed out)
[pid 2394] gettimeofday({1322590778, 387932}, NULL) = 0
[pid 2394] futex(0x821db60, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 2394] clock_gettime(CLOCK_REALTIME, {1322590778, 388162450}) = 0
[pid 2394] futex(0x821db84, FUTEX_WAIT_PRIVATE, 33872667, {0, 9769550}) = -1 ETIMEDOUT (Connection timed out)

Adding the -c flag to strace gives:

Process 2393 attached with 3 threads - interrupt to quit
Process 2393 detached
Process 2394 detached
Process 2396 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 94.97    0.003172           2      1489       744 futex
  3.74    0.000125           0       745           clock_gettime
  1.29    0.000043           0       745           gettimeofday
  0.00    0.000000           0         1         1 restart_syscall
------ ----------- ----------- --------- --------- ----------------
100.00    0.003340                  2980       745 total

I can kill -9 the stuck processes, and the application and server appear to carry on happily. I've run out of ideas for debugging this, so any advice on the cause or on other avenues of investigation would be great to hear.

1
  • Hello Andy, have you found the reason for this problem? I get the same problem. Commented Oct 8, 2012 at 9:13

3 Answers

2

Try setting passenger_spawn_method to conservative in Passenger. I'm having this issue with Mongo and came across:

http://code.google.com/p/phusion-passenger/issues/detail?id=684

and:

https://github.com/rails/rails/issues/1339

I don't know why it's not working, but hopefully that will get you going if you haven't figured out the solution already.
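For reference, a minimal sketch of where that setting might go, assuming Passenger is loaded as an Nginx module and configured in the http block of nginx.conf (the placement and the surrounding block are assumptions; the directive name and value are the ones mentioned above):

```nginx
http {
    # Assumed placement: alongside the existing passenger_root / passenger_ruby lines.
    # Spawn each application process from scratch instead of forking a preloaded copy.
    passenger_spawn_method conservative;
}
```

Nginx has to be reloaded or restarted before the change takes effect.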

0

That particular behavior (waiting on a futex with a timeout of roughly 10 ms, going by the trace in the question, and then checking the time of day) appears to be the normal behavior for an idling Ruby process:

http://www.ruby-forum.com/topic/192255
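The wake-up interval can be measured from a captured trace rather than read off by eye. The following is a rough sketch, assuming strace output in the format shown in the question; the function name wake_intervals is made up for illustration.

```shell
# Hypothetical helper: read strace output on stdin and print the gap,
# in milliseconds, between consecutive gettimeofday calls.
wake_intervals() {
  # Pull out each "gettimeofday({sec, usec}" match, even if several share a line.
  grep -o 'gettimeofday({[0-9]*, [0-9]*}' |
    # Keep only digits and spaces, leaving "sec usec" on each line.
    sed 's/[^0-9 ]//g' |
    awk '{
      if (NR > 1) printf "%.1f\n", ($1 - psec) * 1000 + ($2 - pusec) / 1000
      psec = $1; pusec = $2
    }'
}

# Possible usage against a live process (strace writes to stderr):
#   strace -f -p 2393 2>&1 | wake_intervals
```

Fed the trace from the question, this reports gaps of about 10.3 ms between wake-ups.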

0

Try running the following command on your server, which sets the system clock to the time it currently reports:

sudo date -s "`date`"

Source: http://www.redmine.org/boards/2/topics/31731
