
Description of the problem

Regularly, cron PHP processes crash on our production server, which results in emails with the following body:

    PHP Fatal error: PHP Startup: apc_mmap: mmap failed: in Unknown on line 0
    Segmentation fault (core dumped)

I think the Segmentation fault (core dumped) should result in core files being handled by apport and then written to /var/crash, but the files I can see there date from yesterday, although the last crash occurred today:

    -rw-r----- 1 root        whoopsie  1138528 mai 22 04:09 _usr_bin_php5.0.crash
    -rw-r----- 1 frontoffice whoopsie  1166373 mai 20 18:00 _usr_bin_php5.1005.crash
    -rw-r----- 1 frontoffice whoopsie 81622658 mai 22 00:05 _usr_sbin_php5-fpm.1005.crash

I tried to download the last one anyway, and ran gdb /usr/sbin/php5-fpm /tmp/_usr_sbin_php5-fpm.1005.crash, only to be told that the file is not a core file (its format was not recognized).

Here is the server's apc configuration :

    cat /etc/php5/cli/conf.d/20-apc.ini
    extension=apc.so
    apc.shm_size=512M
    apc.ttl=3600
    apc.user_ttl=3600
    apc.enable_cli=1

I'm mostly worried about the apc.shm_size… isn't it too high or too low ? I understand it has to do with the size of memory segments.

Question(s)

  1. What could be the problem ?
  2. How can I troubleshoot it (how can I get a valid core file ?) ?

System information

    free
                 total       used       free     shared    buffers     cached
    Mem:       5081296    4354684     726612          0     374744     959968
    -/+ buffers/cache:    3019972    2061324
    Swap:       522236     516888       5348

    cat /etc/lsb-release
    DISTRIB_ID=Ubuntu
    DISTRIB_RELEASE=12.04
    DISTRIB_CODENAME=precise
    DISTRIB_DESCRIPTION="Ubuntu 12.04.2 LTS"

    php -v
    PHP 5.4.17-1~precise+1 (cli) (built: Jul 17 2013 18:14:06)
    Copyright (c) 1997-2013 The PHP Group
    Zend Engine v2.4.0, Copyright (c) 1998-2013 Zend Technologies

php -i excerpt :

    Configuration

    apc

    APC Support => enabled
    Version => 3.1.13
    APC Debugging => Disabled
    MMAP Support => Enabled
    MMAP File Mask =>
    Locking type => pthread mutex Locks
    Serialization Support => php
    Revision => $Revision: 327136 $
    Build Date => Nov 20 2012 18:41:36

    Directive => Local Value => Master Value
    apc.cache_by_default => On => On
    apc.canonicalize => On => On
    apc.coredump_unmap => Off => Off
    apc.enable_cli => On => On
    apc.enabled => On => On
    apc.file_md5 => Off => Off
    apc.file_update_protection => 2 => 2
    apc.filters => no value => no value
    apc.gc_ttl => 3600 => 3600
    apc.include_once_override => Off => Off
    apc.lazy_classes => Off => Off
    apc.lazy_functions => Off => Off
    apc.max_file_size => 1M => 1M
    apc.mmap_file_mask => no value => no value
    apc.num_files_hint => 1000 => 1000
    apc.preload_path => no value => no value
    apc.report_autofilter => Off => Off
    apc.rfc1867 => Off => Off
    apc.rfc1867_freq => 0 => 0
    apc.rfc1867_name => APC_UPLOAD_PROGRESS => APC_UPLOAD_PROGRESS
    apc.rfc1867_prefix => upload_ => upload_
    apc.rfc1867_ttl => 3600 => 3600
    apc.serializer => default => default
    apc.shm_segments => 1 => 1
    apc.shm_size => 512M => 512M
    apc.shm_strings_buffer => 4M => 4M
    apc.slam_defense => On => On
    apc.stat => On => On
    apc.stat_ctime => Off => Off
    apc.ttl => 3600 => 3600
    apc.use_request_time => On => On
    apc.user_entries_hint => 4096 => 4096
    apc.user_ttl => 3600 => 3600
    apc.write_lock => On => On

    php -m
    [PHP Modules]
    apc
    bcmath
    bz2
    calendar
    Core
    ctype
    curl
    date
    dba
    dom
    ereg
    exif
    fileinfo
    filter
    ftp
    gd
    gettext
    hash
    iconv
    imagick
    intl
    json
    ldap
    libxml
    mbstring
    memcache
    memcached
    mhash
    mysql
    mysqli
    openssl
    pcntl
    pcre
    PDO
    pdo_mysql
    pdo_pgsql
    pdo_sqlite
    pgsql
    Phar
    posix
    Reflection
    session
    shmop
    SimpleXML
    soap
    sockets
    SPL
    sqlite3
    standard
    sysvmsg
    sysvsem
    sysvshm
    tidy
    tokenizer
    wddx
    xml
    xmlreader
    xmlwriter
    zip
    zlib

    [Zend Modules]

    ulimit -a
    core file size          (blocks, -c) 0
    data seg size           (kbytes, -d) unlimited
    scheduling priority             (-e) 0
    file size               (blocks, -f) unlimited
    pending signals                 (-i) 39531
    max locked memory       (kbytes, -l) 64
    max memory size         (kbytes, -m) unlimited
    open files                      (-n) 1024
    pipe size            (512 bytes, -p) 8
    POSIX message queues     (bytes, -q) 819200
    real-time priority              (-r) 0
    stack size              (kbytes, -s) 8192
    cpu time               (seconds, -t) unlimited
    max user processes              (-u) 39531
    virtual memory          (kbytes, -v) unlimited
    file locks                      (-x) unlimited
4
  • What is the output of cat /proc/sys/kernel/core_pattern Commented Jun 5, 2014 at 21:23
  • /usr/share/apport/apport %p %s %c Commented Jun 5, 2014 at 21:30
  • There should be a pipe symbol as the first character, is that not the case? Commented Jun 5, 2014 at 21:33
  • Yeah, sorry, I missed it when copy/pasting Commented Jun 6, 2014 at 11:44

2 Answers

1
+50

The core file needs to be read on a system that is at least very similar to the one where the crash happened. In particular, you need the same versions of the binary and of all the libraries involved in order for the pointers to line up. Usually it's easiest to run gdb on the machine where the crash happened. You'll also need versions of the binary and libraries that carry the symbolic data required to map addresses back to locations in the source files. That might mean installing the debug-symbol packages for the various libraries, but it depends on which Linux distribution you run.
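On the "not a core file" error from the question: the files apport leaves behind are apport reports that wrap the core together with other fields, so gdb cannot read them directly. Below is a minimal sketch of extracting the core first, assuming apport-unpack is available on the server. The function name unpack_crash and the scratch directory /tmp/fpm-crash are made up for illustration.

```shell
# Sketch: extract the raw core dump from an apport report so gdb can open it.
# Run it on the machine that crashed so binaries and libraries match.
unpack_crash() {
    report=$1   # path to the apport .crash report
    outdir=$2   # hypothetical scratch directory; should be empty or not exist yet
    if [ ! -r "$report" ]; then
        echo "cannot read report: $report" >&2
        return 1
    fi
    if ! command -v apport-unpack >/dev/null 2>&1; then
        echo "apport-unpack not found on this system" >&2
        return 1
    fi
    # apport-unpack writes one file per report field into the directory
    apport-unpack "$report" "$outdir" || return 1
    # the core, if the report contains one, ends up in the CoreDump file
    echo "$outdir/CoreDump"
}

# Usage (left commented out so that sourcing this has no side effects):
# core=$(unpack_crash /tmp/_usr_sbin_php5-fpm.1005.crash /tmp/fpm-crash) &&
#     gdb /usr/sbin/php5-fpm "$core"
```

Without debug symbols installed, the backtrace will probably show mostly raw addresses rather than function names.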

Are you sure you have the right version of APC installed? For example, that solved this person's problem: https://stackoverflow.com/questions/14756385/php-fatal-error-php-startup-apc-mmap-mmap-failed-in-unknown-on-line-0

Is APC failing for web processes as well as command line ones? If it only fails for one of those, then check that both php packages are the correct versions to work with your version of APC.

The first two dump files you listed look very small to me, at just over 1 MB. PHP would usually grow larger than that before it gets as far as running any of your code. That is consistent with a failure before any code is loaded, which is likely given that APC is involved. The fpm one is a web process, not a cron job (unless your cron calls PHP via the web interface?).

Setting apc.shm_size to 512M may or may not be optimal for efficiency, but I wouldn't expect it to be the cause of a segfault. Corrupt data in your APC cache could conceivably be the problem though, so I suggest you clear the cache. The normal way is to use the apc.php file, which is likely distributed with APC. Vendor distributions vary on that, but it is included with the upstream source code, so you should be able to get a copy easily enough. It gives you a web interface for looking at the state of your cache and for clearing it. If APC is failing to the point where that doesn't work, I'm not sure what the process is. Probably locate the cache, delete it, and reinstall APC if needed to rebuild it (a somewhat dirty approach, but low effort if you can afford a brief outage).

  • Thanks for the very complete answer. I think apc is failing for web processes only. I think the fpm crash is from an unrelated crash. I clear the cache the hard way: php5-fpm reload does the trick because we set the cache to be shared between the CLI and the FPM versions. I talked with my boss and he said we're going to upgrade soon to Ubuntu 14.04 (with newer php and apc versions). But your answer says a great deal about gdb and core files, so accepted and bounty granted. Thanks! Commented Jun 9, 2014 at 10:21
  • While not what you asked, it's worth noting that apc.php is the tool to look at for things like your cache hit rate, which is important for deciding if you have a sensible size apc.shm_size setting for efficiency. Commented Jun 10, 2014 at 14:02
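For the hit rate mentioned in the last comment, the arithmetic is simply hits divided by total lookups. A small sketch, assuming you read the hit and miss counters off apc.php by hand; the helper name hit_rate and the sample numbers are invented:

```shell
# Hit rate as a percentage, computed from hit and miss counters.
# The counters are whatever your cache reports; 900 and 100 are made-up values.
hit_rate() {
    awk -v h="$1" -v m="$2" 'BEGIN {
        total = h + m
        if (total == 0) { print "n/a"; exit }   # no lookups yet
        printf "%.1f\n", 100 * h / total
    }'
}

hit_rate 900 100   # prints 90.0
```

A persistently low rate on a cache that also reports being full would suggest apc.shm_size is too small for the workload; a high rate with plenty of free space would suggest it could be reduced.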
4

This should really be a comment, but it's a bit long

isn't it too high or too low ?

If you don't know how, then how should we? You haven't told us how much RAM and swap there is, how much is used for other stuff. You haven't told us how much of the APC memory is used before the system crashes.
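The RAM and swap figures can be gathered with free, or read straight from /proc/meminfo. A minimal sketch, assuming a Linux system; the helper name mem_summary is hypothetical and the optional argument only exists so the function can be pointed at a saved copy of the file:

```shell
# Print total and free RAM and swap, in kB, from /proc/meminfo
# (or from a saved copy passed as the first argument).
mem_summary() {
    awk '/^(MemTotal|MemFree|SwapTotal|SwapFree):/ { printf "%s %d kB\n", $1, $2 }' \
        "${1:-/proc/meminfo}"
}

# Only run against the live file where it exists.
if [ -r /proc/meminfo ]; then
    mem_summary
fi
```

Comparing those numbers with apc.shm_size shows whether the shared memory segment can plausibly fit alongside everything else running on the box.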

file is not a core file (its format was not recognized).

Have you checked the ulimit? Most likely the file has been truncated. Regardless, a segmentation fault suggests an issue within PHP itself (or APC, or another extension). Were you planning on fixing it yourself? Don't get me wrong: the people who write this stuff will welcome well-researched and documented bug reports, but the first things you should be looking at (and including in your question here) are the version of PHP, the extensions installed and the version of APC.
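The limit in question is the core file size, the "-c" row of ulimit -a. A short sketch of checking it from the shell that launches PHP; the function name core_limit_status is made up, and the commented commands are the usual follow-ups rather than something taken from this thread:

```shell
# Report the soft core-file size limit of the current shell.
core_limit_status() {
    limit=$(ulimit -c)
    case "$limit" in
        0)         echo "core dumps disabled (limit 0)" ;;
        unlimited) echo "core dumps unlimited" ;;
        *)         echo "core dumps capped at $limit blocks" ;;
    esac
}

core_limit_status

# To lift the cap for this shell and its children before reproducing the crash:
# ulimit -c unlimited
# To see where the kernel sends cores (a leading | means they are piped to a program):
# cat /proc/sys/kernel/core_pattern
```

Limits are per process and inherited, so for a cron job the limit that matters is the one in effect inside the cron environment, which may differ from your login shell.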

  • I knew my question lacked information, but I had no idea what would matter and what wouldn't. I'm going to provide it, and then, if nothing looks obviously wrong with it, I'll file a bug report. Commented May 23, 2014 at 9:01
  • Also, I'd like to have a core file to attach to my bug report, this is an important part of my question. Commented May 23, 2014 at 9:02
  • Regarding the ulimit, what exactly does it have to do with the problem? I don't know ulimit well… Commented May 23, 2014 at 9:06
  • man ulimit is a good place to start. A core file won't be very welcome on a bug report - just the stack trace. Commented May 23, 2014 at 14:43
  • I had a look at it before posting the last comment, but I still don't get which limit I should change, and which file you're talking about when you say "the file has been truncated". As for the stack trace, I don't have one, which is why I'm a bit reluctant to file a bug report. Technically, I do have one, but "in Unknown on line 0" is not really helpful, is it? Commented May 23, 2014 at 14:58
