
Recently, we have been noticing CPU spikes in our production environment caused by Redis, which can be seen below:

[Screenshot: htop output showing the CPU spikes from the Redis process]

To combat this issue, I have been restarting the redis server about twice a day :( which is obviously far from ideal. I'd like to identify the root cause.

Here are some things I have looked into so far:
1) Looked into any anomalies in the redis log file. The following seems suspicious:

[Screenshot: excerpt from the Redis log showing the suspicious entries]

2) Reviewed the nginx access logs to see if we are experiencing unusually high traffic. The answer is no.

3) New Relic revealed that the issue started on Nov 21st, 2016 (about a month ago), but no code was released around that time.

Here are some details about our setup:

Redis server: Redis server v=2.8.17 sha=00000000:0 malloc=jemalloc-3.6.0 bits=64 build=64a9cf396cbcc4c7

PHP: 5.3.27 with fpm

Redis configuration:

    daemonize yes
    pidfile /var/run/redis/redis.pid
    port 6379
    timeout 0
    tcp-keepalive 0
    loglevel notice
    logfile /var/log/redis/redis.log
    syslog-enabled yes
    databases 16
    save 900 1
    save 300 10
    save 60 10000
    stop-writes-on-bgsave-error no
    rdbcompression yes
    rdbchecksum yes
    dbfilename redis.rdb
    dir /var/lib/redis/
    slave-serve-stale-data yes
    slave-read-only yes
    repl-disable-tcp-nodelay no
    slave-priority 100
    maxmemory 15GB
    appendonly no
    appendfsync everysec
    no-appendfsync-on-rewrite no
    auto-aof-rewrite-percentage 100
    auto-aof-rewrite-min-size 64mb
    lua-time-limit 5000
    slowlog-max-len 128
    notify-keyspace-events ""
    hash-max-ziplist-entries 512
    hash-max-ziplist-value 64
    list-max-ziplist-entries 512
    list-max-ziplist-value 64
    set-max-intset-entries 512
    zset-max-ziplist-entries 128
    zset-max-ziplist-value 64
    activerehashing yes
    client-output-buffer-limit normal 0 0 0
    client-output-buffer-limit slave 256mb 64mb 60
    client-output-buffer-limit pubsub 32mb 8mb 60
    hz 10
    aof-rewrite-incremental-fsync yes
    include /etc/redis/conf.d/local.conf

Framework: Magento 1.7.2 with Cm_Cache_Backend_Redis

Please let me know if, given the above information, there is anything I can do to mitigate the high CPU usage.

  • I just realized: the problem looks like it is with the yam command. Any idea what that is? The redis server process is typically named redis-server. Commented Dec 15, 2016 at 14:19
  • Yikes! The only other reference I could find to yam was stackoverflow.com/questions/37897728/aws-unnecessary-script Commented Dec 15, 2016 at 14:25
  • Alright, yam looks like it is for a yum/apt mirror. rpmfind.net/linux/rpm2html/search.php?query=yam That said, definitely check for a security breach, because it looks like your redis is accessible to the world without authentication. Commented Dec 15, 2016 at 14:36
  • Hey @2ps, thanks so much for the details, it is very helpful. I have noticed that the command column in top for the redis process is sometimes yam and sometimes redis-server. I am wondering if you have any input as to how exactly this yam command gets triggered. How would I get to the bottom of this? Our server uses SSH keys for user login, but I just confirmed our redis is accessible to the outside world by simply specifying the host. YIKES. That being said, how would someone with access to redis be able to configure it to run yam? Commented Dec 15, 2016 at 15:59
  • Check /opt/yam/yam to see if it exists. If it does, you are likely compromised. Also check /root/.ssh/authorized_keys and make sure only SSH keys that you know about are there. As for the vector of compromise, here is a proposed hack http://antirez.com/news/96 that can be used to download a script to your computer and run it periodically. You’ll also want to check each user's crontab and the global crontab to make sure yam does not appear there. Commented Dec 15, 2016 at 16:10

1 Answer


VERY IMPORTANT UPDATE:

Your server may have been hacked. It's not Redis that is causing the high CPU usage, but a separate command called yam (take a look at the far right of your htop output; I missed it the first time). The yam command is used in a well-known exploit of Redis and often results in high CPU usage. You'll want to double-check to make sure your server is secure.

Here are some articles and links you can refer to if you want to learn more about the vulnerability and how to secure yourself:

  • http://antirez.com/news/96 (the write-up mentioned in the comments above, which shows how an unauthenticated, publicly reachable Redis instance can be exploited)
  • https://redis.io/topics/security (the official Redis security documentation)

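If it helps, here is a rough sketch of the checks described above and in the comments. The /opt/yam/yam path, the crontab checks, and the authorized_keys check come from the comments; YOUR_PUBLIC_IP is a placeholder for your server's address, and the redis.conf path assumes the /etc/redis layout from your setup.

    # Look for the dropped binary mentioned in the comments
    ls -la /opt/yam/yam

    # Check the global crontab and every user's crontab for anything referencing yam
    cat /etc/crontab
    for u in $(cut -d: -f1 /etc/passwd); do crontab -l -u "$u" 2>/dev/null; done

    # Make sure only SSH keys you know about are authorized for root
    cat /root/.ssh/authorized_keys

    # Confirm whether Redis answers unauthenticated from the outside
    redis-cli -h YOUR_PUBLIC_IP ping

    # Short-term lockdown, pending a proper upgrade: bind Redis to localhost
    # and/or require a password by adding these to /etc/redis/redis.conf and restarting
    #   bind 127.0.0.1
    #   requirepass <long random password>
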
Here is my checklist for magento/redis, er, performance issues:

  1. Make sure you are on a newish version of Redis, like 3.2; I personally prefer redis32u from the IUS repository if you are on CentOS.
  2. Check the size of your Redis database (the RDB file should be in /var/lib/redis) and make sure it is relatively small (see the sketch after this list for how to check items 2 through 5).
  3. Verify that you have enough RAM for Redis. You've specified a maxmemory of 15GB, which is really overkill for Magento; I typically use something closer to 256mb. If you are actually using that much Redis memory (!!!!!!), you likely have other problems in your Magento stack.
  4. Make sure you have the vm.overcommit_memory setting set via sysctl (see https://redis.io/topics/admin for more details on what you need).
  5. Make sure you have sufficient open file limits to handle the number of connections to Redis.

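Here is a minimal sketch of how you might verify items 2 through 5 above. The paths match the dir and dbfilename from your posted config; adjust them if your layout differs.

    # 2. Size of the on-disk dump and of the dataset in memory
    ls -lh /var/lib/redis/redis.rdb
    redis-cli info memory | grep used_memory_human
    redis-cli dbsize

    # 3. If you lower maxmemory (e.g. to 256mb), set it in redis.conf or at runtime:
    redis-cli config set maxmemory 256mb

    # 4. Memory overcommit -- Redis wants vm.overcommit_memory = 1
    sysctl vm.overcommit_memory
    sysctl -w vm.overcommit_memory=1   # add "vm.overcommit_memory = 1" to /etc/sysctl.conf to persist

    # 5. Open file limit of the running redis-server process vs. current connections
    cat /proc/$(pgrep -o redis-server)/limits | grep "open files"
    redis-cli info clients | grep connected_clients
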
Generally speaking, the log file isn't suspicious, because your Redis save settings tell Redis to save every minute if there have been at least 10,000 writes, every five minutes if there have been at least 10 writes, and every 15 minutes if there has been at least 1 write. So at worst it is persisting the data back to disk roughly once a minute, which shouldn't be that burdensome.
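
For reference, these are the relevant save lines from your config, annotated with what they mean:

    save 900 1       # save after 15 minutes if at least 1 key changed
    save 300 10      # save after 5 minutes if at least 10 keys changed
    save 60 10000    # save after 1 minute if at least 10000 keys changed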
