
Leon Nunes for Kubernetes Community Days Chennai


Linux Troubleshooting - A simple primer

In the beginning!

Linux has been around for a long time, and most of the web runs on Linux or FreeBSD. Linux is everywhere, but there are a lot of things under the surface that one has to learn when troubleshooting Linux servers. Let's look at some of the commands I used when I was doing tech support for customers.

Commands

netcat/nc

Customers often say that their domain isn't working on a certain port, or that they need to host an application on a port but it isn't working. nc lets you check whether that port accepts connections.

$ sudo nc --verbose google.com 443
Ncat: Connected to 216.58.203.14:443.
Ncat: 0 bytes sent, 0 bytes received in 0.10 seconds.

--verbose  Verbose output
-z         Zero-I/O mode, report connection status only
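When a customer reports several ports at once, the same check can be looped. This is only a sketch: check_ports is a name I made up, example.com is a placeholder host, and it assumes your nc supports -z and -w (Ncat and the OpenBSD netcat both do).

```shell
# check_ports HOST PORT...: report whether each TCP port accepts a connection.
# Hypothetical helper, not part of nc itself.
check_ports() {
    host=$1
    shift
    for port in "$@"; do
        # -z: connect and close without sending data; -w 3: give up after 3 seconds
        if nc -z -w 3 "$host" "$port" >/dev/null 2>&1; then
            echo "$host:$port open"
        else
            echo "$host:$port closed or filtered"
        fi
    done
}
# Usage: check_ports example.com 80 443 9001
```

A "closed or filtered" result does not tell you whether the application is down or a firewall is dropping the traffic, so check from the server itself as well.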

ss

From the man page (because even I didn't know this): ss is "another utility to investigate sockets". Say you have an application running on port 9001. How do you know if it's actually listening for connections, and whether it's listening on all interfaces (0.0.0.0)? ss helps you figure that out.

ss -patun | grep -w 9001
tcp LISTEN 0 4096 *:9001 *:* users:(("rootlessport",pid=691027,fd=11))

As you can see, this command gives you the protocol, whether the socket is listening or not, the address it is bound to (*:9001 means all interfaces), and the application that is using it, along with its process ID (PID) and file descriptor (FD).
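If you only care about the "all interfaces or just one?" question, a small filter can read the ss output for you. A rough sketch: bind_scope is a hypothetical helper, and it assumes the first ADDRESS:PORT field on a line is the local address, which holds for listening sockets.

```shell
# bind_scope PORT: read `ss -ltn`-style lines on stdin and say how PORT is bound.
bind_scope() {
    awk -v port="$1" '
    {
        for (i = 1; i <= NF; i++) {
            if ($i ~ (":" port "$")) {      # first ADDRESS:PORT field is the local one
                addr = $i
                sub(/:[0-9]+$/, "", addr)   # strip the port, keep the address
                if (addr == "*" || addr == "0.0.0.0" || addr == "[::]")
                    print port " listening on all interfaces (" addr ")"
                else
                    print port " listening on " addr " only"
                break
            }
        }
    }'
}
# Usage: ss -H -ltn | bind_scope 9001
```

An application bound to 127.0.0.1 only is a very common reason for "it works on the server but not from outside".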

Atop

Atop Process manager
Ever had a customer's server, or your own, run out of memory with nothing to pinpoint the reason? atop solves that for you. By default, Linux servers do not store any history of running processes; atop does.
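To look at that history, atop can replay one of its log files with -r. A minimal sketch, assuming the logs live in /var/log/atop/atop_YYYYMMDD (the usual packaged default, but check your distribution); atop_replay is a hypothetical wrapper.

```shell
# atop_replay [YYYYMMDD]: open the atop log for a given day (default: today).
atop_replay() {
    day=${1:-$(date +%Y%m%d)}
    log="/var/log/atop/atop_$day"   # assumed default log location
    if [ -r "$log" ]; then
        atop -r "$log"              # -r: read samples from a raw log file
    else
        echo "no readable atop log at $log" >&2
        return 1
    fi
}
# Usage: atop_replay 20240115
```

Inside the replay, t moves forward one sample and T moves back, so you can step up to the moment the server ran out of memory.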

vmstat, pidstat, iostat.

vmstat displays virtual memory statistics, along with process, swap, I/O and CPU activity.
Vmstat output details

vmstat -w -S M 1 9
--procs-- -----------------------memory---------------------- ---swap-- -----io---- -system-- --------cpu--------
   r    b         swpd         free         buff        cache   si   so    bi    bo   in   cs  us  sy  id  wa  st
   2    0         3545         5650          121         3052    0    0    69   183   61  216  22   9  68   0   0
   0    0         3545         5649          121         3052    0    0     0  1556 1466 4690   2   2  96   0   0
   0    0         3545         5649          121         3053    0    0     0   108 1160 4263   1   1  98   0   0
   0    0         3545         5649          121         3052    0    0     0    68 1584 5925   2   1  96   0   0
   0    0         3545         5648          121         3052    0    0     0    20 1213 4872   1   1  98   0   0
   0    0         3545         5649          121         3052    0    0     0     0  995 4079   1   1  99   0   0
   1    0         3545         5650          121         3052    0    0     0    88 1465 5533   2   1  97   0   0
   0    0         3545         5650          121         3052    0    0     0     0 1290 5131   1   1  98   0   0
   0    0         3545         5650          121         3052    0    0     0    96 1855 6303   2   2  96   0   0
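Staring at nine rows of numbers gets old quickly, so here is a sketch that boils vmstat output down to one line. vmstat_summary is a hypothetical helper; it assumes the column order shown above (si and so in columns 7 and 8, id in column 15).

```shell
# vmstat_summary: read vmstat output on stdin, print sample count, average
# idle CPU, and how many samples showed swap-in/swap-out activity.
vmstat_summary() {
    awk '
    $1 ~ /^[0-9]+$/ {                 # data rows only; skips the two header lines
        n++
        idle += $15
        if ($7 > 0 || $8 > 0) swapping++
    }
    END {
        if (n == 0) { print "no samples"; exit }
        printf "samples=%d avg_idle=%.1f%% swapping_samples=%d\n", n, idle / n, swapping
    }'
}
# Usage: vmstat -w -S M 1 9 | vmstat_summary
```

Non-zero si/so on a box that is "slow for no reason" usually means it is swapping, which is worth knowing before you look anywhere else.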

Then there is iostat, which gives you CPU utilisation and per-device I/O statistics, so you can see how much pressure each disk is under.

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          22.3%    0.0%    9.2%    0.1%    0.0%   68.4%

      tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd Device
     7.09        29.8k       143.2k         0.0k       5.4G      26.1G       0.0k sda
     1.74        33.9k        25.6k         0.0k       6.2G       4.7G       0.0k sdb

More details here

And then there is pidstat, which gives you per-process statistics, such as which process is using a lot of memory and CPU. More Details

There are many such tools in the sysstat package.
There is a really nice article by Netflix; read it here.

Using the system journal correctly will save you a lot of headaches.
A few journal commands I use:

journalctl --since=today -g oom 

This essentially greps the journal for the keyword oom. No more journalctl | grep, please.

journalctl -u httpd.service --since=today

This will give you details about the httpd service only.
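These filters combine nicely. As a sketch, the wrapper below shows only error-level messages for a single unit; unit_errors is a name I made up, while -u, -p, --since and --no-pager are standard journalctl options.

```shell
# unit_errors UNIT [SINCE]: show error-level (and worse) messages for one unit.
unit_errors() {
    unit=$1
    since=${2:-today}
    # -u: limit to one systemd unit; -p err: priority "err" and more severe
    journalctl -u "$unit" -p err --since "$since" --no-pager
}
# Usage: unit_errors httpd.service "1 hour ago"
```

Filtering by priority first keeps the output short enough to actually read when a service has been logging all day.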

Checking Disk space.

This is often overlooked. The quickest way to check disk space is:

$ df -Th 

That's it, nothing more.
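On a server with dozens of mounts you may only want the ones that are nearly full. A minimal sketch, assuming the POSIX output format of df -P (capacity in column 5, mount point in column 6, no spaces in mount points); full_filesystems is a hypothetical helper.

```shell
# full_filesystems [PERCENT]: read `df -P` output on stdin and list mount
# points at or above the given usage threshold (default 90).
full_filesystems() {
    threshold=${1:-90}
    awk -v max="$threshold" '
    NR > 1 {                          # skip the header line
        use = $5
        sub(/%/, "", use)             # "95%" -> "95"
        if (use + 0 >= max) print $6 " is " $5 " full"
    }'
}
# Usage: df -P | full_filesystems 80
```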

Need to find the files that are eating up disk space? Not an issue.

$ sudo du -ach / | awk '$1 ~/[G]/ {print}' 

This prints the files and directories whose size is in gigabytes. You can also use find to locate large files.
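For the find route, something like the following should do. It is a sketch rather than the one true way: big_files is a hypothetical name, and size suffixes such as k, M and G are a GNU/BSD find extension rather than POSIX.

```shell
# big_files [DIR] [SIZE]: list regular files under DIR larger than SIZE.
big_files() {
    dir=${1:-/}
    size=${2:-1G}
    # -xdev: stay on one filesystem; errors from unreadable directories are discarded
    find "$dir" -xdev -type f -size "+$size" 2>/dev/null
}
# Usage: big_files /var 500M
```

Run it as root if you want to see everything. Unlike the du pipeline, this lists files only, so directories never show up in the output.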

top

The easiest way to check server load is to launch top. The top command is an essential tool in Linux troubleshooting.

Checking Server port usage.

If you ever notice a random port in use and want to know which process is occupying it, along with the files it has open, simply run:

lsof -i :9001 
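lsof prints one line per open socket, so a busy port can produce a lot of output. The sketch below collapses it to the distinct processes involved; port_owners is a hypothetical helper and assumes the default lsof columns (COMMAND first, PID second).

```shell
# port_owners: read `lsof -i` output on stdin and list each distinct
# command and PID once.
port_owners() {
    awk 'NR > 1 { print $1 " (pid " $2 ")" }' | sort -u
}
# Usage: sudo lsof -nP -i :9001 | port_owners
```

The -n and -P options stop lsof from resolving host names and port names, which makes it noticeably faster on a server with broken DNS.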

Checking DNS resolution

DNS is one of the most important things when it comes to servers and domains.

dig a google.com
;; ANSWER SECTION:
google.com. 188 IN A 142.250.76.206

This will tell you whether your DNS resolution is working.
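For a quick yes/no answer, dig +short prints just the records. A small sketch: resolves is a hypothetical wrapper, and the optional second argument is a resolver to query directly (handy for comparing your server's resolver against a public one).

```shell
# resolves NAME [SERVER]: print the A records for NAME, or fail if none come back.
resolves() {
    name=$1
    # ${2:+"@$2"} adds "@SERVER" only when a second argument was given
    answer=$(dig +short A "$name" ${2:+"@$2"})
    if [ -n "$answer" ]; then
        echo "$name resolves to:"
        echo "$answer"
    else
        echo "$name: no A record returned" >&2
        return 1
    fi
}
# Usage: resolves google.com
#        resolves google.com 8.8.8.8
```

If the name resolves through a public resolver but not through the server's own, the problem is the local resolver configuration, not the domain.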

Checking How much Memory is left

The free command is used to check memory usage:

free -h
               total        used        free      shared  buff/cache   available
Mem:            15Gi       7.2Gi       4.5Gi       1.5Gi       3.6Gi       6.2Gi
Swap:           10Gi       3.4Gi       7.4Gi
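The column to watch is available, not free, since it accounts for cache the kernel can reclaim. As a sketch, this turns it into a percentage; mem_available_pct is a hypothetical helper and assumes a procps version of free that prints the available column (column 7 of the Mem: line).

```shell
# mem_available_pct: read `free -m` output on stdin and print available
# memory as a percentage of total.
mem_available_pct() {
    awk '/^Mem:/ && $2 > 0 { printf "%.0f%% of memory available\n", $7 / $2 * 100 }'
}
# Usage: free -m | mem_available_pct
```

Use free -m rather than free -h here, because the human-readable units cannot be divided directly.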

That's all. There are probably a few more commands I use; if I remember them, I will let you all know.

In case you would like to chat with me or have a discussion, I'm always available at @mediocreDevops.
