It is always crucial to understand an issue before trying to fix it. Troubleshooting works best when you follow the right approach, ideally a step-by-step process, whether you are a Software Developer, DevOps Engineer, or Architect. Unix/Linux is widely used, so you should know the common issues and the correct approach to resolving them.
Let’s discuss a few of them:
Issue 1: Server is not reachable or unable to connect
Approach / Solution:
├── Ping the server by Hostname and IP Address
│   ├── Hostname/IP Address is pingable
│   │   ├── The issue might be on the client side, as the server is reachable
│   ├── Hostname is not pingable but IP Address is pingable
│   │   ├── Could be a DNS issue
│   │   │   ├── Check /etc/hosts
│   │   │   ├── Check /etc/resolv.conf
│   │   │   ├── Check /etc/nsswitch.conf
│   │   │   ├── (Optional) DNS can also be defined in /etc/sysconfig/network-scripts/ifcfg-<interface>
│   ├── Hostname and IP Address are both not pingable
│   │   ├── Check another server on the same network to see whether this is a network-wide issue
│   │   │   ├── False: The issue is not network-wide but specific to that host/server
│   │   │   ├── True: Might be an overall network-side issue
│   │   ├── Log in to the server through the Virtual Console, if the server is powered on, and check the uptime
│   │   ├── Check that the server has its IP address and that the network interface is UP
│   │   │   ├── (Optional) Also check the IP-related information in /etc/sysconfig/network-scripts/ifcfg-<interface>
│   │   ├── Ping the gateway, and check the routes
│   │   ├── Check SELinux and the firewall rules
│   │   ├── Check the physical cable connection
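The first level of the tree above can be sketched as a small shell helper. This is only an illustration, assuming Linux `ping` (iputils) with the `-c` and `-W` options; the function names `classify_reachability` and `check_server` are hypothetical, and the "mismatch" branch is an extra case the tree does not cover.

```shell
# classify_reachability HOST_RC IP_RC
# Each argument is a ping exit status: 0 means the ping succeeded.
classify_reachability() {
    local host_rc=$1 ip_rc=$2
    if [ "$host_rc" -eq 0 ] && [ "$ip_rc" -eq 0 ]; then
        echo "reachable: the issue is probably on the client side"
    elif [ "$ip_rc" -eq 0 ]; then
        # IP answers but the name does not: suspect name resolution.
        echo "dns: check /etc/hosts, /etc/resolv.conf, /etc/nsswitch.conf"
    elif [ "$host_rc" -eq 0 ]; then
        # Not part of the tree above; added here as an assumption.
        echo "mismatch: hostname answers but the given IP does not, verify the IP"
    else
        echo "unreachable: check other hosts, console, interface, gateway, firewall, cabling"
    fi
}

# check_server HOSTNAME IP
# Runs both pings (2 packets, 2 second timeout) and classifies the result.
check_server() {
    local host_rc=0 ip_rc=0
    ping -c 2 -W 2 "$1" >/dev/null 2>&1 || host_rc=$?
    ping -c 2 -W 2 "$2" >/dev/null 2>&1 || ip_rc=$?
    classify_reachability "$host_rc" "$ip_rc"
}
```

A call such as `check_server web01.example.com 192.0.2.10` (both values are placeholders) prints which branch of the tree to follow next.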
Issue 2: Unable to connect to a website or an application
Approach / Solution:
├── Ping the server by Hostname and IP Address
│   ├── False: Follow the troubleshooting diagram above, "Server is not reachable or unable to connect"
│   ├── True: Check the service availability by using the telnet command with the port
│   │   ├── True: Service is running
│   │   ├── False: Service is not reachable or not running
│   │   │   ├── Check the service status using systemctl or other commands
│   │   │   ├── Check the firewall/SELinux
│   │   │   ├── Check the service logs
│   │   │   ├── Check the service configuration
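The port check in the tree above can also be scripted. The sketch below swaps telnet for bash's `/dev/tcp` redirection, which is an assumption: it needs a bash built with network redirection support and the coreutils `timeout` command. The function names `port_open` and `next_step` are hypothetical.

```shell
# port_open HOST PORT
# Succeeds if a TCP connection to HOST:PORT opens within 3 seconds.
# Uses bash's /dev/tcp pseudo-device in place of telnet.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# next_step PING_RC PORT_RC
# Maps the two check results (0 = success) to the branches of the tree.
next_step() {
    if [ "$1" -ne 0 ]; then
        echo "server unreachable: follow the Issue 1 diagram"
    elif [ "$2" -eq 0 ]; then
        echo "service is reachable"
    else
        echo "service down: check systemctl status, firewall/SELinux, logs, configuration"
    fi
}
```

For example, `port_open app.example.com 443; next_step 0 $?` (host and port are placeholders) reports whether to move on to `systemctl status <service>` and the service logs.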
Issue 3: Unable to ssh as root or any other user.
Approach / Solution:
├── Ping the server by Hostname and IP Address
│   ├── False: Follow the troubleshooting diagram above, "Server is not reachable or unable to connect"
│   ├── True: Check the service availability by using the telnet command with the port
│   │   ├── True: Service is running
│   │   │   ├── The issue might be on the client side
│   │   │   ├── The user might be disabled or have a no-login shell, root login might be disabled, or other configuration might block the login
│   │   ├── False: Service is not reachable or not running
│   │   │   ├── Check the service status using systemctl or other commands
│   │   │   ├── Check the firewall/SELinux
│   │   │   ├── Check the service logs
│   │   │   ├── Check the service configuration
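When the SSH service is running but a login still fails, the account and root-login checks from the tree above might look like the sketch below. The function names are hypothetical, and the file scan is a simplification: it reads a single file and ignores `Match` and `Include` directives, so `sshd -T` remains the more reliable way to see the effective configuration.

```shell
# login_shell_of USER [PASSWD_FILE]
# Prints the login shell (7th field) of USER; defaults to /etc/passwd.
login_shell_of() {
    awk -F: -v u="$1" '$1 == u { print $7 }' "${2:-/etc/passwd}"
}

# shell_blocks_login SHELL
# Succeeds if SHELL is one of the usual no-login shells.
shell_blocks_login() {
    case "$1" in
        */nologin|*/false) return 0 ;;
        *) return 1 ;;
    esac
}

# root_login_setting [SSHD_CONFIG]
# Prints the first uncommented PermitRootLogin value (sshd uses the first
# value it reads). Empty output means the built-in default applies.
root_login_setting() {
    awk 'tolower($1) == "permitrootlogin" { print $2; exit }' \
        "${1:-/etc/ssh/sshd_config}"
}
```

For instance, `shell_blocks_login "$(login_shell_of deploy)" && echo "deploy cannot log in"` flags a no-login account, where `deploy` is a placeholder user name.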
Issue 4: Disk space is full, or disk space needs to be added/extended
Approach / Solution:
├── Detect system performance degradation
│   ├── Application is getting slow/unresponsive
│   ├── Commands are not running (for example, because the / filesystem is full)
│   ├── Logging and other writes fail
├── Analyse the issue
│   ├── Use the df command to find the filesystem that is running out of space
├── Action
│   ├── After finding the specific filesystem, use the du command in that filesystem to see which files/directories are large
│   ├── Compress/remove big files
│   ├── Move the items to another partition/server
│   ├── Check the health status of the disks using the badblocks command (for example, # badblocks -v /dev/sda)
│   ├── Check which device is I/O bound using iostat (iotop or pidstat -d show per-process I/O)
│   ├── Create a link to the file/directory
├── New disk addition
│   ├── Simple partition
│   │   ├── Add the disk to the VM
│   │   ├── Check the new disk with the lsblk or fdisk -l command (df only lists mounted filesystems)
│   │   ├── Use fdisk to create the partition (an LVM partition is the better choice)
│   │   ├── Create the filesystem and mount it
│   │   ├── Add an fstab entry so the mount persists across reboots
│   ├── LVM partition
│   │   ├── Add the disk to the VM
│   │   ├── Check the new disk with the lsblk or fdisk -l command
│   │   ├── Use fdisk to create an LVM partition
│   │   ├── Create the PV, VG, and LV
│   │   ├── Create the filesystem and mount it
│   │   ├── Add an fstab entry so the mount persists across reboots
│   ├── Extend LVM partition
│   │   ├── Add the disk, and create an LVM partition
│   │   ├── Add the LVM partition (PV) to the existing VG
│   │   ├── Extend the LV and resize the filesystem
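Two parts of the tree above lend themselves to a short sketch: finding the full filesystem from `df` output, and the command sequence for extending an LVM volume. The function names and the 90 percent threshold are assumptions for illustration. `lvm_extend_plan` only prints the commands so they can be reviewed before anything is run as root.

```shell
# full_filesystems LIMIT
# Reads `df -P` output on stdin and prints "mountpoint use%" for every
# filesystem at or above LIMIT percent. Mount points containing spaces
# are not handled.
full_filesystems() {
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)
        if (use + 0 >= limit + 0) print $6, $5
    }'
}

# lvm_extend_plan PARTITION VG LV
# Prints (does not run) the usual steps for adding a new partition to an
# existing volume group and growing a logical volume into the free space.
lvm_extend_plan() {
    printf 'pvcreate %s\n' "$1"
    printf 'vgextend %s %s\n' "$2" "$1"
    # -r asks lvextend to resize the filesystem together with the LV.
    printf 'lvextend -r -l +100%%FREE /dev/%s/%s\n' "$2" "$3"
}
```

Typical use would be `df -P | full_filesystems 90`, followed by something like `du -xh --max-depth=1 /var | sort -rh | head` on the reported mount point (GNU `du` and `sort` assumed). `lvm_extend_plan /dev/sdb1 vg0 data` prints the three LVM commands for the placeholder names given.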
Issue 5: Filesystem corrupted
Approach / Solution:
├── A corrupted filesystem is one of the errors that can leave the system unable to boot
├── Check /var/log/messages, dmesg, and other log files
├── If the logs show bad sectors, run fsck
│   ├── True:
│   │   ├── Reboot the system into rescue mode by booting from the installation ISO (CD-ROM)
│   │   ├── Proceed with option 1, which mounts the original root filesystem under /mnt/sysimage
│   │   ├── Edit the fstab entries, or create a new fstab file with the help of blkid, and reboot
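The last step, rebuilding an fstab entry with the help of blkid, can be sketched as follows. It assumes the `KEY=value` output of `blkid -o export`; the function name, the `defaults` mount options, and the fsck pass number 2 are illustrative choices, not values taken from the steps above.

```shell
# fstab_line_from_blkid MOUNTPOINT
# Reads `blkid -o export DEVICE` output on stdin and prints one fstab line
# that refers to the filesystem by UUID.
fstab_line_from_blkid() {
    local mountpoint=$1 uuid="" fstype="" key value
    while IFS='=' read -r key value; do
        case "$key" in
            UUID) uuid=$value ;;
            TYPE) fstype=$value ;;
        esac
    done
    if [ -z "$uuid" ] || [ -z "$fstype" ]; then
        echo "missing UUID or TYPE in blkid output" >&2
        return 1
    fi
    # Pass number 2 suits non-root filesystems; the root filesystem uses 1.
    printf 'UUID=%s %s %s defaults 0 2\n' "$uuid" "$mountpoint" "$fstype"
}
```

In rescue mode this might be used as `blkid -o export /dev/sda2 | fstab_line_from_blkid /data >> /mnt/sysimage/etc/fstab`, where the device and mount point are placeholders. Reviewing the generated line before rebooting is worthwhile.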