I’m using the tips on this SO thread.
    ps -awux|grep df
    root 15826 0.0 0.0 0 0 ? I< May22 0:00 [cifs-dfscache]
    myuser 3086246 0.0 0.0 216860 3212 ? Ss 16:06 0:02 bash -c while [ -d /proc/$PPID ]; do sleep 1;head -v -n 8 /proc/meminfo; head -v -n 2 /proc/stat /proc/version /proc/uptime /proc/loadavg /proc/sys/fs/file-nr /proc/sys/kernel/hostname; tail -v -n 16 /proc/net/dev;echo '==> /proc/df <==';df;echo '==> /proc/who <==';who;echo '==> /proc/end <==';echo '##Moba##'; done
    myuser 3137650 0.0 0.0 215348 616 ? D 16:27 0:00 df

So I cat the process 3137650 stack:
    cat /proc/3137650/stack
    [<0>] autofs_wait+0x25b/0x723
    [<0>] autofs_mount_wait+0x49/0xf0
    [<0>] autofs_d_automount+0xdb/0x200
    [<0>] follow_managed+0x110/0x2c0
    [<0>] walk_component+0x1e9/0x2f0
    [<0>] path_lookupat+0x70/0x120
    [<0>] filename_lookup+0x97/0x180
    [<0>] user_statfs+0x33/0xa0
    [<0>] __do_sys_statfs+0x10/0x30
    [<0>] do_syscall_64+0x5b/0xf0
    [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

And the other process id 3086246:
    cat /proc/3086246/stack
    [<0>] do_wait+0x1b3/0x220
    [<0>] kernel_wait4+0x96/0x120
    [<0>] do_syscall_64+0x5b/0xf0
    [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Then:
    xargs -0 -n 1 echo < /proc/3137650/environ
    SHELL=/bin/bash
    MATHEMATICA_HOME=/usr/local/Wolfram/Mathematica/11.3
    JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk
    XDG_CONFIG_HOME=/home/myuser/.config
    SPARK_LOCAL_HOSTNAME=localhost
    LMOD_DIR=/usr/share/lmod/lmod/libexec
    PWD=/home/myuser
    LOGNAME=myuser
    XDG_SESSION_TYPE=tty
    MODULESHOME=/usr/share/lmod/lmod
    MANPATH=/usr/share/lmod/lmod/share/man:
    CUDA_INCLUDE_DIRS=/usr/include/cuda
    SPARK_MASTER_IP=127.0.0.1
    HOME=/home/myuser
    SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass
    LANG=en_US.UTF-8
    XDG_CONFIG_DIR=/home/myuser/.config
    LMOD_SETTARG_FULL_SUPPORT=no
    CUDA_INC_PATH=/usr/include/cuda
    LMOD_VERSION=8.2.10
    SSH_CONNECTION=x.x.x.x 51696 x.x.x.x 22
    MODULEPATH_ROOT=/usr/share/modulefiles
    XDG_SESSION_CLASS=user
    LMOD_PKG=/usr/share/lmod/lmod
    HADOOP_HOME=/usr/local/bin/hadoop-2.9.0
    GUROBI_HOME=/home/student/gurobi811/linux64/
    LESSOPEN=||/usr/bin/lesspipe.sh %s
    USER=kudyba
    LMOD_ROOT=/usr/share/lmod
    SHLVL=1
    BASH_ENV=/usr/share/lmod/lmod/init/bash
    LMOD_sys=Linux
    SPARK_HOME=/usr/local/bin/spark
    SPARK_LOCAL_IP=127.0.0.1
    XDG_SESSION_ID=7778
    LD_LIBRARY_PATH=:/home/student/gurobi811/linux64//lib
    XDG_RUNTIME_DIR=/run/user/6105
    SSH_CLIENT=x.x.x.x 51696 22
    PIG_INSTALL=/usr/local/bin/pig-0.17.0
    SPARK_EXAMPLES_JAR=/usr/local/bin/spark-2.1.1-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.1.1.jar
    KDEDIRS=/usr
    XDG_DATA_DIRS=/home/myuser/.local/share/flatpak/exports/share:/var/lib/flatpak/exports/share:/usr/local/share:/usr/share
    PATH=/usr/local/bin/anaconda3/bin:/home/users/mzilversmit/ncbi-blast-2.7.1+/bin:/usr/lib64/ccache:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/users/students/mchen177/gurobi811/linux64//bin:/usr/local/bin/spark/bin:/usr/local/bin/pig-0.17.0/bin:/opt/dell/srvadmin/bin:/usr/local/bin/spark/bin
    MODULEPATH=/etc/modulefiles:/usr/share/modulefiles:/usr/share/modulefiles/Linux:/usr/share/modulefiles/Core:/usr/share/lmod/lmod/modulefiles/Core
    SPARK_CLASSPATH=/usr/share/java/mysql-connector-java.jar
    DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/6105/bus
    LMOD_CMD=/usr/share/lmod/lmod/libexec/lmod
    BASH_FUNC_ml%%=() { eval $($LMOD_DIR/ml_cmd "$@") }
    BASH_FUNC_module%%=() { eval $($LMOD_CMD bash "$@") && eval $(${LMOD_SETTARG_CMD:-:} -s sh) }
    _=/usr/bin/df

and from the other process:
    xargs -0 -n 1 echo < /proc/3086246/environ
    USER=myuser
    LOGNAME=myuser
    HOME=/home/myuser
    PATH=/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin
    SHELL=/bin/bash
    XDG_SESSION_ID=7778
    XDG_RUNTIME_DIR=/run/user/xxxx
    DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/xxxx/bus
    XDG_SESSION_TYPE=tty
    XDG_SESSION_CLASS=user
    SSH_CLIENT=edited 51696 22
    SSH_CONNECTION=edited 51696 edited 22

And just to confirm the parent process id:
    ps --ppid 3086246
    PID TTY TIME CMD
    3137650 ? 00:00:00 df

Not sure if the echo ##Moba## is a clue, as I’m using Mobaxterm.
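Since the df stack above is parked in autofs_wait underneath a statfs call, it may help to know which mount points are managed by autofs. The following is only a sketch: `autofs_mounts` is a hypothetical helper name, and it assumes the usual /proc/mounts layout where the third field is the filesystem type.

```shell
# List autofs-managed mount points from a mounts table.
# autofs_mounts is a hypothetical helper, not an existing tool.
# Assumes the standard layout: device mountpoint fstype options dump pass
autofs_mounts() {
    # Default to the live table; a saved copy can be passed as $1.
    awk '$3 == "autofs" { print $2 }' "${1:-/proc/mounts}"
}

# Example use on the affected host:
#   autofs_mounts
```

Any path it prints is a candidate for the lookup that df is blocked on, though this alone does not prove which one is hanging.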
When I try a strace of the parent process it just waits at:
    strace -p 3086246
    strace: Process 3086246 attached
    wait4(-1,

However the child process returns:
    strace -p 3137650
    strace: attach: ptrace(PTRACE_SEIZE, 3137650): Operation not permitted

For good measure:
    pstree -pl 3137650
    df(3137650)

and:
    pstree -pl 3086246
    bash(3086246)───df(3137650)

and:
    ps -fp 3086246
    UID PID PPID C STIME TTY TIME CMD
    myuser 3086246 1 0 16:06 ? 00:00:02 bash -c while [ -d /proc/$PPID ]; do sleep 1;head -v -n 8 /proc/meminfo; head

So TTY is a '?', nothing from cronjobs that I could see.
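The PPID column above shows 1, which suggests the bash loop has been reparented to systemd after its original parent went away. A minimal sketch for checking this directly from /proc, assuming a Linux /proc/<pid>/status file with a `PPid:` line (`ppid_from_status` is a made-up helper name):

```shell
# Print the parent PID recorded in a /proc/<pid>/status file read from stdin.
# ppid_from_status is a hypothetical helper name.
ppid_from_status() {
    awk '$1 == "PPid:" { print $2 }'
}

# Example use:
#   ppid_from_status < /proc/3086246/status
```

Reading from stdin keeps the helper usable on a saved copy of the status file as well as on the live process.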
The two processes have different states, Ss for 3086246 and D for 3137650. There is a nice table of the state codes here:
    D    uninterruptible sleep (usually IO)
    S    interruptible sleep (waiting for an event to complete)
    s    is a session leader

I also tried a gdb on the 3086246 PID:
    Reading symbols from /usr/bin/bash...
    Reading symbols from /usr/lib/debug/usr/bin/bash-5.0.17-1.fc32.x86_64.debug...
    Reading symbols from /lib64/libtinfo.so.6...
    Reading symbols from /usr/lib/debug/usr/lib64/libtinfo.so.6.1-6.1-15.20191109.fc32.x86_64.debug...
    Reading symbols from /lib64/libdl.so.2...
    Reading symbols from /usr/lib/debug/usr/lib64/libdl-2.31.so.debug...
    Reading symbols from /lib64/libc.so.6...
    Reading symbols from /usr/lib/debug/usr/lib64/libc-2.31.so.debug...
    Reading symbols from /lib64/ld-linux-x86-64.so.2...
    Reading symbols from /usr/lib/debug/usr/lib64/ld-2.31.so.debug...
    0x00007f8e059ccf3a in __GI___wait4 (pid=pid@entry=-1, stat_loc=stat_loc@entry=0x7fffa494ff10, options=options@entry=0, usage=usage@entry=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:27
    27 return SYSCALL_CANCEL (wait4, pid, stat_loc, options, usage);

Any thoughts or other debugging commands to try?
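One more check that could be scripted is listing every process currently in the D state together with its kernel wait channel, rather than grepping for df by name. This is a sketch only; `dstate_procs` is a hypothetical name, and it assumes the procps `ps -eo pid,stat,wchan:32,args` column layout with the state in the second column.

```shell
# Keep only the rows whose STAT column (second field) starts with D,
# i.e. processes in uninterruptible sleep.
# dstate_procs is a hypothetical helper name.
dstate_procs() {
    awk '$2 ~ /^D/'
}

# Example use:
#   ps -eo pid,stat,wchan:32,args | dstate_procs
```

The header row is dropped automatically because its second field is STAT, which does not start with D.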
Edit: added systemctl status of PIDs per @michael-hampton suggestion:
    systemctl status 3086246 -l --no-pager
    ● session-7778.scope - Session 7778 of user myuser
         Loaded: loaded (/run/systemd/transient/session-7778.scope; transient)
      Transient: yes
         Active: active (abandoned) since Mon 2020-06-29 13:03:32 EDT; 11h ago
          Tasks: 2
         Memory: 316.9M
            CPU: 11min 46.751s
         CGroup: /user.slice/user-6105.slice/session-7778.scope
                 ├─3086246 bash -c while [ -d /proc/$PPID ]; do sleep 1;head -v -n 8 /proc/meminfo; head -v -n 2 /proc/stat /proc/version /proc/uptime /proc/loadavg /proc/sys/fs/file-nr /proc/sys/kernel/hostname; tail -v -n 16 /proc/net/dev;echo '==> /proc/df <==';df;echo '==> /proc/who <==';who;echo '==> /proc/end <==';echo '##Moba##'; done
                 └─3137650 df

    Jun 29 16:26:21 ourserver dracut[3126800]: lrwxrwxrwx 1 root root 20 May 29 14:35 usr/share/unimaps -> /usr/lib/kbd/unimaps
    Jun 29 16:26:21 ourserver dracut[3126800]: drwxr-xr-x 3 root root 0 May 29 14:35 var
    Jun 29 16:26:21 ourserver dracut[3126800]: lrwxrwxrwx 1 root root 11 May 29 14:35 var/lock -> ../run/lock
    Jun 29 16:26:21 ourserver dracut[3126800]: lrwxrwxrwx 1 root root 6 May 29 14:35 var/run -> ../run
    Jun 29 16:26:21 ourserver dracut[3126800]: drwxr-xr-x 2 root root 0 May 29 14:35 var/tmp
    Jun 29 16:26:21 ourserver dracut[3126800]: ========================================================================
    Jun 29 16:26:21 ourserver dracut[3126800]: *** Creating initramfs image file '/boot/initramfs-5.6.19-300.fc32.x86_64.tmp' done ***
    Jun 29 16:27:21 ourserver systemd-tmpfiles[3137785]: /usr/lib/tmpfiles.d/lxdm.conf:1: Line references path below legacy directory /var/run/, updating /var/run/lxdm → /run/lxdm; please update the tmpfiles.d/ drop-in file accordingly.
    Jun 29 16:48:05 ourserver su[2983955]: pam_unix(su:session): session closed for user root
    Jun 29 16:48:07 ourserver sshd[2983731]: pam_unix(sshd:session): session closed for user myuser

and:
    systemctl status 3137650 -l --no-pager
    ● session-7778.scope - Session 7778 of user myuser
         Loaded: loaded (/run/systemd/transient/session-7778.scope; transient)
      Transient: yes
         Active: active (abandoned) since Mon 2020-06-29 13:03:32 EDT; 11h ago
          Tasks: 2
         Memory: 316.9M
            CPU: 11min 46.751s
         CGroup: /user.slice/user-6105.slice/session-7778.scope
                 ├─3086246 bash -c while [ -d /proc/$PPID ]; do sleep 1;head -v -n 8 /proc/meminfo; head -v -n 2 /proc/stat /proc/version /proc/uptime /proc/loadavg /proc/sys/fs/file-nr /proc/sys/kernel/hostname; tail -v -n 16 /proc/net/dev;echo '==> /proc/df <==';df;echo '==> /proc/who <==';who;echo '==> /proc/end <==';echo '##Moba##'; done
                 └─3137650 df

    Jun 29 16:26:21 ourserver dracut[3126800]: lrwxrwxrwx 1 root root 20 May 29 14:35 usr/share/unimaps -> /usr/lib/kbd/unimaps
    Jun 29 16:26:21 ourserver dracut[3126800]: drwxr-xr-x 3 root root 0 May 29 14:35 var
    Jun 29 16:26:21 ourserver dracut[3126800]: lrwxrwxrwx 1 root root 11 May 29 14:35 var/lock -> ../run/lock
    Jun 29 16:26:21 ourserver dracut[3126800]: lrwxrwxrwx 1 root root 6 May 29 14:35 var/run -> ../run
    Jun 29 16:26:21 ourserver dracut[3126800]: drwxr-xr-x 2 root root 0 May 29 14:35 var/tmp
    Jun 29 16:26:21 ourserver dracut[3126800]: ========================================================================
    Jun 29 16:26:21 ourserver dracut[3126800]: *** Creating initramfs image file '/boot/initramfs-5.6.19-300.fc32.x86_64.tmp' done ***
    Jun 29 16:27:21 ourserver systemd-tmpfiles[3137785]: /usr/lib/tmpfiles.d/lxdm.conf:1: Line references path below legacy directory /var/run/, updating /var/run/lxdm → /run/lxdm; please update the tmpfiles.d/ drop-in file accordingly.
    Jun 29 16:48:05 ourserver su[2983955]: pam_unix(su:session): session closed for user root
    Jun 29 16:48:07 ourserver sshd[2983731]: pam_unix(sshd:session): session closed for user myuser

I'm starting to think it might be this bug except that ssh is not slow. Running:
systemctl | grep "abandoned" | grep -e "-[[:digit:]]"
returned several abandoned ssh sessions. Running:
systemctl | grep "abandoned" | grep -e "-[[:digit:]]" | sed "s/.scope.*/.scope/" | xargs systemctl stop
removes all of the abandoned sessions and the 'df' command disappears from 'ps'.
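Because the pipeline above stops units straight away, a dry run that only prints the matching scope names first might be safer. Here is a sketch, assuming the `systemctl list-units --type=scope --no-legend --plain` column order (UNIT LOAD ACTIVE SUB DESCRIPTION); `abandoned_session_scopes` is a hypothetical helper name.

```shell
# Print session-<N>.scope units whose SUB state (fourth column) is "abandoned".
# abandoned_session_scopes is a hypothetical helper name; it only reads stdin
# and never stops anything itself.
abandoned_session_scopes() {
    awk '$4 == "abandoned" && $1 ~ /^session-[0-9]+\.scope$/ { print $1 }'
}

# Dry run, just list the candidates:
#   systemctl list-units --type=scope --no-legend --plain | abandoned_session_scopes
# After reviewing the list, stop them (-r is the GNU xargs "skip if empty" flag):
#   systemctl list-units --type=scope --no-legend --plain \
#     | abandoned_session_scopes | xargs -r systemctl stop
```

Matching on the exact column rather than grepping the whole line avoids catching a unit whose description merely contains the word "abandoned".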

pstree -ps <pid> to see the process's parent(s).

    pstree -ps 3086246
    systemd(1)───bash(3086246)───df(3137650)

    pstree -ps 3137650
    systemd(1)───bash(3086246)───df(3137650)

systemctl status <pid>