Chapter 5. Useful SystemTap Scripts

5.1. Network
5.1.1. Network Profiling
5.1.2. Tracing Functions Called in Network Socket Code
5.1.3. Monitoring Incoming TCP Connections
5.1.4. Monitoring TCP Packets
5.1.5. Monitoring Network Packets Drops in Kernel
5.2. Disk
5.2.1. Summarizing Disk Read/Write Traffic
5.2.2. Tracking I/O Time For Each File Read or Write
5.2.3. Track Cumulative IO
5.2.4. I/O Monitoring (By Device)
5.2.5. Monitoring Reads and Writes to a File
5.2.6. Monitoring Changes to File Attributes
5.2.7. Periodically Print I/O Block Time
5.3. Profiling
5.3.1. Counting Function Calls Made
5.3.2. Call Graph Tracing
5.3.3. Determining Time Spent in Kernel and User Space
5.3.4. Monitoring Polling Applications
5.3.5. Tracking Most Frequently Used System Calls
5.3.6. Tracking System Call Volume Per Process
5.4. Identifying Contended User-Space Locks
This chapter enumerates several SystemTap scripts you can use to monitor and investigate different subsystems. All of these scripts are available at /usr/share/systemtap/testsuite/systemtap.examples/ once you install the systemtap-testsuite RPM.

5.1. Network

The following sections showcase scripts that trace network-related functions and build a profile of network activity.

5.1.1. Network Profiling

This section describes how to profile network activity. nettop.stp provides a glimpse into how much network traffic each process is generating on a machine.
nettop.stp
#! /usr/bin/env stapglobal ifxmit, ifrecvglobal ifmergedprobe netdev.transmit{ ifxmit[pid(), dev_name, execname(), uid()] <<< length ifmerged[pid(), dev_name, execname(), uid()] <<< 1}probe netdev.receive{ ifrecv[pid(), dev_name, execname(), uid()] <<< length ifmerged[pid(), dev_name, execname(), uid()] <<< 1}function print_activity(){ printf("%5s %5s %-12s %7s %7s %7s %7s %-15s\n", "PID", "UID", "DEV", "XMIT_PK", "RECV_PK", "XMIT_KB", "RECV_KB", "COMMAND") foreach ([pid, dev, exec, uid] in ifmerged-) { n_xmit = @count(ifxmit[pid, dev, exec, uid]) n_recv = @count(ifrecv[pid, dev, exec, uid]) printf("%5d %5d %-12s %7d %7d %7d %7d %-15s\n", pid, uid, dev, n_xmit, n_recv, @sum(ifxmit[pid, dev, exec, uid])/1024, @sum(ifrecv[pid, dev, exec, uid])/1024, exec) } print("\n") delete ifxmit delete ifrecv delete ifmerged}probe timer.ms(5000), end, error{ print_activity()}
Note that function print_activity() uses the followingexpressions:
n_xmit ? @sum(ifxmit[pid, dev, exec, uid])/1024 : 0n_recv ? @sum(ifrecv[pid, dev, exec, uid])/1024 : 0
These expressions are if/else conditionals.The first statement is a more concise way of writing the followingpsuedo code:
if n_recv != 0 then @sum(ifrecv[pid, dev, exec, uid])/1024else 0 
nettop.stp tracks which processes are generating network traffic on the system, and provides the following information about each process:
  • PID — the ID of the listed process.
  • UID — user ID. A user ID of 0 refers to the root user.
  • DEV — which ethernet device the process used to send / receive data (for example, eth0, eth1)
  • XMIT_PK — number of packets transmitted by the process
  • RECV_PK — number of packets received by the process
  • XMIT_KB — amount of data sent by the process, in kilobytes
  • RECV_KB — amount of data received by the service, in kilobytes
nettop.stp provides network profile sampling every 5 seconds. You can change this setting by editing probe timer.ms(5000) accordingly. Example 5.1, “nettop.stp Sample Output” contains an excerpt of the output from nettop.stp over a 20-second period:

Example 5.1. nettop.stp Sample Output

[...] PID UID DEV XMIT_PK RECV_PK XMIT_KB RECV_KB COMMAND 0 0 eth0 0 5 0 0 swapper11178 0 eth0 2 0 0 0 synergyc PID UID DEV XMIT_PK RECV_PK XMIT_KB RECV_KB COMMAND 2886 4 eth0 79 0 5 0 cups-polld11362 0 eth0 0 61 0 5 firefox 0 0 eth0 3 32 0 3 swapper 2886 4 lo 4 4 0 0 cups-polld11178 0 eth0 3 0 0 0 synergyc PID UID DEV XMIT_PK RECV_PK XMIT_KB RECV_KB COMMAND 0 0 eth0 0 6 0 0 swapper 2886 4 lo 2 2 0 0 cups-polld11178 0 eth0 3 0 0 0 synergyc 3611 0 eth0 0 1 0 0 Xorg PID UID DEV XMIT_PK RECV_PK XMIT_KB RECV_KB COMMAND 0 0 eth0 3 42 0 2 swapper11178 0 eth0 43 1 3 0 synergyc11362 0 eth0 0 7 0 0 firefox 3897 0 eth0 0 1 0 0 multiload-apple[...]

5.1.2. Tracing Functions Called in Network Socket Code

This section describes how to trace functions called from the kernel's net/socket.c file. This task helps you identify, in finer detail, how each process interacts with the network at the kernel level.
socket-trace.stp
#! /usr/bin/env stapprobe kernel.function("*@net/socket.c").call { printf ("%s -> %s\n", thread_indent(1), ppfunc())}probe kernel.function("*@net/socket.c").return { printf ("%s <- %s\n", thread_indent(-1), ppfunc())}
socket-trace.stp is identical to Example 3.6, “thread_indent.stp”, which was earlier used in SystemTap Functions to illustrate how thread_indent() works.

Example 5.2. socket-trace.stp Sample Output

[...]0 Xorg(3611): -> sock_poll3 Xorg(3611): <- sock_poll0 Xorg(3611): -> sock_poll3 Xorg(3611): <- sock_poll0 gnome-terminal(11106): -> sock_poll5 gnome-terminal(11106): <- sock_poll0 scim-bridge(3883): -> sock_poll3 scim-bridge(3883): <- sock_poll0 scim-bridge(3883): -> sys_socketcall4 scim-bridge(3883): -> sys_recv8 scim-bridge(3883): -> sys_recvfrom12 scim-bridge(3883):-> sock_from_file16 scim-bridge(3883):<- sock_from_file20 scim-bridge(3883):-> sock_recvmsg24 scim-bridge(3883):<- sock_recvmsg28 scim-bridge(3883): <- sys_recvfrom31 scim-bridge(3883): <- sys_recv35 scim-bridge(3883): <- sys_socketcall[...]
Example 5.2, “socket-trace.stp Sample Output” contains a 3-second excerpt of the output for socket-trace.stp. For more information about the output of this script as provided by thread_indent(), refer to SystemTap Functions Example 3.6, “thread_indent.stp”.

5.1.3. Monitoring Incoming TCP Connections

This section illustrates how to monitor incoming TCP connections. This task is useful in identifying any unauthorized, suspicious, or otherwise unwanted network access requests in real time.
tcp_connections.stp
#! /usr/bin/env stapprobe begin { printf("%6s %16s %6s %6s %16s\n", "UID", "CMD", "PID", "PORT", "IP_SOURCE")}probe kernel.{function("tcp_accept"),function("inet_csk_accept")}.return? { sock = $return if (sock != 0) printf("%6d %16s %6d %6d %16s\n", uid(), execname(), pid(), inet_get_local_port(sock), inet_get_ip_source(sock))}
While tcp_connections.stp is running, it will print out the following information about any incoming TCP connections accepted by the system in real time:
  • Current UID
  • CMD - the command accepting the connection
  • PID of the command
  • Port used by the connection
  • IP address from which the TCP connection originated

Example 5.3. tcp_connections.stp Sample Output

UID CMD PID PORT IP_SOURCE0 sshd 3165 22 10.64.0.2270 sshd 3165 22 10.64.0.227

5.1.4. Monitoring TCP Packets

This section illustrates how to monitor TCP packets received by the system. This is useful in analyzing network traffic generated by applications running on the system.
tcpdumplike.stp
#! /usr/bin/env stap// A TCP dump like exampleprobe begin, timer.s(1) { printf("-----------------------------------------------------------------\n") printf(" Source IP Dest IP SPort DPort U A P R S F \n") printf("-----------------------------------------------------------------\n")}probe udp.recvmsg /* ,udp.sendmsg */ { printf(" %15s %15s %5d %5d UDP\n", saddr, daddr, sport, dport)}probe tcp.receive { printf(" %15s %15s %5d %5d %d %d %d %d %d %d\n", saddr, daddr, sport, dport, urg, ack, psh, rst, syn, fin)}
While tcpdumplike.stp is running, it will print out the following information about any received TCP packets in real time:
  • Source and destination IP address (saddr,daddr, respectively)
  • Source and destination ports (sport, dport, respectively)
  • Packet flags
To determine the flags used by the packet, tcpdumplike.stp uses the following functions:
  • urg - urgent
  • ack - acknowledgement
  • psh - push
  • rst - reset
  • syn - synchronize
  • fin - finished
The aforementioned functions return 1 or 0 to specify whether the packet uses the corresponding flag.

Example 5.4. tcpdumplike.stp Sample Output

----------------------------------------------------------------- Source IP Dest IP SPort DPort U A P R S F----------------------------------------------------------------- 209.85.229.147 10.0.2.15 80 20373 0 1 1 0 0 0 92.122.126.240 10.0.2.15 80 53214 0 1 0 0 1 0 92.122.126.240 10.0.2.15 80 53214 0 1 0 0 0 0 209.85.229.118 10.0.2.15 80 63433 0 1 0 0 1 0 209.85.229.118 10.0.2.15 80 63433 0 1 0 0 0 0 209.85.229.147 10.0.2.15 80 21141 0 1 1 0 0 0 209.85.229.147 10.0.2.15 80 21141 0 1 1 0 0 0 209.85.229.147 10.0.2.15 80 21141 0 1 1 0 0 0 209.85.229.147 10.0.2.15 80 21141 0 1 1 0 0 0 209.85.229.147 10.0.2.15 80 21141 0 1 1 0 0 0 209.85.229.118 10.0.2.15 80 63433 0 1 1 0 0 0[...]

5.1.5. Monitoring Network Packets Drops in Kernel

The network stack in Linuxcan discard packets for various reasons. Some Linux kernels include atracepoint, kernel.trace("kfree_skb"), which easily tracks where packets are discarded. dropwatch.stp uses kernel.trace("kfree_skb") to tracepacket discards; the script summarizes which locationsdiscard packets every five-second interval.
dropwatch.stp
#! /usr/bin/env stap############################################################# Dropwatch.stp# Author: Neil Horman <nhorman@redhat.com># An example script to mimic the behavior of the dropwatch utility# http://fedorahosted.org/dropwatch############################################################# Array to hold the list of drop points we findglobal locations# Note when we turn the monitor on and offprobe begin { printf("Monitoring for dropped packets\n") }probe end { printf("Stopping dropped packet monitor\n") }# increment a drop counter for every location we drop atprobe kernel.trace("kfree_skb") { locations[$location] <<< 1 }# Every 5 seconds report our drop locationsprobe timer.sec(5){ printf("\n%s\n", ctime(gettimeofday_s())) foreach (l in locations-) { printf("%d packets dropped at %s\n", @count(locations[l]), symdata(l)) } delete locations}
The kernel.trace("kfree_skb") traces which placesin the kernel drop network packets.The kernel.trace("kfree_skb") has two arguments:a pointer to the buffer being freed ($skb)and the location in kernel code thebuffer is being freed ($location).The dropwatch.stp script provides the function containing$location where possible.The information to map $location back to thefunction is not in the instrumentation by default.On SystemTap 1.4 the --all-modules option will include therequired mapping information and the following command can be used to runthe script:
stap --all-modules dropwatch.stp
On older versions of SystemTap you can use the following command to emulate the--all-modules option:
stap -dkernel \`cat /proc/modules | awk 'BEGIN { ORS = " " } {print "-d"$1}'` \dropwatch.stp
Running the dropwatch.stp script 15 seconds would result in output similar inExample 5.5, “dropwatch.stp Sample Output”. The output lists the number of misses foreach tracepoint location with either the function name or the address.

Example 5.5. dropwatch.stp Sample Output

Monitoring for dropped packetsTue Nov 17 00:26:51 20201762 packets dropped at unix_stream_recvmsg4 packets dropped at tun_do_read2 packets dropped at nf_hook_slowTue Nov 17 00:26:56 2020467 packets dropped at unix_stream_recvmsg20 packets dropped at nf_hook_slow6 packets dropped at tun_do_readTue Nov 17 00:27:01 2020446 packets dropped at unix_stream_recvmsg4 packets dropped at tun_do_read4 packets dropped at nf_hook_slowStopping dropped packet monitor
When the script is being compiled on one machine and run onanother the --all-modules and/proc/modules directory are not available; thesymname function will just print out the raw address.To make the raw address of packet drops more meaningful, refer to the/boot/System.map-`uname -r` file. This file lists thestarting addresses for each function, allowing you to map theaddresses in the output of Example 5.5, “dropwatch.stp Sample Output” to a specificfunction name.Given the following snippet of the/boot/System.map-`uname -r` file,the address 0xffffffff8149a8ed maps to the functionunix_stream_recvmsg:
[...]ffffffff8149a420 t unix_dgram_pollffffffff8149a5e0 t unix_stream_recvmsgffffffff8149ad00 t unix_find_other[...]