52

I have a process (dbus-daemon) which has many open connections over UNIX sockets. One of these connections is fd #36:

    =$ ps uw -p 23284
    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    depesz 23284 0.0 0.0 24680 1772 ? Ss 15:25 0:00 /bin/dbus-daemon --fork --print-pid 5 --print-address 7 --session

    =$ ls -l /proc/23284/fd/36
    lrwx------ 1 depesz depesz 64 2011-03-28 15:32 /proc/23284/fd/36 -> socket:[1013410]

    =$ netstat -nxp | grep 1013410
    (Not all processes could be identified, non-owned process info
     will not be shown, you would have to be root to see it all.)
    unix 3 [ ] STREAM CONNECTED 1013410 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD

    =$ netstat -nxp | grep dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1013953 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1013825 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1013726 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1013471 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1013410 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1012325 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1012302 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1012289 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1012151 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1011957 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1011937 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1011900 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1011775 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1011771 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1011769 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1011766 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1011663 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1011635 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1011627 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1011540 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1011480 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1011349 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1011312 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1011284 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1011250 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1011231 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1011155 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1011061 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1011049 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1011035 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1011013 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1010961 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
    unix 3 [ ] STREAM CONNECTED 1010945 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD

Based on the number of connections, I assume that dbus-daemon is actually the server, which is OK. But how can I find which process is connected to it through the connection that is file descriptor 36 in dbus-daemon? I tried lsof and even greps on /proc/net/unix, but I can't figure out a way to find the client process.
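The fd-to-inode step from the ls -l output above can be scripted. This is only a minimal Python sketch, assuming a Linux /proc layout; the helper names parse_socket_link and socket_inode_of_fd are made up for illustration and are not part of any tool mentioned in this thread.

```python
import os
import re

# Link targets of socket descriptors look like "socket:[1013410]".
SOCKET_LINK = re.compile(r"^socket:\[(\d+)\]$")


def parse_socket_link(target):
    """Return the inode from a link target like 'socket:[1013410]', else None."""
    match = SOCKET_LINK.match(target)
    return int(match.group(1)) if match else None


def socket_inode_of_fd(pid, fd):
    """Read /proc/<pid>/fd/<fd> (Linux only) and return its socket inode, if any.

    Reading another user's process usually requires root.
    """
    return parse_socket_link(os.readlink(f"/proc/{pid}/fd/{fd}"))
```

With the values from the question, socket_inode_of_fd(23284, 36) would be the call to make; the returned inode is what the netstat and ss listings are keyed on.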

1

7 Answers

28

Quite recently I stumbled upon a similar problem. I was shocked to find out that there are cases when this might not be possible. I dug up a comment from the creator of lsof (Vic Abell) where he pointed out that this depends heavily on the unix socket implementation. Sometimes so-called "endpoint" information for a socket is available and sometimes not. Unfortunately, as he points out, it is impossible on Linux.

On Linux, for example, where lsof must use /proc/net/unix, all UNIX domain sockets have a bound path, but no endpoint information. Often there is no bound path. That often makes it impossible to determine the other endpoint, but it is a result of the Linux /proc file system implementation.

If you look at /proc/net/unix you can see for yourself that (at least on my system) he is absolutely right. I'm still shocked, because I find such a feature essential when tracking down server problems.
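To make the point concrete, here is a small Python sketch that parses the text of /proc/net/unix. It assumes the usual column layout (Num, RefCount, Protocol, Flags, Type, St, Inode, Path); the function name and the sample rows are illustrative, not captured from a real system. Note that no column names the peer.

```python
def parse_proc_net_unix(text):
    """Parse the text of /proc/net/unix into a list of dicts."""
    rows = []
    for line in text.splitlines()[1:]:  # skip the header line
        fields = line.split(None, 7)    # Path is optional and comes last
        if len(fields) < 7:
            continue
        rows.append({
            "address": fields[0].rstrip(":"),
            "refcount": int(fields[1], 16),
            "type": int(fields[4], 16),   # 1 = SOCK_STREAM, 2 = SOCK_DGRAM
            "state": int(fields[5], 16),
            "inode": int(fields[6]),
            "path": fields[7].strip() if len(fields) > 7 else None,
        })
    return rows


# Illustrative sample only; the second row is an unnamed client socket.
SAMPLE = """\
Num       RefCount Protocol Flags    Type St Inode Path
0000000000000000: 00000003 00000000 00000000 0001 03 1013410 @/tmp/dbus-3XDU4PYEzD
0000000000000000: 00000003 00000000 00000000 0001 03 1013411
"""

for row in parse_proc_net_unix(SAMPLE):
    print(row["inode"], row["path"])
```

On a real system you would pass open("/proc/net/unix").read() instead of SAMPLE.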

3
  • Reference: groups.google.com/forum/#!topic/comp.unix.admin/iZLsq5dHdyI Commented Apr 24, 2014 at 8:04
  • Note that /proc/net/unix WILL tell you the target file of a random domain socket reference you've dug out of /proc/.../fd/. Commented Jan 15, 2017 at 11:50
  • to add a bit of technical detail: STREAM client unix domain sockets do not require binding to a path to be able to receive responses (unlike datagrams in the unix domain). They're referred to as 'nameless' in this case. Commented Oct 15, 2022 at 21:02
30

This answer is for Linux only.

Update for Linux 3.3: As Zulakis wrote in a separate answer (+1 that), you can use ss from iproute2 to get a pair of inode numbers for each socket connection identifying local end and peer. This appears to be based on the same machinery as sock_diag(7) with the UNIX_DIAG_PEER attribute identifying the peer. An answer by Totor over at Unix & Linux Stack Exchange links to the relevant commits in kernel and iproute2 and also mentions the need for the UNIX_DIAG kernel config setting.

Original answer for Linux pre 3.3 follows.

Based on an answer from the Unix & Linux Stack Exchange, I successfully identified the other end of a unix domain socket using in-kernel data structures, accessed using gdb and /proc/kcore. You need to enable the CONFIG_DEBUG_INFO and CONFIG_PROC_KCORE kernel options.

You can use lsof to get the kernel address of the socket, which takes the form of a pointer, e.g. 0xffff8803e256d9c0. That number is actually the address of the relevant in-kernel memory structure of type struct unix_sock. That structure has a field called peer which points at the other end of the socket. So the commands

    # gdb /usr/src/linux/vmlinux /proc/kcore
    (gdb) p ((struct unix_sock*)0xffff8803e256d9c0)->peer

will print the address of the other end of the connection. You can grep the output of lsof -U for that number to identify the process and file descriptor number of that other end.

Some distributions seem to provide kernel debug symbols as a separate package, which would take the place of the vmlinux file in the above command.

2
  • This looks interesting, but requirement to recompile kernel seems to be an overkill. I'm thinking that perhaps it would be possible to do it, without hand-made kernel, and without using gdb, just by peeking at values in kcore and doing some "manual" decoding of values. Commented Aug 16, 2012 at 9:24
  • 3
    @depesz, all you need to know is the offset of the peer member in the unix_sock structure. On my x86_64 system, that offset is 656 bytes, so I could obtain that other end using p ((void**)0xffff8803e256d9c0)[0x52]. You still need CONFIG_PROC_KCORE, obviously. Commented Sep 2, 2012 at 9:11
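The index arithmetic in the comment above (656 bytes, 8-byte pointers, index 0x52) can be checked with a few lines of Python. This is just a worked example; the helper names are invented, and the offset is the one reported for that particular x86_64 kernel, so it will differ on other builds.

```python
POINTER_SIZE = 8  # bytes per pointer on x86_64 (assumption from the comment)


def pointer_index(byte_offset, pointer_size=POINTER_SIZE):
    """Convert a byte offset into an index usable with (void**) in gdb."""
    if byte_offset % pointer_size:
        raise ValueError("offset is not pointer-aligned")
    return byte_offset // pointer_size


def gdb_peer_expression(sock_address, byte_offset):
    """Build the gdb expression that reads the peer pointer."""
    index = pointer_index(byte_offset)
    return f"p ((void**){sock_address:#x})[{index:#x}]"


# 656 / 8 = 82 = 0x52
print(gdb_peer_expression(0xffff8803e256d9c0, 656))
```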
20

Actually, ss from iproute2 (replacement for netstat, ifconfig, etc.) can show this information.

Here is an example showing an ssh-agent unix domain socket to which a ssh process has connected:

    $ sudo ss -a --unix -p
    Netid State Recv-Q Send-Q Local Address:Port                       Peer Address:Port
    u_str ESTAB 0      0      /tmp/ssh-XxnMh2MdLBxo/agent.27402 651026 * 651642          users:(("ssh-agent",pid=27403,fd=4))
    u_str ESTAB 0      0      * 651642                                 * 651026          users:(("ssh",pid=2019,fd=4))
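Matching the two inode columns by hand gets tedious with many sockets. Here is a rough Python sketch that pairs them up from ss text output. It assumes whitespace-separated columns as in the example above and will misparse socket paths or process names containing spaces; parse_ss_unix and peer_pids are hypothetical helper names.

```python
import re


def parse_ss_unix(text):
    """Map local inode -> (peer inode, process column) from `ss -x -p` output."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        # Netid State Recv-Q Send-Q LocalAddr LocalInode PeerAddr PeerInode [Process]
        if len(fields) < 8 or not fields[0].startswith("u_"):
            continue
        if not (fields[5].isdigit() and fields[7].isdigit()):
            continue
        process = fields[8] if len(fields) > 8 else ""
        table[int(fields[5])] = (int(fields[7]), process)
    return table


def peer_pids(table, inode):
    """Return the PIDs holding the socket connected to `inode`."""
    peer_inode, _ = table[inode]
    _, process = table.get(peer_inode, (None, ""))
    return [int(pid) for pid in re.findall(r"pid=(\d+)", process)]


# The two rows from the ssh-agent example.
SAMPLE = """\
u_str ESTAB 0 0 /tmp/ssh-XxnMh2MdLBxo/agent.27402 651026 * 651642 users:(("ssh-agent",pid=27403,fd=4))
u_str ESTAB 0 0 * 651642 * 651026 users:(("ssh",pid=2019,fd=4))
"""

table = parse_ss_unix(SAMPLE)
print(peer_pids(table, 651026))
```

For the question's case you would feed it the output of ss -x -p (run as root to see all processes) and look up inode 1013410.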
1
  • Hmm. Interesting... I had missed that the "Address:Port" columns can be matched, even though the "Peer" column is totally useless for unix domain sockets. Commented Oct 26, 2016 at 23:57
10

Unix sockets usually are assigned numbers in pairs, and are usually consecutive. So the pair for you would likely be 1013410+/-1. See which of those two exists and guess at the culprit.
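A tiny Python sketch of that guess, for what it's worth. It only lists candidates; nothing here confirms that a neighbouring inode really is the peer. The function name and the example inode set are made up.

```python
def guess_peer_inodes(inode, known_inodes):
    """Return the neighbours of `inode` (inode-1, inode+1) that are known sockets.

    This is a heuristic: a hit is a candidate, not a confirmed peer.
    """
    return [n for n in (inode - 1, inode + 1) if n in known_inodes]


# Hypothetical set of socket inodes collected from /proc/net/unix.
known = {1013409, 1013410, 1013471}
print(guess_peer_inodes(1013410, known))
```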

10

I wrote a tool which uses MvG's gdb method to reliably get socket peer information, kernel debug symbols not needed.

To get the process connected to a given socket, pass it the inode number:

    # socket_peer 1013410
    3703 thunderbird

To find out for all processes at once, use netstat_unix; it adds a column to netstat's output:

    # netstat_unix
    Proto RefCnt Flags Type      State     I-Node PID/Program name    Peer PID/Program name Path
    unix  3      [ ]   STREAM    CONNECTED 6825   982/Xorg            1497/compiz           /tmp/.X11-unix/X0
    unix  3      [ ]   STREAM    CONNECTED 6824   1497/compiz         982/Xorg
    unix  3      [ ]   SEQPACKET CONNECTED 207142 3770/chromium-brows 17783/UMA-Session-R
    unix  3      [ ]   STREAM    CONNECTED 204903 1523/pulseaudio     3703/thunderbird
    unix  3      [ ]   STREAM    CONNECTED 204902 3703/thunderbird    1523/pulseaudio
    unix  3      [ ]   STREAM    CONNECTED 204666 1523/pulseaudio     3703/thunderbird
    ...

Try netstat_unix --dump if you need output that's easy to parse.
See https://github.com/lemonsqueeze/unix_sockets_peers for details.

For info, the inode +1/-1 hack isn't reliable. It works most of the time but will fail or (worse) return the wrong socket if you're out of luck.

0
1

Edit your system.conf

You can add more settings to this file for debugging purposes.

File location: /etc/dbus-1/system.conf

For debugging purposes, you can edit your system.conf to allow eavesdropping:

  1. Replace the policy section with:

    <policy context="default">
      <!-- Allow everything to be sent -->
      <allow send_destination="*" eavesdrop="true"/>
      <!-- Allow everything to be received -->
      <allow eavesdrop="true"/>
      <!-- Allow anyone to own anything -->
      <allow own="*"/>
      <!-- XXX: Allow all users to connect -->
      <allow user="*"/>
    </policy>

  2. Remove the includedir line: system.d

    <includedir>system.d</includedir>

Source: http://old.nabble.com/dbus-send-error-td29893862.html


Some other useful stuff regarding unix sockets

The simplest way to figure out what's happening on the bus is to run the dbus-monitor program, which comes with the D-Bus package.

Also you can try to use dbus-cleanup-sockets to clean up leftover sockets.

The following command shows how many connections each process has to dbus sockets, based on netstat output:

sudo netstat -nap | grep dbus | grep CONNECTED | awk '{print $8}' | sort | uniq -c 

(tested on Ubuntu)
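If you'd rather post-process saved netstat output, the same counting can be sketched in Python. This mirrors the pipeline above under the assumption that the flags column is printed as "[ ]", so that the PID/Program field is the 8th whitespace-separated token, just as awk's $8; count_connections is a made-up name.

```python
from collections import Counter


def count_connections(netstat_text, needle="dbus"):
    """Count CONNECTED lines mentioning `needle`, per 'PID/Program' column."""
    counts = Counter()
    for line in netstat_text.splitlines():
        if needle not in line or "CONNECTED" not in line:
            continue
        fields = line.split()
        if len(fields) >= 8:
            counts[fields[7]] += 1  # same column as awk's $8
    return counts


# Two CONNECTED rows taken from the question plus one illustrative LISTENING row.
SAMPLE = """\
unix 3 [ ] STREAM CONNECTED 1013953 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
unix 3 [ ] STREAM CONNECTED 1013825 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
unix 2 [ ACC ] STREAM LISTENING 1010000 23284/dbus-daemon @/tmp/dbus-3XDU4PYEzD
"""

for program, count in count_connections(SAMPLE).most_common():
    print(count, program)
```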

Hardcore way: run from inside /proc, this command finds the processes by hand and shows which ones use the most connections (all types of sockets):

ls -lR */fd/* | grep socket | sed -r "s@([0-9{1}]+)/fd/@_\1_@g" | awk -F_ '{print $2}' | uniq -c | sort -n | awk '{print $1" "$2; print system("ps "$2"|tail -n1")}' 

Example output:

(count, PID and the next line contains details about the process)

    25 3732
    3732 ? Ss 0:38 /usr/bin/wineserver
    89 1970
    1970 ? Ss 0:02 //bin/dbus-daemon --fork --print-pid 5 --print-address 7 --session

(tested on Ubuntu)

Have fun.


0

A unix socket may have a different path on the server side and on the client side.
Here are some tools to check the connection.
// They seem to need kernel >= 3.3.

ref: https://stackoverflow.com/questions/15100824

Run a test:

    $ nc -l -U -u /tmp/test.sock &
    $ nc -Uu /tmp/test.sock

A. lsof +E

    $ lsof +E -aUc 'nc'
    COMMAND    PID USER FD TYPE             DEVICE SIZE/OFF    NODE NAME
    nc      681642 chen 3u unix 0x0000000000000000      0t0 5801126 /tmp/test.sock type=DGRAM ->INO=5800126 681673,nc,4u
    nc      681673 chen 4u unix 0x0000000000000000      0t0 5800126 /tmp/nc.XXXXSFNvr9 type=DGRAM ->INO=5801126 681642,nc,3u
  • +E // show endpoint info
  • -a // AND the filters together (default is OR)
  • -U // show unix sock
  • -c cmd // filter by cmd

// the output, explained

    PID    NODE    path               peer_node                   peer_pid
    681642 5801126 /tmp/test.sock     type=DGRAM ->INO=5800126    681673
    681673 5800126 /tmp/nc.XXXXSFNvr9 type=DGRAM ->INO=5801126    681642

The socket paths are not the same,
but node and peer_node correspond to each other.
// So grepping for the node number finds both sides.
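The node/peer_node matching can also be done programmatically. A rough Python sketch that pulls the endpoint fields out of lsof +E text output, assuming the "->INO=<inode> <pid>,<command>,<fd>" form seen in the listing above (the format may differ between lsof versions; parse_lsof_endpoints is a hypothetical name):

```python
import re

# Matches the endpoint suffix, e.g. "->INO=5800126 681673,nc,4u".
ENDPOINT = re.compile(r"->INO=(\d+)\s+(\d+),([^,\s]+),(\S+)")


def parse_lsof_endpoints(text):
    """Extract pid, node and peer details from `lsof +E -U` output lines."""
    rows = []
    for line in text.splitlines():
        match = ENDPOINT.search(line)
        fields = line.split()
        if not match or len(fields) < 8:
            continue
        if not (fields[1].isdigit() and fields[7].isdigit()):
            continue  # header or unexpected layout
        rows.append({
            "pid": int(fields[1]),
            "node": int(fields[7]),
            "peer_node": int(match.group(1)),
            "peer_pid": int(match.group(2)),
            "peer_command": match.group(3),
            "peer_fd": match.group(4),
        })
    return rows


# The two rows from the nc test above.
SAMPLE = """\
nc 681642 chen 3u unix 0x0000000000000000 0t0 5801126 /tmp/test.sock type=DGRAM ->INO=5800126 681673,nc,4u
nc 681673 chen 4u unix 0x0000000000000000 0t0 5800126 /tmp/nc.XXXXSFNvr9 type=DGRAM ->INO=5801126 681642,nc,3u
"""

for row in parse_lsof_endpoints(SAMPLE):
    print(row["pid"], row["node"], "->", row["peer_pid"], row["peer_node"])
```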

B. ss -x

    $ ss -xp ' src = *test.sock || src = *nc.* '
    Netid State Recv-Q Send-Q Local Address:Port         Peer Address:Port Process
    u_dgr ESTAB 0      0      /tmp/test.sock 5801126     * 5800126         users:(("nc",pid=681642,fd=3))
    u_dgr ESTAB 0      0      /tmp/nc.XXXXSFNvr9 5800126 * 5801126         users:(("nc",pid=681673,fd=4))
  • -x // show unix sock
  • -p // show process info

Similar to the lsof +E output, but it shows the node as the port.
