Explanation
When you put a file in /etc/netns/.../resolv.conf, ip netns exec bind mounts it over /etc/resolv.conf when it sets up the namespace.
This usually works, but when /etc/resolv.conf is replaced atomically, the inode underneath the bind mount changes. I haven't confirmed exactly why this happens, but I'd venture to guess that bind mounting over a file attaches to the inode rather than to the filename, so the mount does not simply overlay whatever file happens to exist at that path.
When something like NetworkManager or resolvconf manages your /etc/resolv.conf, the bind mount mysteriously disappears whenever those tools replace the file to perform an atomic update.
This is most likely why your resolv.conf is being overwritten.
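The inode change is easy to observe without touching the real /etc/resolv.conf. The following is a minimal sketch, assuming GNU coreutils stat; the scratch directory and the 192.0.2.x addresses are placeholders, and the write-then-rename pattern is only an approximation of what tools like NetworkManager do:

```shell
#!/bin/sh
# Sketch: show that a write-then-rename ("atomic") update swaps the inode
# behind a path. Works on a scratch directory, not on the real /etc.
dir=$(mktemp -d)
target="$dir/resolv.conf"

printf 'nameserver 192.0.2.1\n' > "$target"
inode_before=$(stat -c %i "$target")   # GNU stat: print the inode number

# Write the new contents to a temporary file, then rename it over the target.
printf 'nameserver 192.0.2.2\n' > "$target.tmp"
mv "$target.tmp" "$target"
inode_after=$(stat -c %i "$target")

echo "inode before: $inode_before"
echo "inode after:  $inode_after"
rm -rf "$dir"
```

The two numbers differ, which is consistent with the guess above: anything attached to the first inode is no longer attached to the file that now lives at that path.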
Workarounds
There are a couple of approaches to resolving this:
You can prevent anything from updating your /etc/resolv.conf and manage it yourself. (One of the other answers here points out how to do this with NetworkManager.)
You can (and probably should) force outgoing UDP and TCP connections on port 53 to go to your DNS server of choice instead. This effectively overrides /etc/resolv.conf. It's pretty easy to do this with iptables, with something like:
ip netns exec "${NAMESPACE}" iptables -t nat -A OUTPUT -p udp --dport 53 -j DNAT --to "${NAMESERVER}"
ip netns exec "${NAMESPACE}" iptables -t nat -A OUTPUT -p tcp --dport 53 -j DNAT --to "${NAMESERVER}"
(Here ${NAMESPACE} is the namespace and ${NAMESERVER} is the DNS server you wish to force. Note that this breaks failover, since every query ends up at that one server.)
(Please feel free to update this answer with information about how to do this with other firewall tools like nftables. I don't personally know, so I'd rather not.)
Although this method is not persistent across reboots, the rules (at least with iptables) are part of the state of the network namespace itself, so they persist across multiple ip netns exec commands.
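Because the rules live in the namespace, appending them again on every run would create duplicates. Here is a minimal sketch of a wrapper that only appends a rule when iptables -C (check) reports it missing. The names force_dns and DRY_RUN are hypothetical, and --to-destination is the long form of the DNAT option used above:

```shell
#!/bin/sh
# Sketch: add the port-53 DNAT rules only if they are not already present.
# force_dns and DRY_RUN are hypothetical names used for this example.
force_dns() {
    ns=$1        # network namespace name
    server=$2    # DNS server to force
    for proto in udp tcp; do
        # Rule specification shared by the check (-C) and append (-A) calls.
        set -- OUTPUT -p "$proto" --dport 53 -j DNAT --to-destination "$server"
        if [ -n "${DRY_RUN:-}" ]; then
            # Only print the command that would be run.
            echo "ip netns exec $ns iptables -t nat -A $*"
        elif ! ip netns exec "$ns" iptables -t nat -C "$@" 2>/dev/null; then
            ip netns exec "$ns" iptables -t nat -A "$@"
        fi
    done
}
```

Run as root, force_dns "${NAMESPACE}" "${NAMESERVER}" applies both rules; with DRY_RUN=1 set, it only prints the two commands.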
Troubleshooting
You should verify that this is actually working, as there are several other ways in which DNS can leak inside of ip netns exec. To eliminate DNS leaks completely, you need to ensure that DNS resolution always happens inside the netns.
systemd-resolved
If you are using systemd-resolved, it may be necessary to ensure it is blocked or disabled inside of the namespace. glibc's choice to use systemd-resolved can be influenced by two files:
1. /etc/resolv.conf, where systemd-resolved may set itself as the resolver using something along the lines of nameserver 127.0.0.53 (which the bind mount from ip netns exec overrides without needing to do anything), or
2. /etc/nsswitch.conf, a file read by glibc to determine which NSS modules to use when resolving names. In general, if the hosts: line contains resolve, systemd-resolved is in use. If it only contains other entries, it most likely is not. Traditional DNS resolution, using the nameserver in /etc/resolv.conf, occurs via the dns module.
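The nsswitch.conf check can be scripted. This is a rough sketch, assuming the usual "hosts: module module ..." layout with whitespace after the colon; hosts_uses_resolve is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: report whether the "hosts:" line of an nsswitch.conf-style file
# lists the "resolve" module. hosts_uses_resolve is a hypothetical helper name.
hosts_uses_resolve() {
    [ -r "$1" ] || return 2    # file missing or unreadable
    awk '
        $1 == "hosts:" {
            for (i = 2; i <= NF; i++)
                if ($i == "resolve") found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

if [ -r /etc/nsswitch.conf ]; then
    if hosts_uses_resolve /etc/nsswitch.conf; then
        echo "hosts: line lists resolve (systemd-resolved is likely in use)"
    else
        echo "hosts: line does not list resolve"
    fi
fi
```

Running the same check through ip netns exec shows what processes inside the namespace will see.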
One possible workaround is to add a file at /etc/netns/${NAMESPACE}/nsswitch.conf with something like:
...
hosts: dns
...
(where the ... segments are your original /etc/nsswitch.conf settings.)
Unlike /etc/resolv.conf, this file is not regularly written to, so this bind mount should not go stale.
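Rather than editing the copy by hand, the per-namespace file can be derived from the original. A minimal sketch, in which make_netns_nsswitch is a hypothetical helper name and only the hosts: line is rewritten:

```shell
#!/bin/sh
# Sketch: copy an nsswitch.conf, replacing only its "hosts:" line.
# make_netns_nsswitch is a hypothetical helper name.
make_netns_nsswitch() {
    src=$1     # original file, e.g. /etc/nsswitch.conf
    dest=$2    # per-namespace copy, e.g. /etc/netns/${NAMESPACE}/nsswitch.conf
    # Every other line is passed through unchanged.
    sed 's/^hosts:.*/hosts: dns/' "$src" > "$dest"
}
```

As root, this would be called as make_netns_nsswitch /etc/nsswitch.conf "/etc/netns/${NAMESPACE}/nsswitch.conf" once the /etc/netns/${NAMESPACE} directory exists. Note that a bare hosts: dns line also drops the files module, so /etc/hosts entries are no longer consulted inside the namespace; adjust the replacement if you rely on them.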
nscd
If you are not using systemd-resolved, there is one other way in which glibc may appear to ignore your /etc/resolv.conf and /etc/nsswitch.conf setup: when nscd is running. glibc will try to reach nscd at /var/run/nscd/socket before it starts loading NSS modules. On NixOS, this is used so that the NSS module configuration does not have to leak into the runtime environment of every process; nscd loads all of the NSS modules, and other processes resolve via nscd. On some other systems, nscd may be used to cache lookups.
There are several workarounds:
You can unshare the mount namespace and bind mount something, e.g. /var/empty, over /var/run/nscd. This disables nscd for just that process. Note that this has to be done separately from the netns configuration; it will not persist merely by calling ip netns exec.
You can use firejail to sandbox the command and blacklist /var/run/nscd/socket. This might look something like this:
firejail --noprofile --blacklist=/var/run/nscd/socket --netns=${NAMESPACE} --dns=${NAMESERVER} [command]
In many cases, you can just disable nscd globally, as it is usually not needed. This is not recommended on NixOS, but is reasonable on most other Linux-based operating systems.
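The first workaround (a private mount namespace with something bind mounted over /var/run/nscd) could be wrapped up roughly as follows. This is a sketch, not a tested recipe: run_without_nscd and DRY_RUN are hypothetical names, it assumes util-linux's unshare and mount, and it assumes both /var/empty and /var/run/nscd exist as directories:

```shell
#!/bin/sh
# Sketch: run a command inside a netns with /var/run/nscd hidden.
# run_without_nscd and DRY_RUN are hypothetical names for this example.
run_without_nscd() {
    ns=$1
    shift
    # With DRY_RUN set, the leading expansion becomes "echo" and the command
    # is only printed; otherwise it expands to nothing and the command runs.
    ${DRY_RUN:+echo} ip netns exec "$ns" unshare --mount sh -c \
        'mount --bind /var/empty /var/run/nscd && exec "$@"' sh "$@"
}
```

As root, run_without_nscd "${NAMESPACE}" followed by a command runs that command with the nscd socket hidden. The bind mount lives only in that invocation's mount namespace, so it has to be repeated for every command.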