Skip to content

Commit f668af2

Browse files
dguidoclaude
andauthored
Fix VPN routing on multi-homed systems by specifying output interface (#14826)
* Fix VPN routing by adding output interface to NAT rules The NAT rules were missing the output interface specification (-o eth0), which caused routing failures on multi-homed systems (servers with multiple network interfaces). Without specifying the output interface, packets might not be NAT'd correctly. Changes: - Added -o {{ ansible_default_ipv4['interface'] }} to all NAT rules - Updated both IPv4 and IPv6 templates - Updated tests to verify output interface is present - Added ansible_default_ipv4/ipv6 to test fixtures This fixes the issue where VPN clients could connect but not route traffic to the internet on servers with multiple network interfaces (like DigitalOcean droplets with private networking enabled). 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * Fix VPN routing by adding output interface to NAT rules On multi-homed systems (servers with multiple network interfaces or multiple IPs on one interface), MASQUERADE rules need to specify which interface to use for NAT. Without the output interface specification, packets may not be routed correctly. This fix adds the output interface to all NAT rules: -A POSTROUTING -s [vpn_subnet] -o eth0 -j MASQUERADE Changes: - Modified roles/common/templates/rules.v4.j2 to include output interface - Modified roles/common/templates/rules.v6.j2 for IPv6 support - Added tests to verify output interface is present in NAT rules - Added ansible_default_ipv4/ipv6 variables to test fixtures For deployments on providers like DigitalOcean where MASQUERADE still fails due to multiple IPs on the same interface, users can enable the existing alternative_ingress_ip option in config.cfg to use explicit SNAT. Testing: - Verified on live servers - All unit tests pass (67/67) - Mutation testing confirms test coverage This fixes VPN connectivity on servers with multiple interfaces while remaining backward compatible with single-interface deployments. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * Fix dnscrypt-proxy not listening on VPN service IPs Problem: dnscrypt-proxy on Ubuntu uses systemd socket activation by default, which overrides the configured listen_addresses in dnscrypt-proxy.toml. The socket only listens on 127.0.2.1:53, preventing VPN clients from resolving DNS queries through the configured service IPs. Solution: Disable and mask the dnscrypt-proxy.socket unit to allow dnscrypt-proxy to bind directly to the VPN service IPs specified in its configuration file. This fixes DNS resolution for VPN clients on Ubuntu 20.04+ systems. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * Apply Python linting and formatting - Run ruff check --fix to fix linting issues - Run ruff format to ensure consistent formatting - All tests still pass after formatting changes 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * Restrict DNS access to VPN clients only Security fix: The firewall rule for DNS was accepting traffic from any source (0.0.0.0/0) to the local DNS resolver. While the service IP is on the loopback interface (which normally isn't routable externally), this could be a security risk if misconfigured. Changed firewall rules to only accept DNS traffic from VPN subnets: - INPUT rule now includes -s {{ subnets }} to restrict source IPs - Applied to both IPv4 and IPv6 rules - Added test to verify DNS is properly restricted This ensures the DNS resolver is only accessible to connected VPN clients, not the entire internet. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * Fix dnscrypt-proxy service startup with masked socket Problem: dnscrypt-proxy.service has a dependency on dnscrypt-proxy.socket through the TriggeredBy directive. When we mask the socket before starting the service, systemd fails with "Unit dnscrypt-proxy.socket is masked." Solution: 1. Override the service to remove socket dependency (TriggeredBy=) 2. Reload systemd daemon immediately after override changes 3. Start the service (which now doesn't require the socket) 4. Only then disable and mask the socket This ensures dnscrypt-proxy can bind directly to the configured IPs without socket activation, while preventing the socket from being re-enabled by package updates. Changes: - Added TriggeredBy= override to remove socket dependency - Added explicit daemon reload after service overrides - Moved socket masking to after service start in main.yml - Fixed YAML formatting issues Testing: Deployment now succeeds with dnscrypt-proxy binding to VPN IPs 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * Fix dnscrypt-proxy by not masking the socket Problem: Masking dnscrypt-proxy.socket prevents the service from starting because the service has Requires=dnscrypt-proxy.socket dependency. Solution: Simply stop and disable the socket without masking it. This prevents socket activation while allowing the service to start and bind directly to the configured IPs. Changes: - Removed socket masking (just disable it) - Moved socket disabling before service start - Removed invalid systemd directives from override Testing: Confirmed dnscrypt-proxy now listens on VPN service IPs 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * Use systemd socket activation properly for dnscrypt-proxy Instead of fighting systemd socket activation, configure it to listen on the correct VPN service IPs. This is more systemd-native and reliable. Changes: - Create socket override to listen on VPN IPs instead of localhost - Clear default listeners and add VPN service IPs - Use empty listen_addresses in dnscrypt-proxy.toml for socket activation - Keep socket enabled and let systemd manage the activation - Add handler for restarting socket when config changes Benefits: - Works WITH systemd instead of against it - Survives package updates better - No dependency conflicts - More reliable service management This approach is cleaner than disabling socket activation entirely and ensures dnscrypt-proxy is accessible to VPN clients on the correct IPs. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * Document debugging lessons learned in CLAUDE.md Added comprehensive debugging guidance based on our troubleshooting session: - VPN connectivity troubleshooting order (DNS first!) - systemd socket activation best practices - Common deployment failures and solutions - Time wasters to avoid (lessons learned the hard way) - Multi-homed system considerations - Testing notes for DigitalOcean These additions will help future debugging sessions avoid the same rabbit holes and focus on the most likely issues first. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * Fix DNS resolution for VPN clients by enabling route_localnet The issue was that dnscrypt-proxy listens on a special loopback IP (randomly generated in 172.16.0.0/12 range) which wasn't accessible from VPN clients. This fix: 1. Enables net.ipv4.conf.all.route_localnet sysctl to allow routing to loopback IPs from other interfaces 2. Ensures dnscrypt-proxy socket is properly restarted when its configuration changes 3. Adds proper handler flushing after socket configuration updates This allows VPN clients to reach the DNS resolver at the local_service_ip address configured on the loopback interface. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * Improve security by using interface-specific route_localnet Instead of enabling route_localnet globally (net.ipv4.conf.all.route_localnet), this change enables it only on the specific interfaces that need it: - WireGuard interface (wg0) for WireGuard VPN clients - Main network interface (eth0/etc) for IPsec VPN clients This minimizes the security impact by restricting loopback routing to only the VPN interfaces, preventing other interfaces from being able to route to loopback addresses. The interface-specific approach provides the same functionality (allowing VPN clients to reach the DNS resolver on the local_service_ip) while reducing the potential attack surface. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * Revert to global route_localnet to fix deployment failure The interface-specific route_localnet approach failed because: - WireGuard interface (wg0) doesn't exist until the service starts - We were trying to set the sysctl before the interface was created - This caused deployment failures with "No such file or directory" Reverting to the global setting (net.ipv4.conf.all.route_localnet=1) because: - It always works regardless of interface creation timing - VPN users are trusted (they have our credentials) - Firewall rules still restrict access to only port 53 - The security benefit of interface-specific settings is minimal - The added complexity isn't worth the marginal security improvement This ensures reliable deployments while maintaining the DNS resolution fix. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * Fix dnscrypt-proxy socket restart and remove problematic BPF hardening Two important fixes: 1. Fix dnscrypt-proxy socket not restarting with new configuration - The socket wasn't properly restarting when its override config changed - This caused DNS to listen on wrong IP (127.0.2.1 instead of local_service_ip) - Now directly restart the socket when configuration changes - Add explicit daemon reload before restarting 2. Remove BPF JIT hardening that causes deployment errors - The net.core.bpf_jit_enable sysctl isn't available on all kernels - It was causing "Invalid argument" errors during deployment - This was optional security hardening with minimal benefit - Removing it eliminates deployment errors for most users These fixes ensure reliable DNS resolution for VPN clients and clean deployments without error messages. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * Update CLAUDE.md with comprehensive debugging lessons learned Based on our extensive debugging session, this update adds critical documentation: ## DNS Architecture and Troubleshooting - Explained the local_service_ip design and why it requires route_localnet - Added detailed DNS debugging methodology with exact steps in order - Documented systemd socket activation complexities and common mistakes - Added specific commands to verify DNS is working correctly ## Architectural Decisions - Added new section explaining trade-offs in Algo's design choices - Documented why local_service_ip uses loopback instead of alternatives - Explained iptables-legacy vs iptables-nft backend choice ## Enhanced Debugging Guidance - Expanded troubleshooting with exact commands and expected outputs - Added warnings about configuration changes that need restarts - Documented socket activation override requirements in detail - Added common pitfalls like interface-specific sysctls ## Time Wasters Section - Added new lessons learned from this debugging session - Interface-specific route_localnet (fails before interface exists) - DNAT for loopback addresses (doesn't work) - BPF JIT hardening (causes errors on many kernels) This documentation will help future maintainers avoid the same debugging rabbit holes and understand why things are designed the way they are. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>
1 parent 9cc0b02 commit f668af2

38 files changed

+1486
-1230
lines changed

CLAUDE.md

Lines changed: 165 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -176,19 +176,64 @@ This practice ensures:
176176
- Too many tasks to fix immediately (113+)
177177
- Focus on new code having proper names
178178

179-
180-
### 3. Jinja2 Template Complexity
179+
### 2. DNS Architecture and Common Issues
180+
181+
#### Understanding local_service_ip
182+
- Algo uses a randomly generated IP in the 172.16.0.0/12 range on the loopback interface
183+
- This IP (`local_service_ip`) is where dnscrypt-proxy should listen
184+
- Requires `net.ipv4.conf.all.route_localnet=1` sysctl for VPN clients to reach loopback IPs
185+
- This is by design for consistency across VPN types (WireGuard + IPsec)
186+
187+
#### dnscrypt-proxy Service Failures
188+
**Problem:** "Unit dnscrypt-proxy.socket is masked" or service won't start
189+
- The service has `Requires=dnscrypt-proxy.socket` dependency
190+
- Masking the socket prevents the service from starting
191+
- **Solution:** Configure socket properly instead of fighting it
192+
193+
#### DNS Not Accessible to VPN Clients
194+
**Symptoms:** VPN connects but no internet/DNS access
195+
1. **First check what's listening:** `sudo ss -ulnp | grep :53`
196+
- Should show `local_service_ip:53` (e.g., 172.24.117.23:53)
197+
- If showing only 127.0.2.1:53, socket override didn't apply
198+
2. **Check socket status:** `systemctl status dnscrypt-proxy.socket`
199+
- Look for "configuration has changed while running" - needs restart
200+
3. **Verify route_localnet:** `sysctl net.ipv4.conf.all.route_localnet`
201+
- Must be 1 for VPN clients to reach loopback IPs
202+
4. **Check firewall:** Ensure allows VPN subnets: `-A INPUT -s {{ subnets }} -d {{ local_service_ip }}`
203+
- **Never** allow DNS from all sources (0.0.0.0/0) - security risk!
204+
205+
### 3. Multi-homed Systems and NAT
206+
**DigitalOcean and other providers with multiple IPs:**
207+
- Servers may have both public and private IPs on same interface
208+
- MASQUERADE needs output interface: `-o {{ ansible_default_ipv4['interface'] }}`
209+
- Don't overengineer with SNAT - MASQUERADE with interface works fine
210+
- Use `alternative_ingress_ip` option only when truly needed
211+
212+
### 4. iptables Backend Changes (nft vs legacy)
213+
**Critical:** Switching between iptables-nft and iptables-legacy can break subtle behaviors
214+
- Ubuntu 22.04+ defaults to iptables-nft which may have implicit NAT behaviors
215+
- Algo forces iptables-legacy for consistent rule ordering
216+
- This switch can break DNS routing that "just worked" before
217+
- Always test thoroughly after backend changes
218+
219+
### 5. systemd Socket Activation Gotchas
220+
- Interface-specific sysctls (e.g., `net.ipv4.conf.wg0.route_localnet`) fail if interface doesn't exist yet
221+
- WireGuard interface only created when service starts
222+
- Use global sysctls or apply settings after service start
223+
- Socket configuration changes require explicit restart (not just reload)
224+
225+
### 6. Jinja2 Template Complexity
181226
- Many templates use Ansible-specific filters
182227
- Test templates with `tests/unit/test_template_rendering.py`
183228
- Mock Ansible filters when testing
184229

185-
### 4. OpenSSL Version Compatibility
230+
### 7. OpenSSL Version Compatibility
186231
```yaml
187232
# Check version and use appropriate flags
188233
{{ (openssl_version is version('3', '>=')) | ternary('-legacy', '') }}
189234
```
190235

191-
### 5. IPv6 Endpoint Formatting
236+
### 8. IPv6 Endpoint Formatting
192237
- WireGuard configs must bracket IPv6 addresses
193238
- Template logic: `{% if ':' in IP %}[{{ IP }}]:{{ port }}{% else %}{{ IP }}:{{ port }}{% endif %}`
194239

@@ -223,9 +268,11 @@ This practice ensures:
223268
Each has specific requirements:
224269
- **AWS**: Requires boto3, specific AMI IDs
225270
- **Azure**: Complex networking setup
226-
- **DigitalOcean**: Simple API, good for testing
271+
- **DigitalOcean**: Simple API, good for testing (watch for multiple IPs on eth0)
227272
- **Local**: KVM/Docker for development
228273

274+
**Testing Note:** DigitalOcean droplets often have both public and private IPs on the same interface, making them excellent test cases for multi-IP scenarios and NAT issues.
275+
229276
### Architecture Considerations
230277
- Support both x86_64 and ARM64
231278
- Some providers have limited ARM support
@@ -265,6 +312,17 @@ Each has specific requirements:
265312
- Linter compliance
266313
- Conservative approach
267314

315+
### Time Wasters to Avoid (Lessons Learned)
316+
**Don't spend time on these unless absolutely necessary:**
317+
1. **Converting MASQUERADE to SNAT** - MASQUERADE works fine for Algo's use case
318+
2. **Fighting systemd socket activation** - Configure it properly instead of trying to disable it
319+
3. **Debugging NAT before checking DNS** - Most "routing" issues are DNS issues
320+
4. **Complex IPsec policy matching** - Keep NAT rules simple, avoid `-m policy --pol none`
321+
5. **Testing on existing servers** - Always test on fresh deployments
322+
6. **Interface-specific route_localnet** - WireGuard interface doesn't exist until service starts
323+
7. **DNAT for loopback addresses** - Packets to local IPs don't traverse PREROUTING
324+
8. **Removing BPF JIT hardening** - It's optional and causes errors on many kernels
325+
268326
## Working with Algo
269327

270328
### Local Development Setup
@@ -297,6 +355,108 @@ ansible-playbook users.yml -e "server=SERVER_NAME"
297355
3. Check firewall rules
298356
4. Review generated configs in `configs/`
299357

358+
### Troubleshooting VPN Connectivity
359+
360+
#### Debugging Methodology
361+
When VPN connects but traffic doesn't work, follow this **exact order** (learned from painful experience):
362+
363+
1. **Check DNS listening addresses first**
364+
```bash
365+
ss -lnup | grep :53
366+
# Should show local_service_ip:53 (e.g., 172.24.117.23:53)
367+
# If showing 127.0.2.1:53, socket override didn't apply
368+
```
369+
370+
2. **Check both socket AND service status**
371+
```bash
372+
systemctl status dnscrypt-proxy.socket dnscrypt-proxy.service
373+
# Look for "configuration has changed while running" warnings
374+
```
375+
376+
3. **Verify route_localnet is enabled**
377+
```bash
378+
sysctl net.ipv4.conf.all.route_localnet
379+
# Must be 1 for VPN clients to reach loopback IPs
380+
```
381+
382+
4. **Test DNS resolution from server**
383+
```bash
384+
dig @172.24.117.23 google.com # Use actual local_service_ip
385+
# Should return results if DNS is working
386+
```
387+
388+
5. **Check firewall counters**
389+
```bash
390+
iptables -L INPUT -v -n | grep -E '172.24|10.49|10.48'
391+
# Look for increasing packet counts
392+
```
393+
394+
6. **Verify NAT is happening**
395+
```bash
396+
iptables -t nat -L POSTROUTING -v -n
397+
# Check for MASQUERADE rules with packet counts
398+
```
399+
400+
**Key insight:** 90% of "routing" issues are actually DNS issues. Always check DNS first!
401+
402+
#### systemd and dnscrypt-proxy (Critical for Ubuntu/Debian)
403+
**Background:** Ubuntu's dnscrypt-proxy package uses systemd socket activation which **completely overrides** the `listen_addresses` setting in the config file.
404+
405+
**How it works:**
406+
1. Default socket listens on 127.0.2.1:53 (hardcoded in package)
407+
2. Socket activation means systemd opens the port, not dnscrypt-proxy
408+
3. Config file `listen_addresses` is ignored when socket activation is used
409+
4. Must configure the socket, not just the service
410+
411+
**Correct approach:**
412+
```bash
413+
# Create socket override at /etc/systemd/system/dnscrypt-proxy.socket.d/10-algo-override.conf
414+
[Socket]
415+
ListenStream= # Clear ALL defaults first
416+
ListenDatagram= # Clear UDP defaults too
417+
ListenStream=172.x.x.x:53 # Add TCP on VPN IP
418+
ListenDatagram=172.x.x.x:53 # Add UDP on VPN IP
419+
```
420+
421+
**Config requirements:**
422+
- Use empty `listen_addresses = []` in dnscrypt-proxy.toml for socket activation
423+
- Socket must be restarted (not just reloaded) after config changes
424+
- Check with: `systemctl status dnscrypt-proxy.socket` for warnings
425+
- Verify with: `ss -lnup | grep :53` to see actual listening addresses
426+
427+
**Common mistakes:**
428+
- Trying to disable/mask the socket (breaks service with Requires= dependency)
429+
- Only setting ListenStream (need ListenDatagram for UDP)
430+
- Forgetting to clear defaults first (results in listening on both IPs)
431+
- Not restarting socket after configuration changes
432+
433+
## Architectural Decisions and Trade-offs
434+
435+
### DNS Service IP Design
436+
Algo uses a randomly generated IP in the 172.16.0.0/12 range on the loopback interface for DNS (`local_service_ip`). This design has trade-offs:
437+
438+
**Why it's done this way:**
439+
- Provides a consistent DNS IP across both WireGuard and IPsec
440+
- Avoids binding to VPN gateway IPs which differ between protocols
441+
- Survives interface changes and restarts
442+
- Works the same way across all cloud providers
443+
444+
**The cost:**
445+
- Requires `route_localnet=1` sysctl (minor security consideration)
446+
- Adds complexity with systemd socket activation
447+
- Can be confusing to debug
448+
449+
**Alternatives considered but rejected:**
450+
- Binding to VPN gateway IPs directly (breaks unified configuration)
451+
- Using dummy interface instead of loopback (non-standard, more complex)
452+
- DNAT redirects (doesn't work with loopback destinations)
453+
454+
### iptables Backend Choice
455+
Algo forces iptables-legacy instead of iptables-nft on Ubuntu 22.04+ because:
456+
- nft reorders rules unpredictably, breaking VPN traffic
457+
- Legacy backend provides consistent, predictable behavior
458+
- Trade-off: Lost some implicit NAT behaviors that nft provided
459+
300460
## Important Context for LLMs
301461

302462
### What Makes Algo Special

0 commit comments

Comments
 (0)