When trying to set up an OpenBSD router I've run into an apparent routing problem.
I have a 1U machine with 6 gigabit NICs (em0-em5). My ISP provided me with the following:
- xx.xx.97.246/28: static WAN IP
- xx.xx.97.241: default gateway address
- xx.xx.98.192/29: static block that is externally routed to me via xx.xx.97.246

This is my setup:
    +-------------------+
    | em0               +----> ISP's gateway (xx.xx.97.241)
    | xx.xx.97.246      |
    +-------------------+
    | em2               +----> DMZ network (xx.xx.98.194-198)
    | xx.xx.98.193/29   |
    +-------------------+
    | em4               +----> LAN2 (172.16.1.0/24)
    | 172.16.1.1        |      [NATs to xx.xx.97.246 address]
    +-------------------+
    | em5               +----> LAN (192.168.1.0/24)
    | 192.168.1.1       |      [NATs to xx.xx.97.246 address]
    +-------------------+

Forwarding has been enabled in sysctl.conf.
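As a sanity check on the address math, here is a minimal Python sketch using the standard ipaddress module. The 10.0 prefix is only a stand-in for the redacted xx.xx octets (an assumption for illustration), so the printed addresses are not my real ones.

```python
import ipaddress

# Assumption: "10.0" is only a stand-in for the redacted "xx.xx" octets.
PREFIX = "10.0"

wan = ipaddress.ip_interface(f"{PREFIX}.97.246/28")   # em0 address
dmz = ipaddress.ip_interface(f"{PREFIX}.98.193/29")   # em2 address
gateway = ipaddress.ip_address(f"{PREFIX}.97.241")    # ISP's gateway

# Usable host addresses in the DMZ block (network and broadcast excluded).
dmz_hosts = list(dmz.network.hosts())

print("WAN network:       ", wan.network)
print("DMZ network:       ", dmz.network)
print("DMZ broadcast:     ", dmz.network.broadcast_address)
print("DMZ usable hosts:  ", dmz_hosts[0], "-", dmz_hosts[-1])
print("gateway on WAN net:", gateway in wan.network)
print("gateway on DMZ net:", gateway in dmz.network)
```

If I have the masks right, the gateway sits on the em0 /28 and not in the DMZ /29, so DMZ hosts can only reach it by being routed through this box.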
With this setup and some basic PF rules I have been able to get basic Internet access from the LAN networks (em4/5) and I can access mail/DNS servers on the DMZ network from the LAN networks. So far, so good.
The problem is in accessing the DMZ network from the Internet and any host on the DMZ network accessing the Internet (to perform upstream DNS look ups, etc). There does not seem to be any connectivity to the Internet in either direction when a DMZ net address is the endpoint.
I can ping the xx.xx.97.246 WAN address from the DMZ net but I cannot ping the ISP's default gateway address (xx.xx.97.241).
I get an odd trace when trying to traceroute from the DMZ net to the WAN IP: it skips the em2 hop and only reports the WAN IP, as if it were the next hop. I can trace to em2 and get its address as the next hop if I specify the em2 address as the destination address. Regardless, I can neither ping nor trace to the gateway address; I get a "host unreachable" message.
This is what I have tried so far:
1. Adding a static route from xx.xx.98.193 to the WAN IP, which got "route exists". OK, makes sense. Out of desperation I tried adding a route from xx.xx.98.193 to the gateway address and got "destination unreachable".
2. Adding xx.xx.98.193 as an IP alias to em0 and changing the address of em2 to xx.xx.98.194. This did not seem to have any effect on anything besides breaking all routing through the gateway.
3. Unplugged the DMZ switch from em2 and plugged in a PC configured to static address xx.xx.98.194, then disabled pf with pfctl -d, which essentially passes everything and takes pf out of the picture. No change.
4. Double-checked the arp table with arp -a and can see that em2 (its MAC) is bound to xx.xx.98.193/29. Also checked the routing table with netstat -ranf inet and it shows the xx.xx.97.241 gateway as the default.
At this point I think this is purely a routing problem especially after taking steps 3 & 4. By the way, steps 1 & 2 were reversed and the system rebooted to revert back to the previous config before trying 3 & 4.
I've considered bridging em0 and em2 but it just doesn't seem necessary since all the interfaces are in the same machine and once an em2 host's packet makes it into the router the routing table should have the path out to the default gateway.
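To spell out what I expect to happen, here is a rough Python sketch of a longest-prefix-match lookup over the connected networks and default route from my setup. It is a simplification for illustration only, not how the OpenBSD kernel actually implements its routing table (it ignores host routes, flags, and priorities). The 10.0 prefix stands in for the redacted xx.xx octets, and 8.8.8.8 is just an arbitrary Internet address.

```python
import ipaddress

# Assumption: "10.0" is only a stand-in for the redacted "xx.xx" octets.
PREFIX = "10.0"

# (destination network, next hop or None if directly connected, interface)
ROUTES = [
    ("0.0.0.0/0", f"{PREFIX}.97.241", "em0"),   # default via ISP's gateway
    (f"{PREFIX}.97.240/28", None, "em0"),       # WAN
    (f"{PREFIX}.98.192/29", None, "em2"),       # DMZ
    ("172.16.1.0/24", None, "em4"),             # LAN2
    ("192.168.1.0/24", None, "em5"),            # LAN
]

def lookup(dst):
    """Return (network, next_hop, iface) for the most specific matching route."""
    addr = ipaddress.ip_address(dst)
    matches = [
        (ipaddress.ip_network(net), hop, iface)
        for net, hop, iface in ROUTES
        if addr in ipaddress.ip_network(net)
    ]
    # The default route always matches, so matches is never empty here.
    return max(matches, key=lambda m: m[0].prefixlen)

for dst in (f"{PREFIX}.97.241", f"{PREFIX}.98.195", "8.8.8.8"):
    net, hop, iface = lookup(dst)
    print(dst, "->", iface, "via", hop or "direct")
```

By this logic a packet from a DMZ host to the gateway should leave on em0 directly, and anything else on the Internet should go out em0 via the gateway, with no bridge needed.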
I also noticed an odd arp message repeating in /var/log/messages "attempt to overwrite permanent entry xx.xx.98.193 by ". The mac via arp was from the ISP's gateway xx.xx.97.241. Not sure if this is normal "chatty logging" or if it is a symptom of my problem.
Thanks in advance for your time reading and responding,
Rob
UPDATE:
Here is my routing table:
    Routing tables

    Internet:
    Destination        Gateway            Flags  Refs  Use   Mtu    Prio  Iface
    default            xx.xx.97.241       UGS    4     54    -      8     em0
    127/8              127.0.0.1          UGRS   0     0     32768  8     lo0
    127.0.0.1          127.0.0.1          UHl    1     0     32768  1     lo0
    172.16.1/24        172.16.1.1         C      0     0     -      8     em4
    172.16.1.1         xx:xx:xx:xx:1b:60  HLl    0     0     -      1     lo0
    172.16.1.255       172.16.1.1         Hb     0     0     -      1     em4
    192.168.1/24       192.168.1.1        C      0     0     -      8     em5
    192.168.1.1        xx:xx:xx:xx:1b:61  HLl    0     0     -      1     lo0
    192.168.1.255      192.168.1.1        Hb     0     0     -      1     em5
    xx.xx.97.240/28    xx.xx.97.246       UC     1     0     -      8     em0
    xx.xx.97.241       xx:xx:xx:xx:fc:d4  UHLc   1     4     -      8     em0
    xx.xx.97.246       xx:xx:xx:xx:1b:5c  HLl    0     0     -      1     lo0
    xx.xx.97.255       xx.xx.97.246       UHb    0     0     -      1     em0
    xx.xx.98.192/29    xx.xx.98.193       UC     2     0     -      8     em2
    xx.xx.98.193       xx:xx:xx:xx:1b:5e  HLl    0     0     -      1     lo0
    xx.xx.98.195       xx:xx:xx:xx:d8:33  UHLc   0     4     -      8     em2
    xx.xx.98.197       link#3             UHLc   0     4     -      8     em2
    xx.xx.98.199       xx.xx.98.193       UHb    0     0     -      1     em2
    224/4              127.0.0.1          URS    0     0     32768  8     lo0

I tried a lot of configs last night (primarily because I'm running out of ideas). As per the previous attempts, PF was disabled with pfctl -d to remove it as a variable. Probably the most sane attempt at something was this:
- Unplugged em4 and em5 (the dynamic nets) and removed their hostname.if files from /etc to leave only em0 and em2.
- Configured em0 with the xx.xx.98.193 address and em2 with the xx.xx.98.194 address, then connected a PC to em2 with xx.xx.98.195.
With that configuration from the PC I could ping up to the 194 address (em2 IF) but not the 193 address (em0 IF). From the OpenBSD system I could ping the 194 and 195 addresses.
It seems like, whatever the problem is, the breakdown in routing is occurring when em2 tries to reach em0. Also, for at least one ping attempt the PC made it up to em0 with 50% loss, but it consistently failed thereafter.
There must be something silly that I am missing here. My next crazy thing to try will be to enable/add kernel routing logs and compare the output for successful routing between other interfaces with the output for em0/em2. It's going to be a crash course in OpenBSD's networking stack/kernel for that. :/