16

(Rewriting most of this question since a lot of my original tests are irrelevant in light of new information)

I'm having issues with Server 2012R2 DNS servers. The biggest side effect of these issues is Exchange email not going through. Exchange queries for AAAA records before trying A records. When it sees SERVFAIL for the AAAA record, it doesn't even try A records; it just gives up.

For some domains, when querying against my Active Directory DNS servers, I get SERVFAIL instead of NOERROR with no results.

I have tried this from several different Server 2012R2 domain controllers that are running DNS. One of them is an entirely separate domain, on a different network behind a different firewall and internet connection.

Two addresses that I know cause this problem are smtpgw1.gov.on.ca and mxmta.owm.bell.net

I've been using dig on a Linux machine to test this (192.168.5.5 is my domain controller):

    grant@linuxbox:~$ dig @192.168.5.5 smtpgw1.gov.on.ca -t AAAA

    ; <<>> DiG 9.9.5-3ubuntu0.5-Ubuntu <<>> @192.168.5.5 smtpgw1.gov.on.ca -t AAAA
    ; (1 server found)
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 56328
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

    ;; OPT PSEUDOSECTION:
    ; EDNS: version: 0, flags:; udp: 4000
    ;; QUESTION SECTION:
    ;smtpgw1.gov.on.ca.    IN    AAAA

    ;; Query time: 90 msec
    ;; SERVER: 192.168.5.5#53(192.168.5.5)
    ;; WHEN: Wed Oct 21 14:09:10 EDT 2015
    ;; MSG SIZE rcvd: 46

But queries against a public DNS server work as expected:

    grant@home-ssh:~$ dig @4.2.2.1 smtpgw1.gov.on.ca -t AAAA

    ; <<>> DiG 9.9.5-3ubuntu0.5-Ubuntu <<>> @4.2.2.1 smtpgw1.gov.on.ca -t AAAA
    ; (1 server found)
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 269
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

    ;; OPT PSEUDOSECTION:
    ; EDNS: version: 0, flags:; udp: 8192
    ;; QUESTION SECTION:
    ;smtpgw1.gov.on.ca.    IN    AAAA

    ;; Query time: 136 msec
    ;; SERVER: 4.2.2.1#53(4.2.2.1)
    ;; WHEN: Wed Oct 21 14:11:19 EDT 2015
    ;; MSG SIZE rcvd: 46

As I said, I've tried this on two different networks and domains. One is a brand new domain, which definitely has all default settings for DNS. The other has been migrated to Server 2012, so some old settings from 2003/2008 may have carried over. I get the same results on both of them.

Disabling EDNS with dnscmd /config /enableednsprobes 0 fixes it. I see many search results about EDNS being a problem in Server 2003, but not much that matches what I'm seeing in Server 2012. Neither firewall has a problem with EDNS. Disabling EDNS should just be a temporary workaround though - it prevents the use of DNSSEC, and might cause other issues.

I have also seen some posts about issues with Server 2008R2 and EDNS, but those same posts say things are fixed in Server 2012, so it should work properly.

I have also tried enabling the debug log for DNS. I can see the packets that I expected, but it doesn't give me much insight as to why it's returning SERVFAIL. Here are the relevant portions of the DNS server debug log:

First packet - query from client to my DNS server

    10/16/2015 9:42:29 AM 0974 PACKET 000000EFF1BF01A0 UDP Rcv 172.16.0.254 a61e Q [2001 D NOERROR] AAAA (7)smtpgw1(3)gov(2)on(2)ca(0)
    UDP question info at 000000EFF1BF01A0
      Socket = 508
      Remote addr 172.16.0.254, port 50764
      Time Query=4556080, Queued=0, Expire=0
      Buf length = 0x0fa0 (4000)
      Msg length = 0x002e (46)
      Message:
        XID       0xa61e
        Flags     0x0120
          QR        0 (QUESTION)
          OPCODE    0 (QUERY)
          AA        0
          TC        0
          RD        1
          RA        0
          Z         0
          CD        0
          AD        1
          RCODE     0 (NOERROR)
        QCOUNT    1
        ACOUNT    0
        NSCOUNT   0
        ARCOUNT   1
        QUESTION SECTION:
        Offset = 0x000c, RR count = 0
        Name      "(7)smtpgw1(3)gov(2)on(2)ca(0)"
          QTYPE   AAAA (28)
          QCLASS  1
        ANSWER SECTION:
          empty
        AUTHORITY SECTION:
          empty
        ADDITIONAL SECTION:
        Offset = 0x0023, RR count = 0
        Name      "(0)"
          TYPE   OPT (41)
          CLASS  4096
          TTL    0
          DLEN   0
          DATA
          Buffer Size = 4096
          Rcode Ext   = 0
          Rcode Full  = 0
          Version     = 0
          Flags       = 0

Second packet - query from my DNS server to their DNS server

    10/16/2015 9:42:29 AM 0974 PACKET 000000EFF0A22160 UDP Snd 204.41.8.237 3e6c Q [0000 NOERROR] AAAA (7)smtpgw1(3)gov(2)on(2)ca(0)
    UDP question info at 000000EFF0A22160
      Socket = 9812
      Remote addr 204.41.8.237, port 53
      Time Query=0, Queued=0, Expire=0
      Buf length = 0x0fa0 (4000)
      Msg length = 0x0023 (35)
      Message:
        XID       0x3e6c
        Flags     0x0000
          QR        0 (QUESTION)
          OPCODE    0 (QUERY)
          AA        0
          TC        0
          RD        0
          RA        0
          Z         0
          CD        0
          AD        0
          RCODE     0 (NOERROR)
        QCOUNT    1
        ACOUNT    0
        NSCOUNT   0
        ARCOUNT   0
        QUESTION SECTION:
        Offset = 0x000c, RR count = 0
        Name      "(7)smtpgw1(3)gov(2)on(2)ca(0)"
          QTYPE   AAAA (28)
          QCLASS  1
        ANSWER SECTION:
          empty
        AUTHORITY SECTION:
          empty
        ADDITIONAL SECTION:
          empty

Third packet - response from their DNS server (NOERROR)

    10/16/2015 9:42:29 AM 0974 PACKET 000000EFF2188100 UDP Rcv 204.41.8.237 3e6c R Q [0084 A NOERROR] AAAA (7)smtpgw1(3)gov(2)on(2)ca(0)
    UDP response info at 000000EFF2188100
      Socket = 9812
      Remote addr 204.41.8.237, port 53
      Time Query=4556080, Queued=0, Expire=0
      Buf length = 0x0fa0 (4000)
      Msg length = 0x0023 (35)
      Message:
        XID       0x3e6c
        Flags     0x8400
          QR        1 (RESPONSE)
          OPCODE    0 (QUERY)
          AA        1
          TC        0
          RD        0
          RA        0
          Z         0
          CD        0
          AD        0
          RCODE     0 (NOERROR)
        QCOUNT    1
        ACOUNT    0
        NSCOUNT   0
        ARCOUNT   0
        QUESTION SECTION:
        Offset = 0x000c, RR count = 0
        Name      "(7)smtpgw1(3)gov(2)on(2)ca(0)"
          QTYPE   AAAA (28)
          QCLASS  1
        ANSWER SECTION:
          empty
        AUTHORITY SECTION:
          empty
        ADDITIONAL SECTION:
          empty

Fourth packet - response from my DNS server to client (SERVFAIL)

    10/16/2015 9:42:29 AM 0974 PACKET 000000EFF1BF01A0 UDP Snd 172.16.0.254 a61e R Q [8281 DR SERVFAIL] AAAA (7)smtpgw1(3)gov(2)on(2)ca(0)
    UDP response info at 000000EFF1BF01A0
      Socket = 508
      Remote addr 172.16.0.254, port 50764
      Time Query=4556080, Queued=4556080, Expire=4556083
      Buf length = 0x0fa0 (4000)
      Msg length = 0x002e (46)
      Message:
        XID       0xa61e
        Flags     0x8182
          QR        1 (RESPONSE)
          OPCODE    0 (QUERY)
          AA        0
          TC        0
          RD        1
          RA        1
          Z         0
          CD        0
          AD        0
          RCODE     2 (SERVFAIL)
        QCOUNT    1
        ACOUNT    0
        NSCOUNT   0
        ARCOUNT   1
        QUESTION SECTION:
        Offset = 0x000c, RR count = 0
        Name      "(7)smtpgw1(3)gov(2)on(2)ca(0)"
          QTYPE   AAAA (28)
          QCLASS  1
        ANSWER SECTION:
          empty
        AUTHORITY SECTION:
          empty
        ADDITIONAL SECTION:
        Offset = 0x0023, RR count = 0
        Name      "(0)"
          TYPE   OPT (41)
          CLASS  4000
          TTL    0
          DLEN   0
          DATA
          Buffer Size = 4000
          Rcode Ext   = 0
          Rcode Full  = 2
          Version     = 0
          Flags       = 0
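As a reading aid for the Flags values in the four packets above, here is a minimal Python sketch that splits the 16-bit DNS header flags word into its named bits. The function name decode_flags and the small RCODES table are illustrative choices, not part of any Windows tooling; the bit positions follow the standard DNS header layout.

```python
# Names for the most common response codes; anything else is shown as a number.
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL",
          3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}


def decode_flags(flags: int) -> dict:
    """Split a 16-bit DNS header flags word into its individual fields."""
    rcode = flags & 0xF
    return {
        "QR": (flags >> 15) & 0x1,      # 0 = query, 1 = response
        "OPCODE": (flags >> 11) & 0xF,  # 0 = standard query
        "AA": (flags >> 10) & 0x1,      # authoritative answer
        "TC": (flags >> 9) & 0x1,       # truncated
        "RD": (flags >> 8) & 0x1,       # recursion desired
        "RA": (flags >> 7) & 0x1,       # recursion available
        "Z": (flags >> 6) & 0x1,        # reserved
        "AD": (flags >> 5) & 0x1,       # authentic data (DNSSEC)
        "CD": (flags >> 4) & 0x1,       # checking disabled (DNSSEC)
        "RCODE": RCODES.get(rcode, str(rcode)),
    }


if __name__ == "__main__":
    # Flags values copied from the four packets in the debug log.
    for label, value in [("client query", 0x0120),
                         ("upstream query", 0x0000),
                         ("upstream response", 0x8400),
                         ("reply to client", 0x8182)]:
        print(f"{label:18} 0x{value:04x} {decode_flags(value)}")
```

Decoded this way, the upstream response (0x8400) is an authoritative NOERROR, while the reply sent back to the client (0x8182) carries SERVFAIL, so the failure is generated by the Windows DNS server itself rather than passed along from upstream.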

Other things of note:

  • One of the networks has native IPv6 internet access, the other does not (but the IPv6 stack is enabled on the servers with default settings). This doesn't seem to be an IPv6 network issue.
  • It doesn't affect all domains. For example, dig @192.168.5.5 -t AAAA serverfault.com returns NOERROR with no results, and the same query for google.com returns Google's IPv6 addresses properly.
  • Tried installing hotfix from KB3014171, made no difference.
  • The update from KB3004539 is already installed.

Edit Nov 7, 2015

I've set up another non-domain-joined Server 2012R2 machine, installed the DNS server role, and tested with the command nslookup -type=aaaa smtpgw1.gov.on.ca localhost. It does NOT have the same issues.

Both VMs are on the same host, and same network, so that eliminates any network/firewall issues. It's now down to either patch level or being a domain member/domain controller that makes the difference.

Edit Nov 8, 2015

Applied all updates, which made no difference. I went through to double-check whether there were any configuration differences between my new test server and my domain controller's DNS settings, and there are: the domain controller had forwarders set up.

Now, I'm sure I tried with and without forwarders in my initial tests, but I only tried it using dig from a Linux machine. I do get slightly different results with and without forwarders set up (tried with Google, OpenDNS, 4.2.2.1, and my ISP's DNS servers) when I use nslookup on a Windows machine.

With a forwarder set, I get Server failed.

Without a forwarder (so it uses root DNS servers), I get No IPv6 address (AAAA) records available for smtpgw1.gov.on.ca.

But that's still not the same as what I get for other domains that don't have IPv6 records - nslookup on Windows just returns no results for other domains.

With or without forwarders, dig still shows SERVFAIL for that name when querying my Windows DNS server.

There IS a small difference between the problem domain and other ones that seems relevant, even when I don't involve my windows DNS server:

dig -t aaaa @8.8.8.8 smtpgw1.gov.on.ca has no answers, and does not have an authority section.

dig -t aaaa @8.8.8.8 serverfault.com returns no answers, but does have an authority section. So do most other domains I try, no matter what resolver I use.

So why is that authority section missing, and why does Windows DNS server treat it as a failure when other DNS servers don't?
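The distinction being drawn here (an empty answer with or without an authority section) can be read straight from the header counts of a response. Below is a rough Python sketch under that assumption; classify_response is a hypothetical helper name, and it looks only at the 12-byte header, so it cannot tell a proper negative answer carrying an SOA from a referral carrying NS records.

```python
import struct


def classify_response(message: bytes) -> str:
    """Classify a DNS response using only the counts in its 12-byte header."""
    if len(message) < 12:
        raise ValueError("a DNS message starts with a 12-byte header")
    _xid, flags, _qdcount, ancount, nscount, _arcount = struct.unpack(
        "!6H", message[:12])
    rcode = flags & 0xF
    if rcode == 3:
        return "NXDOMAIN"
    if rcode != 0:
        return f"error (rcode {rcode})"
    if ancount > 0:
        return "answer"
    if nscount > 0:
        # Could be a negative answer with an SOA, or a referral with NS
        # records; telling them apart requires parsing the records.
        return "no answer, authority section present"
    return "no answer, authority section empty"


if __name__ == "__main__":
    # Header rebuilt from the third packet in the debug log:
    # XID 0x3e6c, Flags 0x8400, QCOUNT 1, ACOUNT 0, NSCOUNT 0, ARCOUNT 0.
    header = struct.pack("!6H", 0x3E6C, 0x8400, 1, 0, 0, 0)
    print(classify_response(header))
```

Run against the header of the third packet in the debug log, this reports an empty authority section, which is the same shape of response that 8.8.8.8 relays for smtpgw1.gov.on.ca.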

25
  • Are you performing these tests from the Exchange server? If not, I would suggest doing that so that you can see it from Exchange's perspective. You might want to try running SMTPDiag from the Exchange server as well. I'd suggest running it while performing a network capture on the Exchange server so that you can view the details of the network/DNS activity. SMTPDiag is an old tool, but it's a command line tool that doesn't require any installation, so I'm thinking that it should work on all versions of Exchange. - microsoft.com/en-us/download/details.aspx?id=11393 Commented Oct 15, 2015 at 22:44
  • Some network devices don't recognize and will reject EDNS packets. Did your network team introduce new device/setting recently? To eliminate this possibility, try to resolve google.com's AAAA record, it should return an IPv6 address. Commented Oct 16, 2015 at 0:05
  • @strongline EDNS packets come through fine. AAAA record for google works, as do a couple other sites I know have IPv6 running. Only change made recently was getting rid of our last Server 2008R2 DC/DNS server and replacing with 2012R2. Commented Oct 16, 2015 at 0:10
  • Is IPv6 disabled in any way in your environment? Commented Oct 16, 2015 at 0:23
  • @JimB neither really enabled nor disabled...IPv6 stack is running on the servers, because it's on by default, with whatever default configuration it has. Gateway and internet connection have no IPv6 whatsoever. Commented Oct 16, 2015 at 0:31

2 Answers 2

3

I've looked into the network trace some more and done some reading. The request for the AAAA record, when the record doesn't exist, returns an SOA. It turns out the SOA is for a different domain than the one being requested, and I suspect that's why Windows is rejecting the response: a request for the AAAA record of mx.atomwide.com gets a response containing an SOA for lgfl.org.uk. I will see if we can make some progress with this information.

EDIT: Just for future reference, temporarily turning off "Secure cache against pollution" will allow the query to succeed. Not ideal, but it proves the issue is with a dodgy DNS record. RFC4074 is also a good reference - Intro and Section.
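To illustrate the kind of check that would reject such a response, here is a minimal Python sketch that tests whether the owner name of an SOA is the queried name or one of its parent domains. This is only an assumption about what a cache-pollution check might look like, not Microsoft's documented algorithm, and is_within is a hypothetical helper name.

```python
def _labels(name: str) -> list:
    """Lower-case a domain name and split it into labels, ignoring a trailing dot."""
    return [label for label in name.lower().rstrip(".").split(".") if label]


def is_within(qname: str, zone: str) -> bool:
    """Return True if qname equals zone or is a subdomain of it."""
    q, z = _labels(qname), _labels(zone)
    # Compare whole labels from the right so that "badatomwide.com"
    # is not mistaken for a name under "atomwide.com".
    return len(z) <= len(q) and q[len(q) - len(z):] == z


if __name__ == "__main__":
    # The mismatch described above: query name vs. SOA owner name.
    print(is_within("mx.atomwide.com", "lgfl.org.uk"))
    # What a consistent negative answer would look like.
    print(is_within("mx.atomwide.com", "atomwide.com"))
```

Under this assumed rule, an SOA for lgfl.org.uk says nothing trustworthy about mx.atomwide.com, which would be consistent with the query succeeding once "Secure cache against pollution" is turned off.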

2
  • I am going to try to test this today in my environment, but I think you may be onto something! Commented Mar 18, 2016 at 12:13
  • Also I have edited out your link - signatures and off topic links are not allowed here, and I don't want to see your otherwise excellent answer get deleted for it. Commented Mar 18, 2016 at 12:19
0

According to KB832223

Cause

This issue occurs because of the Extension Mechanisms for DNS (EDNS0) functionality that is supported in Windows Server DNS.

EDNS0 allows larger User Datagram Protocol (UDP) packet sizes. However, some firewall programs may not allow UDP packets that are larger than 512 bytes. Therefore, these DNS packets may be blocked by the firewall.

Microsoft has the following resolution:

Resolution

To resolve this issue, update the firewall program to recognize and allow UDP packets that are larger than 512 bytes. For more information about how to do this, contact the manufacturer of your firewall program.

Microsoft has the following suggestion to work around the issue:

Workaround

To work around this issue, turn off the EDNS0 feature on Windows-based DNS servers. To do this, take the following action:

At a command prompt, type the following command, and then press Enter:

dnscmd /config /enableednsprobes 0

Note Type a 0 (zero) and not the letter "O" after "enableednsprobes" in this command.
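For context on what EDNS0 adds to a packet, the following Python sketch builds a bare AAAA query with and without the OPT pseudo-record that advertises a larger UDP payload size. The helper names encode_name and build_query are illustrative; the layout follows the standard DNS wire format, and the flags default (recursion desired) is an arbitrary choice.

```python
import struct


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name, qtype=28, xid=0, flags=0x0100, edns_udp_size=None):
    """Build a DNS query; qtype 28 is AAAA. Pass edns_udp_size to add an OPT record."""
    arcount = 1 if edns_udp_size is not None else 0
    message = struct.pack("!6H", xid, flags, 1, 0, 0, arcount)
    message += encode_name(name) + struct.pack("!2H", qtype, 1)  # QCLASS 1 = IN
    if edns_udp_size is not None:
        # OPT pseudo-record: root name, TYPE 41, CLASS carries the UDP
        # payload size, TTL carries extended rcode/version/flags, RDLEN 0.
        message += b"\x00" + struct.pack("!HHIH", 41, edns_udp_size, 0, 0)
    return message


if __name__ == "__main__":
    plain = build_query("smtpgw1.gov.on.ca")
    with_edns = build_query("smtpgw1.gov.on.ca", edns_udp_size=4000)
    print(len(plain), len(with_edns))
```

The two lengths match the Msg length values in the question's debug log: 35 bytes for the query sent without an OPT record and 46 bytes for the messages that carry one. The query itself stays far below 512 bytes either way; it is the response size that the advertised buffer affects.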

2
  • I have seen this article - the firewalls I have tested with both pass large DNS packets without issue, as evidenced by it working perfectly on Linux. Disabling EDNS prevents the use of DNSSEC, so though it fixes the problem it is not a good solution. Commented Feb 29, 2016 at 22:09
  • Sorry, I didn't realize that Microsoft's guidance would apply to Linux also. Out of curiosity, do you have any Microsoft OS that is working through the firewall? Commented Feb 29, 2016 at 23:36
