Local DNS Cache and NXDOMAIN Responses

    • We're rolling out automated installs to subdomains at work and finding that we're getting cached responses from local nameservers. For example, we create bob.example.com, but the local DNS cache (192.168.1.1) returns NXDOMAIN for bob.example.com even though an nslookup against the specific authoritative nameserver yields the correct record.

      What kind of solutions do we have for this problem?
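
      For reference, the mismatch described above can be reproduced by querying the two servers side by side (192.168.1.1 is the local cache from the question; ns1.example.com is an illustrative stand-in for the authoritative server):

          # Ask the local cache (returns the stale NXDOMAIN)
          nslookup bob.example.com 192.168.1.1
          # Ask the authoritative server directly (returns the correct record)
          nslookup bob.example.com ns1.example.com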

    Answers (26)

    • rollingrazor.com - it's been up for over a year on that DNS provider. The oddest behavior I saw earlier today was on DNS server 68.238.64.12 (local ISP) performing multiple lookups: I got NXDOMAIN, then the correct CNAME resolution, and then NXDOMAIN again within 10 seconds. The problem fixes itself and then several hours later resolves incorrectly again. – Simon Dec 4 '09 at 2:35
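
      One way to catch that kind of flapping is to query the suspect resolver in a loop and watch the status field; a minimal sketch (the server IP is the one mentioned above, the record name is illustrative):

          # Query the suspect resolver once per second and log the response status
          while true; do
              dig @68.238.64.12 www.rollingrazor.com CNAME +noall +comments | grep 'status:'
              sleep 1
          done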

    • I suspect what might be happening is that Chrome might be pre-fetching URLs it finds on the page - we place bob.example.com in a hidden div when the "thank you" page loads. – bundini Jul 15 '11 at 15:12

    • Chromium indeed does that very thing. – JdeBP Jul 22 '11 at 16:39

      • Replying to my own answer, but you might find the conntrack tools (netfilter.org/projects/conntrack-tools/index.html) useful for viewing and managing the conntrack tables. – Andy Smith Feb 8 '10 at 15:45

          • I'd highly recommend using dig to do the querying, rather than nslookup. That aside, I've found enough complaints out there to make me think that GoDaddy has load-balancers on their DNS server such that you won't be able to find the errant DNS servers as they're hidden behind the load-balancers. – Evan Anderson Dec 4 '09 at 3:20
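
            For reference, the equivalent dig query against a specific server looks something like this (ns1.example.com is an illustrative stand-in for one of the domain's authoritative servers):

                # Query one specific nameserver and show the full response, including flags
                dig @ns1.example.com bob.example.com A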

          • Ah, fair enough... my mistake :-) – Andy Smith Feb 15 '10 at 18:30

              • So most likely the GoDaddy name servers haven't replicated the domain properly amongst themselves. What is the domain? – Sim Dec 4 '09 at 2:31

              • They will time out eventually. Before making changes you should reduce your negative TTL to a reasonable amount. The value is specified in your SOA record for the domain. Query your servers for the SOA record to determine how long the timeout might last.

                Default value is documented as 3 hours, and maximum value is 7 days.

                As you have found, it is not a good idea to query your local servers for new services before you know they are available on all your authoritative nameservers. Doing so may prime the cache with a negative answer. Query the authoritative servers first to verify.
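
                To check the current negative-caching TTL, query the SOA record directly; the last field of the answer is the minimum/negative TTL (domain name per the question's example):

                    dig example.com SOA +noall +answer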

              • Jeff Atwood and the Stack Overflow crew didn't like GoDaddy's DNS service much: blog.stackoverflow.com/2009/09/new-dns-provider Personally, I use the provider that Jeff went with, Dynamic Network Services: dynamicnetworkservices.com – Evan Anderson Dec 4 '09 at 2:36

                • Don't think dig is available for Windows. – fpmurphy1 Dec 4 '09 at 4:39

                • TCP is used when the query is over 512 bytes in size, which as you say is usually for zone transfers and the like, but you'll sometimes see legitimate clients doing this too.

                  The reason you're seeing a lot of connections from udp/53 is that a lot of nameservers are configured to respond from that port, as opposed to a random high port. If I were you, I'd allow udp/53 from the upstream DNS servers and leave it at that.
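
                  A minimal sketch of that rule, with 192.0.2.53 standing in for an upstream DNS server's address:

                      # Accept UDP traffic sourced from the upstream resolver's port 53
                      iptables -A INPUT -p udp -s 192.0.2.53 --sport 53 -j ACCEPT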

                • It's not "connection" tracking, but certainly there is a "conversation" that could be tracked. – Dan Pritts Dec 7 '12 at 4:29

                • Thanks Alnitak for your clear explanation. That's what I'll do. – JFA Feb 9 '10 at 11:34

                • DNS results are also cached by Windows and may impact reachability. Try ping bob.example.com from the command line. Browsers cache page results unless there is an appropriate no-cache directive in the headers. Usually a forced reload will look for changes and reload changed content. If not, restarting the browser should do so. The browser should have configuration items controlling when changes are queried for. – BillThor Jul 15 '11 at 17:02
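
                  On Windows, the local resolver cache can be inspected and cleared with standard ipconfig switches before re-testing:

                      ipconfig /displaydns
                      ipconfig /flushdns
                      ping bob.example.com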

                    • Connection tracking with UDP doesn't make any sense; there is no connection to track, as UDP is a connectionless protocol.

                      To be fully RFC compliant, you must listen on port 53 for both TCP and UDP traffic.

                      http://tools.ietf.org/html/rfc5966
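
                      You can verify that a server answers over TCP as RFC 5966 requires by forcing dig to use it (ns1.example.com is illustrative):

                          # Force the query over TCP port 53 instead of UDP
                          dig @ns1.example.com example.com SOA +tcp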

                    • We're using GoDaddy's DNS as part of a dedicated server plan. Whether it's FREE or not I guess is irrelevant from what you're saying. If they really are returning NXDOMAIN, then they are the weakest link and changing providers will fix the issue. We've only seen this in the last 48 hours though, and not before - and it's real users seeing it. We have constant traffic coming to the site, so the cache hit rate should be very high. I'm not quite clear what you mean about recursive DNS. There's obviously some kind of chain of caching - my PC, my router, Verizon DNS, etc. Should I raise my TTL? – Simon Dec 4 '09 at 2:27

                      • Actually, TCP is used when the response is truncated. With the use of EDNS0 (RFC 2671) that is not always at 512 bytes. – Alnitak Feb 9 '10 at 12:55
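
                        In practice you can advertise a larger EDNS0 buffer with dig and see that truncation no longer happens at 512 bytes:

                            # Advertise a 4096-byte EDNS0 buffer; responses up to that size stay on UDP
                            dig example.com DNSKEY +bufsize=4096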

                          • I think you may have a conceptual issue with how DNS works.

                            Only DNS servers performing recursive resolution cache lookups. The DNS servers that the affected users on "verizon DSL, comcast cable, verizon EVDO, site24x7 website" are using are the ones caching lookups.

                            The root DNS servers, .com servers, and the servers authoritative for your domain aren't caching lookups, because they're not providing recursive resolution service.

                            It's possible (likely, actually, from what I'm seeing in Google searches) that GoDaddy is sporadically returning NXDOMAINs for your domain, and those NXDOMAINs are being cached by recursive resolvers. (Per RFC 2308, they should be cached for, at most, either the TTL of the SOA record or the SOA minimum field, whichever is less.)

                            Apparently, GoDaddy's "free" DNS service isn't too highly regarded. I don't use it, personally, so I can't comment on it.

                              There is no central "list" of DNS servers providing recursive resolution for you to "test against". (I have one here in my house, and I could spin a few more up on VMs if I needed to...) You need a reliable provider to be authoritative for your domain, and you just have to hope that everybody else in the world honors TTLs and acts as "good DNS citizens".


                              Edit:

                              "Recursive resolution" is the process by which the a DNS server resolves a record for which it is not authoritative. The process starts with the root DNS servers, and proceeds recursively (that is, a process that loops back on itself) through all the authoritative DNS servers for the domains specified in the query until the last DNS server is reached and the desired resource record (or a negative response) is returned.

                              For a three-level query, like "www.example.com", the following occurs (to keep this clearer and simpler, I am leaving out the fact that, all along the way, the ISP DNS server checks its cache in lieu of issuing queries to remote DNS servers, and puts the results it receives into its cache):

                              1. Your PC issues a query to your specified DNS server (at your ISP, for example).

                              2. The ISP DNS server verifies that it doesn't have a response in cache, and then queries one of the root DNS servers.

                              3. The root DNS server, only being authoritative for the root, responds with a list of DNS servers authoritative for the gTLD specified in the query (.com, .net, .tv, .fu, etc.). The full query is always sent to each successive DNS server throughout this process: since it's not possible to know in advance which DNS server will be authoritative for any given query, and we want to minimize the number of round trips, we always send the full domain in each query.

                              4. The ISP DNS server queries one of the DNS servers returned as authoritative for the gTLD specified.

                              5. The gTLD DNS server, being authoritative only for the second-level domain (microsoft.com, example.com, etc.), responds with a list of DNS servers authoritative for the second-level domain.

                              6. The ISP DNS server queries one of the DNS servers returned as authoritative for the second-level domain.

                              7. The DNS server authoritative for the second-level domain, being authoritative for the third-level names (www.microsoft.com, ftp.example.com, etc.), returns the record requested.

                              8. The ISP DNS server returns the record to your PC.

                                    Typically ISPs offer recursive resolution services to their Customers. The DNS servers at hosting providers that are authoritative for Customer hosted domains generally don't provide recursive service (and will return the root servers if queried for domains they aren't authoritative for).
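
                                    You can watch this exact walk with dig's +trace option, which performs the iteration itself starting from the root servers (domain per the question's example):

                                        # Iterate from the root down, printing each referral along the way
                                        dig bob.example.com A +trace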

                                  • IMHO, you should specifically permit egress DNS queries on your OUTPUT chain (on both UDP and TCP, of course), and then drop the port- and protocol-specific flags from the RELATED ingress rule, e.g.:

                                    iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
                                    

                                    Or, in other words, use egress policy to control outbound traffic on a per-protocol basis, and stateful matching to control the inbound responses.

                                    The default RHEL/CentOS iptables rules work like this, though they default to permitting any OUTPUT packet.

                                    And yes, you will quite often see rejected packets because they arrived too late to match an existing state entry.
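
                                    Putting both halves together, a minimal sketch of the policy described above (192.0.2.53 stands in for your upstream resolver):

                                        # Stateful ingress: accept replies to traffic this host initiated
                                        iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
                                        # Egress policy: explicitly permit outbound DNS over both protocols
                                        iptables -A OUTPUT -p udp -d 192.0.2.53 --dport 53 -j ACCEPT
                                        iptables -A OUTPUT -p tcp -d 192.0.2.53 --dport 53 -j ACCEPT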

                                    • @fpmurphy dig is available for Windows; try a Google search. It is certainly included as part of the BIND for Windows software, but it requires a resolv.conf file to be created and a number of BIND DLLs to be available. – Sim Dec 4 '09 at 5:10

                                      • You really need to address the root of the problem. This is what I would do:

                                        1. Perform a whois query for the domain in question.

                                        2. Write down the name servers listed for the domain.

                                        3. Perform an NS nslookup against each of the name servers listed in whois and make sure they return the same list of name servers that whois listed.

                                        4. Query each name server from step 2 for the domain in question and make sure they all return the correct info. If any of the name servers return an NXDOMAIN response then you've found the culprit.

                                        Any name servers that are listed in whois but aren't returned when you query the name servers individually need to be removed from whois.

                                        Conversely, any name servers returned by your NS lookup that aren't listed in whois need to be removed from the zone as name servers.
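
                                        A minimal sketch of steps 1 through 4, with example.com standing in for the affected domain and ns1/ns2 as illustrative server names:

                                            # 1-2. Find the name servers delegated for the domain
                                            whois example.com | grep -i 'name server'
                                            # 3. Ask each listed server for its view of the NS set
                                            dig @ns1.example.com example.com NS +short
                                            dig @ns2.example.com example.com NS +short
                                            # 4. Query each server for the record itself; NXDOMAIN marks the culprit
                                            dig @ns1.example.com bob.example.com A
                                            dig @ns2.example.com bob.example.com A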

                                    • Not to put too fine a point on it, but the servers that are authoritative for the root domain (.) are technically the only "root" servers (a.root-servers.net, b.root-servers.net, etc.). The servers responsible for the gTLDs (.com, .edu, etc.) are not technically root servers, as they don't exist at the root. These servers (a.gtld-servers.net, b.gtld-servers.net, etc.) exist one level below the root servers in the DNS hierarchy. – joeqwerty Dec 4 '09 at 4:13
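
                                      You can see that split directly by asking a root server (a real one, as named above) for the .com delegation; it refers you to the gtld-servers rather than answering from its own zone data:

                                          # The root server returns an NS referral for .com, not a final answer
                                          dig @a.root-servers.net com NS +norecurse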