fdaa::3 DNS server returning NXDOMAIN for A queries

When I run dig @fdaa::3 a myservice.internal, I get an NXDOMAIN response. dig aaaa myservice.internal gives me an IPv6 address, as expected.

Should the DNS server be returning NOERROR (with an empty answer) instead of NXDOMAIN for A queries? My understanding from some quick reading is that a DNS server should return NXDOMAIN only if the name has no records of any type (like in this vulnerability).
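To make the distinction concrete, here is a minimal sketch of the rule as I understand it. The `ZONE` table and the `answer` function are hypothetical illustrations, not how the fdaa::3 server is actually implemented; the address is the one from my AAAA lookup.

```python
# Toy zone data: name -> {record type -> values}. Hypothetical example data.
ZONE = {
    "myservice.internal.": {"AAAA": ["fdaa:0:bff:a7b:aa2:d426:1ab:2"]},
}


def answer(name, qtype, zone=ZONE):
    """Return (rcode, records) for a query against the toy zone."""
    # Normalize to a lowercase, fully qualified name.
    key = name.lower().rstrip(".") + "."
    rrsets = zone.get(key)
    if rrsets is None:
        # No records of any type for this name: name error.
        return "NXDOMAIN", []
    # The name exists. NOERROR with an empty answer means
    # "no records of this type", not "no such name".
    return "NOERROR", list(rrsets.get(qtype, []))
```

With this rule, an A query for myservice.internal gets NOERROR with zero answers, and only a name with no records at all gets NXDOMAIN.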

I noticed this because I’m trying to set fdaa::3 as a resolver for nginx, and I think nginx is interpreting the NXDOMAIN response to its A query as “this domain doesn’t exist, give up now”, so it fails to resolve the domain even though there’s an IPv6 record available.
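Here is a toy model of what I suspect is happening. This is only my guess at the behavior, not nginx's actual resolver code, and `resolve` is a made-up function for illustration.

```python
def resolve(responses):
    """Combine per-type lookups into one address list.

    responses maps a query type to (rcode, records). This models a client
    that treats NXDOMAIN for any type as "the name does not exist".
    """
    if any(rcode == "NXDOMAIN" for rcode, _ in responses.values()):
        # A name error is taken to apply to the whole name, so the
        # records from the other query are thrown away.
        return []
    return [addr for _, records in responses.values() for addr in records]


addr = "fdaa:0:bff:a7b:aa2:d426:1ab:2"
# What I am seeing: NXDOMAIN for A hides the AAAA record.
print(resolve({"A": ("NXDOMAIN", []), "AAAA": ("NOERROR", [addr])}))
# What I would expect if the A query returned NOERROR with no answers.
print(resolve({"A": ("NOERROR", []), "AAAA": ("NOERROR", [addr])}))
```

Under that assumption, the first call yields no addresses and the second yields the IPv6 address.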

Wow this is a good catch. I think this caused me to write an excessive amount of bash.

We’ll get this changed today and see if it helps nginx domain resolution (which, honestly, I still don’t understand completely).

awesome! hopefully you’ll be able to retire that bash script too :slight_smile:

Being able to ssh in made it so much easier to debug this – this is what the DNS queries nginx was making looked like:

$ tcpdump -i any port 53
17:16:04.216161 IP6 fly-local-6pn.55356 > fdaa::3.53: 46219+ A? myservice.internal. (42)
17:16:04.216197 IP6 fly-local-6pn.55356 > fdaa::3.53: 11993+ AAAA? myservice.internal. (42)
17:16:04.216946 IP6 fdaa::3.53 > fly-local-6pn.55356: 46219 NXDomain- 0/0/0 (42)
17:16:04.217063 IP6 fly-local-6pn.43938 > fdaa::3.53: 32351+ PTR? 3.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.a.a.d.f.ip6.arpa. (90)
17:16:04.218378 IP6 fdaa::3.53 > fly-local-6pn.55356: 11993- 1/0/0 AAAA fdaa:0:bff:a7b:aa2:d426:1ab:2 (70)
17:16:04.461646 IP6 fdaa::3.53 > fly-local-6pn.43938: 32351 NXDomain 0/1/0 (154)

Huh. TIL. Nice catch! I’ve updated and tested the DNS server; it’s a high-drama ops morning here so it may be later in the day before it gets deployed fleetwide. Thanks for this!

Should be deployed now.

thank you! that fixed my nginx problems

Thank you! I’m pretty slapdash with .internal; for instance, for a few weeks I wasn’t even copying the question from the query into NXDOMAIN/NOERROR responses, and dig would grok them and Golang would not. If you spot more things like this, I can fix them quickly.

(Of course, if you use fdaa::3 as your system DNS server, I’m proxying requests to a real DNS server, and the handling of those should not be slapdash. It’s important not to break “real” DNS, but .internal is something less than real.)
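For anyone curious what echoing the question into an empty response looks like on the wire, here is a rough sketch using only the standard library. It is not the actual .internal server code; the function names are made up, and it assumes a single question with an uncompressed name.

```python
import struct

# DNS header: id, flags, qdcount, ancount, nscount, arcount (RFC 1035).
HEADER = struct.Struct("!HHHHHH")
NOERROR, NXDOMAIN = 0, 3


def encode_name(name):
    """Encode a domain name as length-prefixed labels ending in a root label."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(txid, name, qtype):
    """Build a one-question query with RD set (qtype 1 = A, 28 = AAAA)."""
    flags = 0x0100  # RD
    question = encode_name(name) + struct.pack("!HH", qtype, 1)  # class IN
    return HEADER.pack(txid, flags, 1, 0, 0, 0) + question


def question_end(msg):
    """Offset just past the first question; assumes no name compression."""
    pos = HEADER.size
    while msg[pos] != 0:
        pos += 1 + msg[pos]
    return pos + 1 + 4  # root label, then QTYPE and QCLASS


def build_empty_response(query, rcode):
    """Build an answerless response that echoes the query's question."""
    txid, qflags, _, _, _, _ = HEADER.unpack_from(query)
    # QR=1 marks a response; RD is copied from the query; rcode is the low 4 bits.
    flags = 0x8000 | (qflags & 0x0100) | (rcode & 0xF)
    question = query[HEADER.size:question_end(query)]
    return HEADER.pack(txid, flags, 1, 0, 0, 0) + question
```

The point is that qdcount stays at 1 and the question bytes are copied verbatim, so a stricter client can match the response to what it asked.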