cannot unmarshal DNS message

Just today I found this error in the logs:

Get "https://api.mps.ford.com/api/users/vehicles": dial tcp: lookup api.mps.ford.com on [fdaa::3]:53: cannot unmarshal DNS message

The unmarshaling happens inside the Go standard library, and the same request works locally. I haven’t applied any specific DNS server settings. I could probably force the app to use 8.8.8.8 or something, but it seems this should be fixed or investigated on Fly’s side?

Here’s a little more detail:

app[3fbb140b] fra [info] Preparing to run: `dns-test` as root
app[3fbb140b] fra [info] 2021/10/18 11:52:06 listening on [fdaa:0:30b1:a7b:23c3:3fbb:140b:2]:22 (DNS: [fdaa::3]:53)
app[3fbb140b] fra [info] lookup api.mps.ford.com on [fdaa::3]:53: no such host
app[3fbb140b] fra [info] resolver [65.52.150.178]

The default [fdaa::3] DNS resolver cannot resolve api.mps.ford.com, while 1.1.1.1 can, and so can my local DNS.

As a workaround I could switch Go’s default DNS resolver, but I’d prefer not to if I don’t have to.
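For reference, a minimal sketch of what that workaround could look like, assuming the standard library's `net.Resolver` with a custom `Dial`. The helper name `newResolver` and the choice of 1.1.1.1:53 are only illustrative, not something from the app itself:

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// newResolver (hypothetical helper) returns a resolver that ignores the
// system-configured DNS server and sends every query to server (host:port).
func newResolver(server string) *net.Resolver {
	return &net.Resolver{
		// PreferGo selects Go's built-in resolver, which is the one that honors Dial.
		PreferGo: true,
		Dial: func(ctx context.Context, network, _ string) (net.Conn, error) {
			d := net.Dialer{Timeout: 2 * time.Second}
			// Keep the requested network (udp or tcp) but swap in our server address.
			return d.DialContext(ctx, network, server)
		},
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	addrs, err := newResolver("1.1.1.1:53").LookupHost(ctx, "api.mps.ford.com")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println(addrs)
}
```

To apply this to outgoing HTTP requests, the resolver could be assigned to `net.DefaultResolver` or set as the `Resolver` field of the `net.Dialer` used by the HTTP transport.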

Trying to reproduce this now, will report!

Ok, I’ve reproduced this with net.LookupHost on my own instance with your hostname. I can un-reproduce it by switching to an older version of our DNS server, so I should be able to track down the issue quickly.

I haven’t so much tracked down the bug as tracked down the vicinity of the bug: in the new server, when we forward requests, we round-trip them through miekg/dns (because the code path handling the request is already dealing with miekg/dns.Request objects), meaning we’re parsing and then re-marshaling the responses from our recurser.

I’ve changed the code so that we’re just forwarding raw requests now, without parsing and re-marshaling, and my reproduction of this problem now seems to work fine. By the time you read this, it should be “fixed” for your app; let me know if it isn’t!

Thanks for catching this.

Working again, thank you. I’d suspect that the re-marshaling is broken then. I had a quick look at miekg/dns but it’s beyond a quick test to see what might actually be wrong.

Thank you!

Seeing a new flavor of DNS error just now:

dial tcp: lookup xxx on [fdaa::3]:53: read udp [fdaa:0:30b1:a7b:23c5:0:848c:2]:39014->[fdaa::3]:53: read: connection refused (MSE21)