Yep, I see the errors, including a couple from today. I’ll dig in and see what orchestration events happened at the same time.
I’m monitoring these failures in general; we have a sort of low-rumbling concern that we’re seeing more internal DNS errors recently, but the level is pretty consistent (we get a lot of DNS failures from recurring lookups for the wrong name, which dominate the metric). But yours is a smoking gun. Thanks!
Just want to chime in and say that we are observing the exact same issue and behavior. I can reproduce the exact same steps as @jamesbirtles when this happens.
I’d say that we have way more issues with the FRA region than any other.
We believe we’ve isolated this problem to a particular pair of worker hosts in our network that somehow briefly had colliding IP addresses in our WireGuard mesh.