Service unreachable only to some clients

Earlier today one of our websites was completely unreachable from the browser. In a panic we restarted the instance and the service became available again, but looking back, a couple of strange things were going on:

  1. The fly logs showed nothing abnormal. There had been no log activity for about 17 hours (expected, since this instance only serves www content). As far as Fly was concerned, the instance was running fine at that time.
  2. We have internal health checks as well as global checks from updown.io, and none of them were triggered; all showed perfect uptime.

Unfortunately we don’t have any evidence to present other than a Safari screenshot showing a webpage not loading, but is there anything we could have done wrong to cause this? I won’t pretend I fully understand how Anycast works, so I don’t know how to assess the possibilities there, or whether it’s the fault of my ISP.

And no, it wasn’t just that the internet was down; other websites were loading just fine.

Let me know what details I can provide to help diagnose this.

I can think of 2 things:

  1. HTTP/3. Do not enable it if you’re running multiple web-server instances on Fly. Fly’s UDP proxy isn’t QUIC-aware (see: Regional/Node UDP ranges to alleviate IPv4 limits - #4 by ignoramous).
  2. Dual stack. Was the client using IPv6? Not sure if Fly has fixed all its IPv6 woes (see: Unable to access fly.io web and API over IPv6).
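
To test the dual-stack theory the next time this happens, something like the following could be run from the affected client. It is only a rough sketch: the function names (group_by_family, can_connect, check) are made up for illustration, it assumes the site is served over TCP port 443, and it only tests plain TCP reachability per address family. It says nothing about HTTP/3 or about anything Fly-specific.

```python
import socket


def group_by_family(addrinfos):
    """Split socket.getaddrinfo() results into de-duplicated IPv4/IPv6 lists."""
    groups = {"IPv4": [], "IPv6": []}
    for family, _type, _proto, _canon, sockaddr in addrinfos:
        if family == socket.AF_INET:
            label = "IPv4"
        elif family == socket.AF_INET6:
            label = "IPv6"
        else:
            continue  # ignore any other address family
        ip = sockaddr[0]
        if ip not in groups[label]:
            groups[label].append(ip)
    return groups


def can_connect(ip, port=443, timeout=5.0):
    """Return True if a TCP connection to (ip, port) succeeds within timeout."""
    family = socket.AF_INET6 if ":" in ip else socket.AF_INET
    try:
        with socket.socket(family, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            sock.connect((ip, port))
        return True
    except OSError:
        # Covers timeouts, refused connections, and "no route to host",
        # as well as hosts that have no IPv6 support at all.
        return False


def check(host, port=443):
    """Resolve host and report TCP reachability for each resolved address."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return {
        label: {ip: can_connect(ip, port) for ip in ips}
        for label, ips in group_by_family(infos).items()
    }
```

Calling check("your-site.example") (a placeholder hostname) returns a dict keyed by "IPv4" and "IPv6". If every IPv6 address fails while the IPv4 ones succeed, or the other way around, that would point at one address family rather than at the instance itself, which would also fit with health checks staying green.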

Thanks for your recommendations!