Our app started seeing issues last night. We use Cloudflare in front of Fly, and it’s been working without problems until now.
We’re seeing intermittent 523 (Origin Is Unreachable) errors. The error page displays, and a refresh then loads the page correctly. Unfortunately, this error is impacting a large portion of our traffic.
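In case it helps anyone reproduce or measure this, below is a rough sketch of a probe that requests a page repeatedly and tallies how often a 523 comes back. It is not something we run in production. The URL is a placeholder, and `probe`, `summarize`, and `colo_from_ray` are names made up for this example. It assumes the responses carry Cloudflare's `cf-ray` header, whose suffix is the Cloudflare data center code. A single client usually reaches only one Cloudflare location, so this would need to run from several places to compare regions.

```python
import time
import urllib.error
import urllib.request
from collections import Counter

URL = "https://example.com/"  # hypothetical placeholder; use the app's public URL


def colo_from_ray(cf_ray):
    """Return the data center suffix of a cf-ray value, or 'unknown'."""
    if not cf_ray or "-" not in cf_ray:
        return "unknown"
    return cf_ray.rsplit("-", 1)[1]


def summarize(samples):
    """Count (status, colo) pairs and compute the share of 523 responses."""
    counts = Counter(samples)
    total = sum(counts.values())
    errors = sum(n for (status, _), n in counts.items() if status == 523)
    rate = errors / total if total else 0.0
    return counts, rate


def probe(url, attempts=50, delay=0.5):
    """Request the URL repeatedly and record (status, colo) for each attempt."""
    samples = []
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                status, headers = resp.status, resp.headers
        except urllib.error.HTTPError as err:
            # 4xx/5xx responses (including 523) are raised as HTTPError.
            status, headers = err.code, err.headers
        except OSError:
            # DNS failures, timeouts, resets: no HTTP response at all.
            samples.append((0, "unknown"))
            time.sleep(delay)
            continue
        samples.append((status, colo_from_ray(headers.get("cf-ray"))))
        time.sleep(delay)
    return samples


if __name__ == "__main__":
    counts, rate = summarize(probe(URL))
    for (status, colo), n in sorted(counts.items()):
        print(f"status={status} colo={colo} count={n}")
    print(f"523 rate: {rate:.1%}")
```

A status of 0 in the output means the request failed before any HTTP response arrived, which is a different failure from a 523 served by Cloudflare.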
I inspected the Grafana metrics for Fly Edge on our app, and noticed the following anomaly:
TLS handshakes have been relatively constant across regions over the last 24 hours, except for Chicago. The first image below shows all regions (no obvious drop), while the second image shows handshakes filtered to just Chicago. Notice the large drop at 4 AM.
I see the same drop in the “Data In / Out” graph when filtering to the Chicago edge:
Is anyone else experiencing this, or does anyone have an idea of what I can do to fix the problem? I would really like to get this sorted ASAP, as we’re losing quite a bit of traffic and I’m at a loss as to what is causing the issue.