Two major outages in two days, and yet no status updates?

I have two public-facing apps running on Fly.io (web apps + REST APIs for mobile apps).

Yesterday, one started going down, answering fewer and fewer calls. Then the second went down the same way, for about 30 minutes. Meanwhile, the Fly.io dashboard was just as unstable as my Apps. For a while, restarting machines from the dashboard (when I could reach the Machines page) seemed to help a bit.

All of a sudden, it all came back online. Good. However, no issues were ever mentioned on the status page or in an email. All I got was an email today about upcoming maintenance in CDG (where both of my apps run one instance each), which sounds like a crazy coincidence, but who knows.

The most concerning part is that the exact same scenario just happened again.

May I please have feedback about the two major outages happening in just two days (and affecting the Fly.io dashboard too, at least when reaching it from France!), to reassure me it won’t happen again tomorrow?

My apps run in the CDG and SJC regions. I have been using Fly.io for years and had never experienced anything like this.

PS: My App logs didn’t show any errors. There were simply far fewer logs all of a sudden, as people stopped making requests because the servers were down. The only info I could get was from client-side logs, stating “OS Error: Connection reset by peer” and associated with a port number that kept increasing with each error (e.g. 51225, then 51244).
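
A note on those increasing port numbers: they are most likely the client's own ephemeral source ports, which the OS assigns anew for each outgoing connection, rather than anything on the server side. Whether they count upward or look random depends on the platform. Here is a minimal Python sketch of that behaviour, assuming a throwaway local listener in place of the real server (the helper name `client_source_port` is made up for illustration):

```python
import socket

# Throwaway local listener standing in for the remote server (illustrative only).
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0 lets the OS choose a free port
server.listen(5)
server_addr = server.getsockname()


def client_source_port(addr):
    """Open a TCP connection and return (socket, local ephemeral port)."""
    conn = socket.create_connection(addr, timeout=5)
    return conn, conn.getsockname()[1]


# Two connections to the same server, both kept open at the same time.
conn_a, port_a = client_source_port(server_addr)
conn_b, port_b = client_source_port(server_addr)

print("server port:", server_addr[1])
print("client source ports:", port_a, port_b)

for s in (conn_a, conn_b, server):
    s.close()
```

The server port stays fixed while each client connection gets a different local port, so a changing port in a client-side error message does not by itself say much about the cause of the resets.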

I think we also experienced this.

Our instances are all in IAD and we saw two 20-minute outages, specifically for traffic coming from within the EU.

Thanks for sharing @andykent.

The timestamps are a great match, though some Apps seem to have been impacted sooner (as I said earlier, my App with fewer users was impacted 10 min sooner, but they both resolved at the same time). My App with the most users was (mostly) down from 8:06pm GMT to 8:45pm GMT on Jan 17, then from about 9:00pm GMT to about 9:24pm GMT on Jan 18.

Most of my traffic comes from the EU.

I think we have experienced it too :neutral_face:

Dear Fly Team, it would be good to hear your position on this and your plan to prevent it in the future.

Hey all, apologies for this.

I posted an update and am consolidating the threads in “Spotty service past few days? (esp. in Europe on weekends)” (#12 by bglw).
