fly.io site is currently inaccessible...

Our web service is down
When will it be back to normal?

1 Like

Ours is down as well

1 Like

Our IAD servers are out as well, along with the machine API.

1 Like

fly.io itself, our machines in several different regions, the cli.

this isn’t good.

1 Like

We had managed to make it through the current incident unscathed until a few minutes ago but it appears to be worsening. Many (but not all!) of our apps in iad are reporting no machines.

1 Like

Major infrastructure issue. Our API has been down for over 2 hours.

2 Likes

Yeah they say “degraded API performance” but I can’t get any of my API calls to work:

| Error: server returned a non-200 status code: 504

And cannot even load fly.io anymore.

1 Like

Has there been any other communication other than the status page updates?

1 Like

Not that I’m aware of, and I don’t think we should expect it for a hot minute. There’s lots of signal to indicate this has ballooned (see what I did there) into a widespread, likely even global, outage.

At least the status page and Discourse sites are up :smile:

#hugops

2 Likes

our apps appear to be up again, cli is working, fly.io is accessible.

2 Likes

Things seem to be getting better, though I still have no CLI access at all which is making it really hard to restore our services :frowning:

#hugops for sure, this musta gotten way bigger than they expected

1 Like

Still can’t redeploy via CI.

1 Like

Yep, we’re in the same boat. Deploys are still 504-ing (not Depot), and all attempts to roll existing instances are also 504-ing. Not out of the woods yet.

1 Like

Down for me as well as of 7:59 MST. 504’ing after waiting for the depot. Sucks as I had just gotten a solution ready to test apparently right as it went down. Always how it goes lol

1 Like

@bobbyhiddn
Have similar incidents happened before? In my view, an outage lasting several hours is a very serious incident. If such things happen frequently, we may need to seriously consider migrating away from Fly.io. I really like the convenience that Fly.io brings, but stability is always the highest priority as we operate in the financial payment industry.

1 Like

So far, no. I’ve been using it for a few months now and the convenience has been a huge value add as it lets me black box most of the deployment stream while testing. This has been my first major incident with the platform. So far, none of my products are making money, just some development ideas, so it’s not a huge deal for me, but if they were, I would be concerned.

1 Like

I am surprised they haven’t bothered to comment here, though.

1 Like

We’ve been here for just about a year and a half. For sure not the first major outage. This is however one of the longest-lasting ones that I’ve personally seen. I think I’ve experienced about 4-5 other large outages with Fly, most lasting less than an hour with only one that I can remember lasting more than 2. This is by far the worst one I’ve experienced and is causing a lot of issues on our end. Definitely feel for the team, scaling server infra from scratch like this is a massive undertaking.

1 Like

Take it as a sign that they’re all-in on trying to fix it. I’m sure they will reply once they aren’t all hands on deck :slight_smile:

1 Like

We are 30 minutes away from our school project presentation, and we are very flustered.

1 Like