Fly.io site is currently inaccessible

This is absolutely frustrating! It’s been over 2 hours, and our application is completely down. The system keeps returning 504 Gateway Timeout errors. We can’t scale machines, and deployments are entirely non-functional. The flyctl command is also completely unresponsive, as if it’s “dead.”

This situation is severely impacting our business. We urgently need assistance from the Fly team to resolve this issue immediately.

Extremely disappointed with the current state of service!

2 Likes

+1
Still facing 10+ upset VIP customers.

1 Like

I love some of the cool features of Fly.io, but this has been one too many production outages. I’ll use it for dev / low-impact apps going forward, but I gotta move my production load off of here after this.

2 Likes

Deploy/Image API calls are not working. I have 2 apps: for one, I cannot load machines in the UI; for the other, I cannot stop the machine. One app has been in the ‘Deploying’ state for a few hours. During this time, my ‘Current month so far’ bill has ticked up, albeit not by much.

I need to refresh images, and I can see the ‘logs’ API working via the CLI (Grafana isn’t working for me yet). The apps are getting traffic, but I can’t do anything about it (stop/refresh/etc.).

1 Like

Same. So upsetting and frustrating.
We’ll move on to another cloud after this…

1 Like

What’s infuriating to me is the status page calling this “degraded API performance”. How is a major outage like this, when many of our sites were completely offline, simply classified as “degraded API performance”?

2 Likes

What I don’t like: the last 2 updates on the incident have quite optimistic verbiage. The last one was maybe an hour ago, and I’m still in the same unusable state as this morning (APAC region).

1 Like

Still seeing the same issue: Error: server returned a non-200 status code: 504.

Honestly, it’s surprising that even when paying for the service, there is no way to contact support without paying at a minimum an extra $30 a month… and that only gets you help during business hours.

1 Like

At this point, it really is starting to get a bit ridiculous

4 Likes

Right there with you.

Incidents are tough, no doubt, and folks deserve space and our grace to tame fires. That said, this wall of opacity and status page marketing spin ain’t it. Just sitting on our hands over here smashing return on flyctl and spamming the refresh button waiting for any signal that things are going to change.

1 Like

I agree, I appreciate that they are likely all working on trying to get this solved, but not having any meaningful updates, and the fact that the few updates we have gotten seem to not actually be factual, makes this a really tough situation.

2 Likes

Systems issues I can tolerate; it’s the lack of communication and transparency that’s bothersome. Just hire 1 community manager to be active on this Discourse and on Discord, man…

2 Likes

Can’t build! I’m getting this: “Error: failed to fetch an image or build from source: failed to list volumes: context deadline exceeded”. Other times it 504s.

What region is everybody in? My machine is in the Illinois location, I think.

1 Like

Yeah, just about anything related to deploy/cli/api is currently down.

2 Likes

Singapore & Sydney

2 Likes

Still experiencing the issue. Our prod app has been unusable.
Does anyone have an update?

Region: NRT (Tokyo)

1 Like

Unfortunately, no real updates beyond their status page, which says they are scaling up their systems to handle the increased load (presumably from everyone trying to get their apps back online).

1 Like

It’s not just deploys that are failing - we lean heavily on FLAME and basically every call fails with a 503 Service Unavailable response because instances cannot be started. It’s been like this now for ~9 hours, basically our entire business day.

Luckily I was able to patch around the problem early on, but since then I’ve been unable to fly ssh into our instances either. Not Good.

1 Like

Yeah, seems like a LOT of stuff broke 😅 I also need to SSH into a machine atm and have been unable to for the last ~5 hours.

1 Like

Same here. Machines have been unresponsive for over 8 hours now, and I can’t deploy or scale to another region ☹️

1 Like