It's been 38hs and my instance is still experiencing an outage

santypk4 · September 26, 2023, 1:35pm

According to my status dashboard, the app has been in an outage for more than 38 hours.

When will this issue be resolved?

Thank you.

gilbertc · September 26, 2023, 2:31pm

Same here, this is just awful. No outreach at all from the Fly.io support team, and no timeline for how long the fix will take.

nbutler · September 26, 2023, 3:14pm

Same here. It’s really annoying how vague the status interruption message is.

gilbertc · September 26, 2023, 4:00pm

I have a solution if you can do it with your setup.

## Scale down your machine count to 0
fly scale count 0

## Wait for the machines to be destroyed in the Fly.io account portal

## Scale your machine count back up
fly scale count 3
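
For anyone who wants to script the steps above, here is a minimal sketch, not an official procedure. The function names and the app name `my-app` are made up for illustration. The `--app` and `--yes` flags and `fly machine list` are what I believe current flyctl accepts; check `fly scale count --help` on your version before relying on them.

```shell
# Sketch of the scale-to-zero workaround. Function names are illustrative only.

# Step 1: remove all machines, then list what is left so you can confirm
# they are really gone (a machine on a broken host may linger for a while).
scale_down() {
  app="$1"
  fly scale count 0 --app "$app" --yes
  fly machine list --app "$app"
}

# Step 2: once no machines are listed, scale back up.
# The new machines may be placed on a different host.
scale_up() {
  app="$1"
  count="$2"
  fly scale count "$count" --app "$app" --yes
}
```

Run `scale_down my-app`, wait until the list (or the dashboard) shows no machines, then run `scale_up my-app 3`. The wait is left manual on purpose, since that is the step that can hang when the host is down.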

Yaeger · September 26, 2023, 7:00pm

Yeah, the “trick” is just to get that machine off that host, which you can do by e.g. scaling to zero and then back up, or by changing the region.

I don’t understand why these machines don’t get moved automatically, though, or why we don’t get a proper notification for these issues plus guidelines for fixing them.

xeger · September 27, 2023, 2:29am

I’m experiencing the same thing; scaling down and then up seems to have mixed success.

Sam-Fly · September 27, 2023, 3:39am

Hey @santypk4 , as of a few hours ago this host looks like it should be back up, can you test connecting to your app again?

Hosts can experience issues and require maintenance for a number of reasons, and some fixes take longer than others. For production apps (or any apps that require high availability) we strongly recommend running multiple instances to protect against single-host outages like this one.

For apps without volumes, you can still add new machines when a host is down. By using the scale to 0 or region change methods mentioned above you can bring up a new machine on a different host.
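
As a rough sketch of the region-change method for an app without volumes: the function name, the app name, and the region code in the usage note are placeholders, and the `--region` flag on `fly scale count` is my understanding of current flyctl, so verify it with `fly scale count --help`.

```shell
# Sketch of the region-change variant. Names are illustrative only.
# Usage: add_in_region APP REGION COUNT
add_in_region() {
  app="$1"
  region="$2"
  count="$3"
  # Request COUNT machines in REGION. A machine created in another region
  # cannot be placed on the failed host in the original region.
  fly scale count "$count" --region "$region" --app "$app" --yes
}
```

For example, `add_in_region my-app ams 1` asks for one machine in `ams`. The stuck machine in the original region still has to be cleaned up separately once its host is reachable again.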

For apps with volumes, like Postgres apps, you will need to run multiple machines if you need high availability as volumes are pinned to the specific host.
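
One way to add a second machine to an app with volumes is sketched below, under the assumption that `fly machine clone` provisions a fresh volume for the clone (my understanding of flyctl, not confirmed in this thread). The function name and all arguments are placeholders. A new volume is not a copy of the old data, so replication remains the application's job.

```shell
# Sketch: add a second machine with its own volume. Names are illustrative only.
# Usage: clone_with_volume APP MACHINE_ID REGION
clone_with_volume() {
  app="$1"
  machine_id="$2"
  region="$3"
  # Show existing volumes first, so you can see which host/region they live in.
  fly volumes list --app "$app"
  # Clone the machine; assumed to create a new volume for the clone.
  fly machine clone "$machine_id" --region "$region" --app "$app"
}
```

This can only be done while the source machine's host is healthy, which is why it needs to be set up before an outage rather than during one.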

We’re working on adding better documentation for troubleshooting options in these cases.

gilbertc · September 27, 2023, 9:35pm

Hey @Sam-Fly, so let me get this straight. You want us to pay for more services, which gives you more money, when you constantly bork things on your side and provide absolutely no support to address the issue? That doesn’t seem like the greatest deal in the world to me. I was thinking about moving my apps over to Fly because the idea behind it is so ingenious. But the disaster recovery is far below the standard I have seen from any PaaS. I have read through the documentation on this forum and it appears Fly is not production ready yet. I hope you guys can come to terms with these problems and fix them in the future, but as far as I’m concerned it would not be worth even keeping this as a test environment. We are still spending money on the basic hobby plan, and you’re down with no help for over a day, with no way to contact you besides the forum (where you wait until others have solved the problem). Shame on Fly.

system · October 4, 2023, 9:36pm

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.
