How to recover data and redeploy an app that has no host/volume redundancy

Hi Fly.io Community,

I’m facing a serious service interruption for my app muradi. Over 7 hours ago, this status message appeared:

“We are performing emergency maintenance on a host some of your app’s instances are running on. Apps may be unavailable until the maintenance is completed.”

Unfortunately, the issue is still ongoing, and the app remains completely inaccessible.

Here’s what I’ve tried so far:

  • Ran fly deploy – no change.
  • Attempted to scale/redeploy – still no resolution.
  • The app has an attached volume, which is currently tied to what seems to be a stuck machine on the affected host.

My Questions:

  • What is the current status of the maintenance?
  • Can I migrate my app and attached volume away from the affected host?
  • Is there any way Fly.io can help with restoring, detaching, or restarting the volume/machine?

This extended downtime is impacting my users, and I’m looking for a way to get the app back online as soon as possible.

Any help from the community or Fly.io staff would be greatly appreciated!

Thanks.

When I run: fly incidents hosts list
Host Issues count: 1

ID | MESSAGE | STARTED AT | LAST UPDATED
-----------------------------------------------------------------
51w9vn5mjwo3qedj | We are performing emergency maintenance on a host some of your apps instances are running on. Apps may be unavailable until the maintenance is completed. | 2025-05-20 09:02:38 +0000 UTC | 2025-05-20 09:02:38 +0000 UTC

When I try: fly deploy

I get the following error:

Error: failed to update machine configuration for e286e732a9ed28 [app]: machine 'e286e732a9ed28' requires manual intervention, it can't be automatically replaced because its volume 'vol_4qp3wn8e2nn6w7wv' is on an unreachable host

Hi… If you have only a single volume, then the following page in the docs provides some general advice and caveats:

https://fly.io/docs/apps/trouble-host-unavailable/

It’s not automatic, since there is typically a serious risk of data loss, and only you can gauge the full trade-offs around that.
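
As a rough illustration of the manual route for a single-volume app: as far as I understand it, a new volume can be created from one of the stranded volume's snapshots, because snapshots are kept off the host. The sketch below only prints the flyctl commands for review (set FLY=fly to actually run them). The app name and volume ID come from the post above; the new volume name, snapshot ID, size, and region are placeholders I made up, not values from this thread. Anything written after the chosen snapshot was taken would be lost.

```shell
#!/bin/sh
# Sketch only: prints the flyctl commands instead of running them.
# Set FLY=fly in the environment to execute them for real.
FLY="${FLY:-echo fly}"

APP="muradi"                       # app name from the original post
OLD_VOLUME="vol_4qp3wn8e2nn6w7wv"  # volume ID from the deploy error
NEW_VOLUME_NAME="data"             # assumption: should match the mount source in fly.toml
SNAPSHOT_ID="vs_REPLACE_ME"        # placeholder: pick one from the snapshot listing
SIZE_GB="1"                        # placeholder: at least the size of the old volume
REGION="REPLACE_ME"                # placeholder: region for the new volume

restore_plan() {
  # 1. Confirm which volume is stranded on the unreachable host.
  $FLY volumes list -a "$APP"
  # 2. See which snapshots exist for it, and how old they are.
  $FLY volumes snapshots list "$OLD_VOLUME"
  # 3. Create a fresh volume (on a healthy host) from the chosen snapshot.
  $FLY volumes create "$NEW_VOLUME_NAME" --snapshot-id "$SNAPSHOT_ID" \
    --size "$SIZE_GB" --region "$REGION" -a "$APP"
}

restore_plan
```

When running for real, do steps 1 and 2 first and fill in the placeholders before step 3. Attaching the restored volume to a fresh Machine is deliberately left out, since that step depends on how the app is configured; the docs page linked above should be the reference for it.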

(On the other hand, if you do have multiple volumes, in a cluster, then it’s much easier—which is why Fly.io recommends ≥2 all the time.)

Hope this helps a little!

We went with a single machine and a single volume so that we would not have to keep data updated in multiple places.

The data is very important. How long might it take for the emergency maintenance to be completed?

Alternatively, is there a way to detach the current volume from the affected machine and attach it to a new one?

These repairs can take a very long time, and may end with the volume being lost permanently. Having multiple places to write data is exactly what gives you durability on this platform, although no one doubts the inconvenience of doing so!

My overall impression is that you really want to sign up for Fly Support, if you don’t have that already. (That’s what I would do, if I had irreplaceable data here on Fly.io.)

Contact them per the following:

https://fly.io/docs/about/support/#email-support

I do wonder if Fly’s host reliability level is not where I think it should be. That said, @mayailurus is quite right; running a cluster of hosts is the only way you will get a reliable storage service.

Your comment suggests to me that you’re expecting volume writes in a cluster to have to be done manually, which is perplexing; I would have thought you would just need some software to automate that.

Of course, in addition to cluster syncing, you should also have backups. Do you have them?
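
For the backup side, a minimal sketch, assuming the volume is on a healthy host: flyctl can take an on-demand snapshot of a volume and list the snapshots that exist for it. As above, this only prints the commands unless FLY=fly is set, and the volume ID is a placeholder rather than a real one.

```shell
#!/bin/sh
# Sketch only: prints the flyctl commands instead of running them.
# Set FLY=fly in the environment to execute them for real.
FLY="${FLY:-echo fly}"

VOLUME="vol_REPLACE_ME"  # placeholder: ID of the volume to back up

backup_plan() {
  # Take a snapshot now, before any risky change or maintenance window.
  $FLY volumes snapshots create "$VOLUME"
  # Check which snapshots exist and when they were taken.
  $FLY volumes snapshots list "$VOLUME"
}

backup_plan
```

Snapshots still live on Fly.io, so a copy of the data kept somewhere else is a separate question.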
