I realised today that an app I had in the MAD region wasn’t working.
I logged in and found this notice in the console, dated 17 days ago:
Some of your apps in MAD region are on a host has suffered irreparable hardware damage. Please migrate your Fly Machines to other hosts and restore volumes from backups. You are not being charged for this resources on this host.
Judging from other topics (here) about a similar incident in the past, it seems that my data is gone. I was under the impression that snapshots were enough, but that does not seem to be the case.
We view Fly Volumes as persistent storage, but not durable long-term storage; as we say in the docs:
Create and store backups: If you only have a single copy of your data on a single volume, and that drive fails, then the data is lost. Fly.io takes daily snapshots and retains them for 5 days, but the snapshots shouldn’t be your primary backup method.
But if that comes as a surprise to you, then we didn’t succeed in setting your expectations about the platform correctly, and that’s something we should be better about.
This is disappointing but ultimately my fault. However, I’m surprised that this was not communicated via email in any form. That seemed to be a major complaint around the time of the previous incident, and the notifications were apparently updated since.
But… I don’t think it worked? Did I miss something?
Maybe if I had been notified in time, I’d have been able to use one of the automatic snapshots in that 5-day window and not lose the data?
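To put rough numbers on that, here is a minimal sketch of the timing. It assumes the last usable snapshot was taken on the day the host failed and expired exactly 5 days later; that is my reading of the docs quoted above, not something Fly.io has confirmed. The dates and function names are made up for illustration; only the 17-day gap and the 5-day retention come from this thread.

```python
from datetime import date, timedelta

RETENTION_DAYS = 5  # retention period quoted from the docs above


def restore_deadline(failure_day: date, retention_days: int = RETENTION_DAYS) -> date:
    """Last day a snapshot taken on failure_day would still exist (assumption)."""
    return failure_day + timedelta(days=retention_days)


def could_still_restore(failure_day: date, noticed_day: date,
                        retention_days: int = RETENTION_DAYS) -> bool:
    """True if the failure was noticed before the last snapshot expired."""
    return noticed_day <= restore_deadline(failure_day, retention_days)


# Hypothetical calendar dates; only the 17-day gap is from my situation.
noticed = date(2024, 1, 18)
failure = noticed - timedelta(days=17)

print(could_still_restore(failure, noticed))                      # False
print(could_still_restore(failure, failure + timedelta(days=3)))  # True
```

Under those assumptions, an email within the first few days would have left time to restore; noticing after 17 days does not.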
Additionally, I can’t seem to scale down/up, so I can’t migrate the machine somewhere else. Is my only option to fully delete the app and volumes and recreate everything from scratch? The attempt fails with:
Error: failed to grab app config from existing machines, error: could not create a fly.toml from any machines