Should dead apps recover automatically?

I discovered today that one of my apps (a single instance, in FRA) has been down since yesterday morning, with its status marked as “dead” and no logs loading. Other people also appear to have problems with logs not loading, and the status page shows some issues with Consul in FRA around the time my app died, although the app doesn’t use Postgres. Restarting the app brought everything back up.

However, I assumed that Fly.io would automatically replace instances if the monitoring detects that they are unhealthy. Is this not the case? Do I need some specific settings to enable this behavior? Or was this an issue with the platform similar to this one and the app was actually supposed to recover without intervention?

Hi!

Someone who knows more specifics can correct me here, but I believe an instance is only replaced if there is an issue in the underlying host.

If an instance fails to start or exits with an error, it should get restarted (not replaced!) a few times. It’s possible a transient issue caused the app to fail too many times in a row.
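To illustrate what I mean by "restarted a few times", here is a rough Python sketch of a bounded restart loop. It is only a model of the general idea, not Fly.io's actual implementation: the `supervise` function, the `max_restarts` limit of 3, and the "dead" label are all assumptions I made up for the example.

```python
from typing import Callable


def supervise(run_once: Callable[[], int], max_restarts: int = 3) -> str:
    """Run a task, restarting it after a non-zero exit, up to max_restarts times.

    Hypothetical model: the limit and the returned labels are illustrative only.
    """
    restarts = 0
    while True:
        exit_code = run_once()
        if exit_code == 0:
            return "exited cleanly"
        if restarts >= max_restarts:
            # Restart budget used up: nothing tries again until someone intervenes.
            return "dead"
        restarts += 1


# Simulated exit codes: the outage lasts longer than the restart budget,
# so the final successful run (0) is never reached.
outage = iter([1, 1, 1, 1, 0])
print(supervise(lambda: next(outage)))
```

In this model a failure that clears up within the restart budget recovers on its own, while one that lasts longer leaves the instance stopped until it is restarted manually, which would match what you saw.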

(This is a bit of a guess without more information. It would help if you shared your fly.toml and the name of your app.)