More control over your app restarts

TLDR: You can now control restart behavior in fly.toml, for when an app’s machines exit or crash. See the docs for details.


When a machine unexpectedly crashes, our scheduling machinery follows the machine’s restart policy to restart it without your intervention. These policies are quite similar to Docker’s container restart policies:

  • always: we’ll attempt to restart the machine no matter the exit code.
  • never: we won’t restart the machine even if it exits with a non-zero exit code.
  • on-failure: we’ll only restart the machine if it exited with a non-zero exit code (due to a failure or crash).

You can also specify the number of times we should retry restarting before giving up.

Now, this is not technically a new machines feature. Machines created with fly machines were given a default always restart policy. Machines created with fly deploy are given an on-fail policy with 10 retries. fly machines create/update also has a --restart flag that accepts a valid restart policy.

To try this out, add the restart section to your fly.toml and run fly deploy:

app = "app-name"
primary_region = "jnb"

[[restart]]
  policy = "never"
  retries = 10
  processes = ["app"]

You might have already noticed that a restart policy can be targeted to a specific process group. If a group is not specified, all machines in an app will have the same default restart policy.
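
As a rough sketch of per-group targeting, the fragment below gives two process groups different policies. The group names web and worker and their commands are hypothetical, and it assumes that several [[restart]] entries can sit side by side in one fly.toml, which the double-bracket syntax suggests but this post does not spell out:

```toml
app = "app-name"
primary_region = "jnb"

# Hypothetical process groups, for illustration only.
[processes]
  web = "node server.js"
  worker = "node worker.js"

# Restart web machines whatever the exit code.
[[restart]]
  policy = "always"
  processes = ["web"]

# Restart worker machines only after a non-zero exit, giving up after 5 attempts.
[[restart]]
  policy = "on-failure"
  retries = 5
  processes = ["worker"]
```

Listing the groups explicitly in each entry makes it easy to see which policy a given machine should end up with after fly deploy.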

Happy restarting!


Thanks for this. Two things:

  • The docs on how to set this up in fly.toml have not been updated.
  • Based on your words, I expected the policy to apply to all my processes by default:

[processes]
  zone1 = "node bin/cron zone1"
  zone2 = "node bin/cron zone2"
  zone3 = "node bin/cron zone3"

[[restart]]
  policy = "always"

But it doesn’t. It only works when I explicitly opt in for each process:

app = "teslahunt-api"
primary_region = "iad"
kill_signal = "SIGTERM"
kill_timeout = 60

[processes]
  zone1 = "node bin/cron zone1"
  zone2 = "node bin/cron zone2"
  zone3 = "node bin/cron zone3"

[[restart]]
  policy = "always"
  processes = ["zone1", "zone2", "zone3"]