Healthchecks and private networks

Oh, great question. This is something we’ve been thinking about, but we don’t have a great answer. We do have a workaround, though: a script check. What I’d probably do is put a script check on the service for your public port, like this:

  [[services.script_checks]]
    interval = 10000
    timeout = 1000
    command = "/fly/check-nodes.sh"
    restart_limit = 0

That command could just be a curl, or a more sophisticated script.
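
As a rough sketch of the "more sophisticated" end, here is one way /fly/check-nodes.sh might look. Everything in it is an assumption for illustration: the NODE_URLS and CURL_MAX_TIME environment variables are made up, and your nodes may expose health differently. The only contract relied on is that the script exits 0 when healthy and non-zero otherwise.

```shell
#!/bin/sh
# Hypothetical /fly/check-nodes.sh: probe each node and fail if any is down.
# NODE_URLS (assumed name) is a space-separated list of health URLs on the
# private network. If it is empty, there is nothing to probe and the check passes.
NODE_URLS="${NODE_URLS:-}"
# Per-request cap in seconds (assumed name); keep the total under the check's timeout.
CURL_MAX_TIME="${CURL_MAX_TIME:-1}"

check_nodes() {
  for url in $NODE_URLS; do
    # --fail makes curl exit non-zero on HTTP errors (status >= 400).
    if ! curl --fail --silent --show-error \
         --max-time "$CURL_MAX_TIME" --output /dev/null "$url"; then
      echo "unhealthy: $url" >&2
      return 1
    fi
  done
  return 0
}

check_nodes
```

The URLs are probed one after another, so with several nodes the worst case is roughly the number of nodes times CURL_MAX_TIME. If that doesn't fit inside the check timeout, lower the cap or probe fewer nodes per check.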

Instances restart if there are 5 failures within 25 seconds. You can control the count with restart_limit. If you set it to 3, for example, it’ll restart after 3 failures within 15 seconds.

