Standby machine: what happens to the original machine when its host recovers after failover?

Hi… I’m curious about this, too, but realistically I think I’d go ahead and implement the “handle it [your]self” countermeasure regardless of what the answer is. :sweat_smile:

The standby feature is basically impossible for us ordinary users to test, which to me means that its behavior is never fully nailed down. At least, not well enough to rely on when the price of guessing or interpreting it incorrectly is data corruption.

(My understanding is that standby status was intended more for Rails-style heavyweight worker Machines, where the queue was enforced elsewhere and the only penalty for having two running simultaneously was extra cost.)

LiteFS uses Consul leases, which are quasi-convenient for this kind of thing, since the Fly.io platform includes a multi-tenant(?) freebie cluster. (It still requires some scripting on your part, which you were already prepared for.)
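For what it's worth, here is roughly what that scripting could look like. This is only a sketch using Consul's standard session and KV lock endpoints over HTTP, and it is not how LiteFS itself does it. The `PrimaryLease` class, the `consul_put` helper, and the key name are all made up for illustration, and I'm assuming you can pull a base URL and token out of whatever Consul URL your app was given.

```python
import json
import urllib.request


def consul_put(base_url, token, path, body=None):
    """PUT to Consul's HTTP API and return the decoded JSON response.

    base_url and token are assumptions: supply whatever your Consul
    cluster actually requires.
    """
    data = json.dumps(body).encode() if body is not None else b""
    req = urllib.request.Request(base_url + path, data=data, method="PUT")
    if token:
        req.add_header("X-Consul-Token", token)
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.loads(resp.read().decode())


class PrimaryLease:
    """Hypothetical helper: only the Machine holding the lease may write."""

    def __init__(self, key, put, ttl="15s"):
        self.key = key      # KV key used as the lock (illustrative name)
        self.put = put      # callable(path, body) -> decoded JSON
        self.ttl = ttl      # the session expires if not renewed in time
        self.session = None

    def try_acquire(self):
        """Return True if this Machine holds the lock after the call."""
        if self.session is None:
            created = self.put(
                "/v1/session/create",
                {"TTL": self.ttl, "Behavior": "release"},
            )
            self.session = created["ID"]
        # Consul answers true if the lock was free or is already ours.
        return bool(self.put(f"/v1/kv/{self.key}?acquire={self.session}", None))

    def renew(self):
        """Extend the session TTL; any failure is treated as a lost lease."""
        if self.session is None:
            return False
        try:
            self.put(f"/v1/session/renew/{self.session}", None)
            return True
        except Exception:
            self.session = None
            return False

    def release(self):
        """Give the lock up voluntarily, e.g. on a clean shutdown."""
        if self.session is not None:
            self.put(f"/v1/kv/{self.key}?release={self.session}", None)
```

In real use you would wire it up with something like `functools.partial(consul_put, base_url, token)` as the `put` argument, call `try_acquire()` before touching the data, and call `renew()` on a timer well inside the TTL. The lease only protects you if the app actually stops writing once `renew()` or `try_acquire()` comes back false, so a recovered original Machine would need to check it before doing anything else.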


Aside: As a general tip, it’s best to opt in to the Questions / Help category when you’re hoping that something will get a response from a Fly.io employee. That section of the forum has special status, as noted in the new sticky thread, whereas posts elsewhere are much more prone to falling through the cracks…

Aside2: I just added that category to this thread.