Explain how high availability works

Suppose I want high availability and deploy 2 Machines.

Won’t the deployment of the next version of my software happen on both Machines at the same time, breaking high availability?

Generally, how does high availability work?

The idea of high availability is essentially that if you have 2 (or more) Machines, both with the same purpose (both acting as a web server or a database node or a queue worker, etc.), then if one of the Machines dies for some reason, you still have the other one there to continue doing as much of the work as it’s able.

If you have a “highly available” web server, visitors to your website can still load it when one Machine dies, because the remaining Machine is still serving requests. If your server is not highly available (only 1 Machine) and that Machine dies, your web server returns errors to all your visitors.

There are some nuances depending on the particular app concerned (e.g. in this web server example, being down 1 Machine does mean you can serve less traffic, but you’re at least still serving some), but that’s the general idea.
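To make the idea concrete, here is a minimal sketch in Python. It is not how Fly.io routes traffic; the `Machine` class and the `route` and `served_fraction` helpers are hypothetical names invented for this illustration. It only shows that requests keep being served as long as at least one Machine is healthy.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    # Hypothetical stand-in for a real Machine; only tracks health.
    name: str
    healthy: bool = True


def route(machines, request_id):
    """Return the name of a healthy Machine for this request, or None if all are down."""
    healthy = [m for m in machines if m.healthy]
    if not healthy:
        return None  # nothing left to serve the request: the visitor sees an error
    # Simple round-robin over whichever Machines are still healthy.
    return healthy[request_id % len(healthy)].name


def served_fraction(machines, n_requests=10):
    """Fraction of n_requests that reach some healthy Machine."""
    served = sum(route(machines, i) is not None for i in range(n_requests))
    return served / n_requests
```

With two Machines and one of them down, `served_fraction` is still 1.0 (all requests land on the survivor, which now carries the full load). With a single Machine that is down, it drops to 0.0.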

But what about Machine downtime when I run a deployment (such as fly deploy)? Is it automatically ensured that the two Machines are not updated simultaneously?

You can choose a deployment strategy for your app:

  • immediate: what you were thinking about in your original post - blindly update all Machines at the same time
  • rolling: update Machines one at a time, waiting for each Machine to be healthy before updating the next one
  • canary: create one new Machine that does not serve traffic, ensure it is healthy, and then proceed with the rolling strategy
  • bluegreen: create an entirely new set of Machines, and once they are all healthy, move traffic over to the new set all at the same time

(docs)
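The difference between these strategies can be sketched with a small Python model. This is a simplified illustration, not Fly.io's implementation: the function name `min_serving` is hypothetical, and it assumes that a Machine serves no traffic while it is being updated and that rolling updates exactly one Machine at a time.

```python
def min_serving(strategy, n_machines=2):
    """Lowest number of Machines serving traffic at any point during a deploy.

    Simplified model: a Machine being updated serves nothing until it is
    healthy again.
    """
    if strategy == "immediate":
        # All Machines are updated at once, so briefly none of them serve.
        return 0
    if strategy == "rolling":
        serving = n_machines
        lowest = serving
        for _ in range(n_machines):
            serving -= 1                   # take one Machine out to update it
            lowest = min(lowest, serving)
            serving += 1                   # it rejoins once it is healthy
        return lowest
    if strategy == "canary":
        # The canary Machine serves no traffic; once it is healthy,
        # the deploy proceeds exactly like rolling.
        return min_serving("rolling", n_machines)
    if strategy == "bluegreen":
        # The old set keeps serving until the whole new set is healthy.
        return n_machines
    raise ValueError(f"unknown strategy: {strategy}")
```

In this model, with 2 Machines, immediate drops to 0 serving Machines, rolling and canary never go below 1, and bluegreen stays at 2. Note that rolling with only 1 Machine also drops to 0, which is why high availability needs at least 2. On Fly.io the strategy is selected with the `--strategy` flag of `fly deploy` or in the `[deploy]` section of `fly.toml`; see the linked docs for the exact options.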
