how do fly.io deploys work?

kurt · December 10, 2021, 2:01am

You’re probably hitting a slow service propagation issue we’ve been fighting. It will hopefully be fixed for good in a few weeks, but the tldr is we’re probably causing the downtime when you deploy an app with a single VM. You can reduce the impact of this by running fly scale count 2 or more.

When you deploy we:

Start a canary VM (if you’re using a volume, we skip this step)
Remove an old VM from service discovery
Wait 30 seconds
Stop old VM
Start new VM, add it to service discovery

fly-proxy relies on service discovery to know where to send requests. The problem is, it frequently takes 60-120s for fly-proxy to detect service discovery changes. So there’s a window when the old VM stops where fly-proxy doesn’t know about the new VM.

Running multiple VMs mitigates this by chance, it buys fly-proxy more time to detect the new VMs.

Topic		Replies	Views
fly scale vm causing downtime Questions / Help	4	335	July 12, 2022
Memory scaling on a single VM causes downtime. Intentional or a bug? Questions / Help	6	358	August 16, 2022
Microservices and deploy/releasing Questions / Help wishlist	2	1680	August 27, 2021
Instance not ready to receive requests	5	516	February 23, 2022
Deployment running for several minutes	5	304	February 4, 2023

how do fly.io deploys work?

Related topics