I’ve been using Fly for a few years (great service, guys), and in recent months I have seen that creating an app and running a machine from the same image takes significantly longer than it used to. It is more noticeable in the Singapore (SIN) region.
In the past, the first app-machine usually took at most around 40s to spin up (~600MB), and subsequent app-machines using the same image took almost half of that. This behaviour is now very random: the first app-machine can take up to 75s, and subsequent app-machines are randomly created fast or slow.
Here are some benchmarks I ran using time. Each time I changed the app name; the other parameters remained the same.
flyctl machine run registry.fly.io/IMAGE --entrypoint --vm-cpus 1 --vm-memory 2048 --debug --org MYORG --region sin -a APPNAME
# SIN
204 0.12s user 0.04s system 0% cpu 75.398 total
204 0.14s user 0.03s system 0% cpu 50.123 total
204 0.11s user 0.03s system 0% cpu 19.369 total
204 0.13s user 0.03s system 0% cpu 51.667 total
204 0.12s user 0.01s system 1% cpu 13.707 total
204 0.14s user 0.02s system 0% cpu 27.022 total
204 0.10s user 0.06s system 0% cpu 46.633 total
flyctl machine run registry.fly.io/IMAGE --entrypoint --vm-cpus 1 --vm-memory 2048 --debug --org MYORG --region iad -a APPNAME
# IAD
204 0.12s user 0.02s system 0% cpu 35.063 total
204 0.14s user 0.03s system 0% cpu 28.688 total
204 0.11s user 0.04s system 0% cpu 26.970 total
204 0.10s user 0.04s system 1% cpu 12.969 total
204 0.14s user 0.02s system 0% cpu 32.710 total
flyctl machine run registry.fly.io/IMAGE --entrypoint --vm-cpus 1 --vm-memory 2048 --debug --org MYORG --region hkg -a APPNAME
# HKG
204 0.14s user 0.02s system 0% cpu 48.602 total
204 0.14s user 0.01s system 0% cpu 15.642 total
204 0.13s user 0.02s system 0% cpu 18.649 total
204 0.13s user 0.02s system 0% cpu 16.461 total
204 0.08s user 0.06s system 0% cpu 15.256 total
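To make the numbers above easier to compare, here is a rough sketch that summarizes the wall-clock totals per region. It assumes Python 3 is available; the names `timings` and `summarize` are only illustrative, and the values are copied by hand from the output above.

```python
from statistics import median

# Wall-clock totals in seconds, copied from the `time` output above.
timings = {
    "sin": [75.398, 50.123, 19.369, 51.667, 13.707, 27.022, 46.633],
    "iad": [35.063, 28.688, 26.970, 12.969, 32.710],
    "hkg": [48.602, 15.642, 18.649, 16.461, 15.256],
}


def summarize(samples):
    """Return min, median, and max of a list of durations in seconds."""
    return {"min": min(samples), "median": median(samples), "max": max(samples)}


summary = {region: summarize(samples) for region, samples in timings.items()}

for region, stats in summary.items():
    print(f"{region.upper()}: min={stats['min']:.1f}s "
          f"median={stats['median']:.1f}s max={stats['max']:.1f}s")
```

With these samples the median is about 46.6s in SIN, against 28.7s in IAD and 16.5s in HKG. The sample sizes are small (5 to 7 runs each), so this only gives a rough picture.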
I also used the experimental zstd to compress the images further; it improved the start time, but it is still nothing like before.
Does the Singapore region have an ongoing issue?
I suspect machines are assigned to random hosts that don’t share a cache, so each new host has to fetch the layers again. Is there anything I can do to improve the creation and start time?