We started playing around with Fly to solve some scaling issues we currently have. I tried to understand the autoscaling feature. If I understood correctly, the platform should spawn additional instance(s), up to the max limit, at the latest when the hard_limit is reached?
What about down-scaling?
I triggered an upscale with fly autoscale set max=2, which spawned another instance, and then set the max back to 1.
That was ~5h ago, but the 2nd instance is still alive. Is there any configurable reaction time for up- and downscaling?
Or am I wrong here and scaling is only for moving between datacenters?
As regards the hard_limit: yes, if that number is hit, that should trigger a new VM to be created. If you look at your app metrics in the Fly dashboard, you should see a graph of concurrent requests, so a new VM would be created if it goes above the set limit.
What complicates autoscaling is that there are two modes: standard and balanced. The VM distribution then depends on another variable: how many regions your app is set to run in. You can see those with fly regions list. You can see more on this page, and if you scroll down there are various commands to set the options depending on how you want it to work.
Upscaling should happen quickly (as soon as Fly spots the increased load), but I'm not sure about downscaling. As far as I'm aware, you can't specify a time or rule for when that happens. 5 hours sounds wrong, unless the load justifies it and/or the minimum is now two.
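To rule out those two explanations (load, or a minimum that is still two), the current settings and running instances can be checked from the CLI. This is only a rough sketch: it assumes the commands are run from the app's directory so flyctl picks up the app name, and the function name show_scaling_state is made up for illustration.

```shell
# Hypothetical helper (name is illustrative): print what the autoscaler
# is configured to do and what is actually running right now.
show_scaling_state() {
  # Scale mode plus min/max counts currently in effect
  fly autoscale show
  # Regions the app is allowed to run in (primary and backup pool)
  fly regions list
  # Instances that are alive, including the region each one runs in
  fly status
}
```

Calling show_scaling_state before and after changing min should show whether the extra instance is explained by the configured counts or region pool.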
I see autoscaling is being reworked, but hopefully someone from Fly can assist.
Ah yes, this was also part of my testing, but in the end I reduced my primary regions to only fra and added some nearby regions to the backup pool:
After setting min=2, an instance was spawned in a backup region, which I understand to be part of Fly's distributed scaling. But what I don't get is why this instance is still alive (13h!) when there is absolutely no reason to A) keep another instance at all and B) keep one in a backup region.
% fly autoscale show
Scale Mode: Standard
Min Count: 1
Max Count: 5
fly autoscale standard min=2 max=5
# Wait some time
fly autoscale set min=1
# Here I expect the 2nd instance from the 1st command to be retired after
# a couple of minutes without traffic. But that never happens.
fly scale count 0 # or 1, does not matter
fly autoscale standard min=1 max=5
# Now it's correct, I only have 1 instance
So it seems that reducing min in an already existing autoscale setup does not trigger a downscale.
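The workaround from the steps above can be wrapped up roughly like this. It is just a sketch of what worked in this one test, not a documented procedure; the function name reset_autoscale and its two arguments are invented for illustration.

```shell
# Hypothetical helper (name and arguments are illustrative).
# Usage: reset_autoscale MIN MAX
reset_autoscale() {
  min="$1"
  max="$2"
  # Force the instance count down by hand, because lowering min alone
  # did not retire the extra instance in the test above.
  fly scale count "$min" &&
  # Then set the autoscale limits again in standard mode.
  fly autoscale standard "min=$min" "max=$max"
}
```

For the case above this would be called as reset_autoscale 1 5. Whether the manual fly scale count step is needed in every situation is an assumption based on this single observation.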