As for hard_limit: yes, hitting that number should trigger the creation of a new VM. If you look at your app's metrics in the Fly dashboard, you should see a graph of concurrent requests; a new VM would be created when that graph goes above the limit you set.
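For reference, here is a rough sketch of where that limit usually lives in fly.toml. The port and the numbers are only illustrative assumptions, not values from your app, so check them against your own config and the current docs before relying on them:

```toml
# Illustrative fragment only; the port and limits below are assumed values.
[[services]]
  internal_port = 8080   # assumed: whatever port your app listens on
  protocol = "tcp"

  [services.concurrency]
    type = "connections"  # what gets counted per VM
    soft_limit = 20       # assumed value; the lower threshold
    hard_limit = 25       # assumed value; the per-VM ceiling discussed above
```

If the concurrency graph in the dashboard never gets near the hard_limit you have set, I wouldn't expect load alone to explain any extra VMs.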
What complicates autoscaling is that there are two modes, standard and balanced, and the VM distribution also depends on another variable: how many regions your app is set to run in. You can list those by running fly regions list. There is more detail on this page, and if you scroll down there are various commands for setting the options, depending on how you want it to work.
Upscaling should happen quickly (as soon as Fly spots the increased load), but I'm not sure about downscaling. As far as I'm aware, you can't specify a time or rule for when it happens. 5 hours sounds wrong, unless the load justifies it and/or the minimum is now two.
I see autoscaling is being reworked but hopefully someone from Fly can assist: