More flexible autoscaling with fly-autoscaler

A few weeks ago, we announced the new fly-autoscaler, which lets you automatically scale the number of machines you’re running based on Prometheus metrics. At the time, the autoscaler only scaled up, and it relied on machines shutting themselves down when they didn’t have work to do.

That works great for workloads where each machine runs only a few jobs concurrently. However, when each machine handles a large number of jobs at a time, a machine rarely runs out of work entirely, so as load drops you end up with a small number of jobs spread thinly across all your worker machines.

Automatic downscaling

We listened to feedback from some initial users and added downscaling to fly-autoscaler. Previously, you would set a FAS_STARTED_MACHINE_COUNT environment variable and the autoscaler would only scale your machines up to that count.

Now, when the autoscaler detects that you have too many machines running, it will choose one or more machines and request that they stop. This sends a SIGINT to the process on the machine, which you can handle to shut down gracefully.
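As a rough illustration of handling that SIGINT, here is a minimal Python sketch of a worker loop that finishes its current job before exiting. The names stop_requested, handle_sigint, install_handler, run_worker, and process_next_job are hypothetical and not part of fly-autoscaler; your own worker will likely look different.

```python
import signal
import threading

# Set once the machine has been asked to stop.
stop_requested = threading.Event()


def handle_sigint(signum, frame):
    """Signal handler: record the stop request and return right away."""
    stop_requested.set()


def install_handler():
    """Register the handler. Must be called from the main thread."""
    signal.signal(signal.SIGINT, handle_sigint)


def run_worker(process_next_job, idle_seconds=1.0):
    """Run jobs until a stop is requested; return how many jobs ran.

    process_next_job is a hypothetical callable that handles one unit of
    work. The job in progress always finishes before the loop exits.
    """
    completed = 0
    while not stop_requested.is_set():
        process_next_job()
        completed += 1
        # Sleep between jobs, but wake immediately if a stop arrives.
        stop_requested.wait(idle_seconds)
    return completed
```

In a real worker you would call install_handler() once at startup and then run_worker(...). Keeping the handler itself tiny and doing the cleanup in the main loop avoids running complex logic inside a signal handler.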

Flexible scaling ranges

If you are concerned about frequent machine starts and stops caused by a volatile metric, you can specify a range of machine counts by using FAS_MIN_STARTED_MACHINE_COUNT and FAS_MAX_STARTED_MACHINE_COUNT. The autoscaler will then scale up only when the machine count is less than the minimum, and scale down only when it is greater than the maximum.
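To make the range behavior concrete, here is a small Python sketch of the decision rule described above. It is an illustration only, not the autoscaler's actual implementation; the function name target_machine_count and its parameters are assumptions made for this example.

```python
def target_machine_count(current: int, min_count: int, max_count: int) -> int:
    """Return the machine count to aim for, given an allowed range.

    Hypothetical helper: scale up only when below the minimum, scale
    down only when above the maximum, and otherwise leave things alone.
    """
    if min_count > max_count:
        raise ValueError("min_count must not exceed max_count")
    if current < min_count:
        return min_count  # too few machines: scale up to the minimum
    if current > max_count:
        return max_count  # too many machines: scale down to the maximum
    return current  # inside the range: no starts or stops


# With a range of 5 to 10 machines:
print(target_machine_count(3, 5, 10))   # below the range
print(target_machine_count(7, 5, 10))   # inside the range
print(target_machine_count(12, 5, 10))  # above the range
```

Any count inside the range is left untouched, which is what absorbs small swings in a noisy metric. Setting the minimum and maximum to the same value collapses the range to a single target count.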