Based on the previous threads @ignoramous helpfully linked, it does seem that the Nomad-based scheduler hasn’t always spread instances evenly across configured regions as expected.
The Nomad scheduler considers several combined factors when making placement decisions, so it might be that some of the other regions were closer to their capacity limits. Or it could be some other bug in Nomad or our configuration that we haven’t tracked down yet.
We’re working on replacing the Nomad scheduler with our own Machine-based scheduler, which will eventually give us more control and working knowledge of the internals, making it easier for us to track down scaling/placement issues. Until then, you can try using --max-per-region as suggested to manually spread out instances, though that won’t be too useful if you also want to autoscale beyond current instance counts.
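For reference, the workaround mentioned above might look something like this (the total count, per-region cap, and app name `my-app` are placeholders, not values from this thread):

```shell
# Spread 6 instances across the configured regions,
# capping any single region at 2 instances.
fly scale count 6 --max-per-region=2 --app my-app
```

Note that, as described above, this pins counts manually and so doesn't combine well with autoscaling beyond those counts.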
It’s great to hear that a custom scheduler is being worked on to deliver the expected behaviour. It would be good to note this in the documentation, though, since the current behaviour doesn’t match what’s described there. I’m also now a little confused about how Standard autoscaling is supposed to work, and by extension Balanced autoscaling.
I think you’ve heard similar feedback before, but I would like to set the scaling behaviour so that there is a minimum of one server in each of the regions I set, with servers being added when demand increases in a particular region (which is how I assumed Standard autoscaling worked originally).
Fly Machine apps can be deployed to all regions. Since Machine app instances are created on-demand in response to user requests (in a region closest to the user), they are more cost-effective. These instances typically cold-start in 300ms. It is up to the instance’s main process to exit once it has serviced the request. For example, I have set up one of our Machine apps to terminate after 5m of inactivity. This blog post has more hands-on details on Fly Machines: Building an In-Browser IDE the Hard Way · Fly
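As an illustration of the self-exit pattern described above, here is a minimal sketch (not Fly's implementation; the port, timeout, and handler are placeholder assumptions) of a main process that serves requests and shuts itself down after a stretch of inactivity:

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

IDLE_TIMEOUT = 300  # seconds of inactivity before self-exit (5m, as in the post)
last_request = time.monotonic()

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        global last_request
        last_request = time.monotonic()  # reset the idle clock on every request
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok\n")

def idle_seconds(now, last):
    """Seconds elapsed since the last request was served."""
    return now - last

def watchdog(server):
    # Poll periodically; stop the server once it has been idle long enough.
    while True:
        time.sleep(5)
        if idle_seconds(time.monotonic(), last_request) >= IDLE_TIMEOUT:
            server.shutdown()  # serve_forever() returns, main exits, Machine stops
            return

if __name__ == "__main__":
    server = HTTPServer(("0.0.0.0", 8080), Handler)
    threading.Thread(target=watchdog, args=(server,), daemon=True).start()
    server.serve_forever()
```

When the main process exits, the Machine stops and a new request to the region cold-starts it again.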
At the outset, the subtle behavioural differences between regular Fly apps and Fly Machine apps might confuse or frustrate you, but once that novelty wears off, things start making more sense.
I’ve been experimenting with the autoscaling and, interestingly, if I increase the min scale by 1 multiple times, it honors the standard auto-balancing, eventually putting one instance in each region, and this seems to be maintained between deployments.
I did try the --max-per-region flag with count, but it looks like if there are already more machines in a region than the cap, it won’t scale down to match. For instance, I had 4 machines in ams, but once I set --max-per-region to 3, there were still 4 in ams instead of 3.
Not really, it is up to the entrypoint process to keep a Machine alive for as long as it wants to (including leaving it up forever). For example, the Machines I run are set to go down after 180s of inactivity. From the logs, I see that a couple of Machines have been up for 12 hours now, presumably because there hasn’t been a period of inactivity long enough to trigger a self-exit.