I’m having an issue with one of my apps not deploying into each region specified. I know a few people have reported similar issues, so I’m not sure whether this is related.
At the moment I have my autoscale configuration set to this (see the command sketch after the list):
Scale Mode: Standard
Min Count: 8
Max Count: 50
With these regions:
Region Pool:
ams
cdg
fra
lax
mia
ord
sin
syd
Backup Region:
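For reference, this is roughly how a configuration like this is set with flyctl (a sketch assuming the V1 autoscale/regions commands; my-app is a placeholder and exact syntax may vary between flyctl versions):

# enable standard autoscaling with the min/max counts above
fly autoscale standard min=8 max=50 -a my-app

# set the region pool (no backup regions configured)
fly regions set ams cdg fra lax mia ord sin syd -a my-app

# confirm the resulting autoscale settings and region pool
fly autoscale show -a my-app
fly regions list -a my-app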
However, the app seems to be re-allocating instances from some regions into the more traffic-heavy regions (I think?).
Instances
ID PROCESS VERSION REGION DESIRED STATUS HEALTH CHECKS RESTARTS CREATED
2fc3513c app 33 cdg run running 1 total, 1 passing 0 2022-08-22T10:20:28Z
bb86a6d2 app 33 cdg run running 1 total, 1 passing 0 2022-08-22T10:07:48Z
93039894 app 33 ams run running 1 total, 1 passing 0 2022-08-19T06:41:18Z
c7ac4df9 app 33 ord run running 1 total, 1 passing 0 2022-08-19T06:41:18Z
af4e30f1 app 33 ord run running 1 total, 1 passing 0 2022-08-19T06:41:18Z
66a3b4a3 app 33 fra run running 1 total, 1 passing 0 2022-08-19T06:41:18Z
7b8e9014 app 33 lax run running 1 total, 1 passing 0 2022-08-19T06:41:18Z
4de47a87 app 33 sin run running 1 total, 1 passing 0 2022-08-19T06:41:18Z
I haven’t made any adjustments to scaling and I don’t have any volumes. Is there anything I need to do in order to get this working correctly? Any help would be appreciated.
Based on the previous threads @ignoramous helpfully linked, it does seem that the Nomad-based scheduler hasn’t always spread instances evenly across configured regions as expected.
The Nomad scheduler considers several combined factors when making placement decisions, so it might be that some of the other regions were closer to their capacity limits. Or it could be some other bug in Nomad or our configuration that we haven’t tracked down yet.
We’re working on replacing the Nomad scheduler with our own Machine-based scheduler, which will eventually give us more control and working knowledge of the internals, making it easier for us to track down scaling/placement issues. Until then, you can try using --max-per-region as suggested to manually spread out instances, though that won’t be too useful if you also want to autoscale beyond current instance counts.
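For example, a fixed, evenly spread count can be approximated with something like the sketch below (my-app is a placeholder; note that scale count sets a fixed instance count rather than autoscaling):

# cap instances per region so the total spreads across the pool
# (8 instances over 8 regions = at most 1 per region)
fly scale count 8 --max-per-region=1 -a my-app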
Yeah, enabling and disabling don’t make a difference.
Abandoning autoscale would be frustrating, as I’d have to overprovision servers to allow for periods of high demand. I might just have to live with the current behaviour.
It’s great to hear that a custom scheduler is being worked on to deliver the expected behaviour. It would be good to note this in the documentation though, as autoscaling doesn’t currently work the way it’s described. I’m also now a little confused about how standard autoscaling is supposed to work, and by extension balanced autoscaling.
I think you’ve heard similar feedback before, but I would like to set the scaling behaviour so that there is a minimum of one server in each of the regions I set, with servers being added when demand increases in a particular region (which is how I originally assumed Standard autoscaling worked).
Fly Machine apps can be deployed to all regions. Since Machine app instances are created on-demand in response to user requests (in a region closest to the user), they are more cost-effective. These instances typically cold-start in 300ms. It is up to the instance’s main process to exit itself once it has serviced the request. For example, I have set up one of our Machine apps to terminate after 5 minutes of inactivity. This blog post has more hands-on details on Fly Machines: Building an In-Browser IDE the Hard Way · The Fly Blog
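If you want to experiment with Machines directly, the basic lifecycle looks roughly like this with flyctl (a sketch; the image, app name and Machine ID are placeholders, and flags may vary between flyctl versions):

# launch a Machine from an image in a chosen region
fly machine run registry.fly.io/my-machine-app:latest --region cdg --app my-machine-app

# list Machines and their current state
fly machine list --app my-machine-app

# a stopped Machine keeps its configuration and can be started again quickly
fly machine stop <machine-id> --app my-machine-app
fly machine start <machine-id> --app my-machine-app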
At the outset, the subtle behavioural differences between regular Fly apps and Fly Machine apps might confuse or frustrate you, but once that novelty wears off, things start making more sense.
I had a quick look, but from what’s described I don’t think it will suit my use case. My app is based on WebSockets, and this seems geared towards short-lived requests. It does look nice though.
I’ve been experimenting with the autoscaling and, interestingly, if I increase the min count by 1 multiple times, it eventually honors the standard auto-balancing, putting one instance in each region, and this seems to be maintained between deployments.
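That stepwise bump looks roughly like the sketch below (assuming the fly autoscale set min=/max= syntax; the counts are illustrative and my-app is a placeholder):

# raise the minimum count one step at a time and check placement after each change
fly autoscale set min=9 max=50 -a my-app
fly status -a my-app
fly autoscale set min=10 max=50 -a my-app
fly status -a my-app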
I did try --max-per-region with scale count, but it looks like if a region already has more machines than the cap, it won’t scale down to match. For instance, I had 4 machines in ams, but after I set --max-per-region to 3 there were still 4 in ams instead of 3.
Not really, it is up to the entrypoint process to keep a Machine alive for as long as it wants to (including leaving it up forever). For example, the Machines I run are set to go down after 180s of inactivity. From the logs, I see that a couple of Machines have been up for 12 hours now, presumably because there hasn’t been an idle period long enough to trigger a self-exit.