Currently the autoscaler is based on the number of connections.
Do you have any plans to support alternative means of autoscaling? It would be awesome if there were options to scale horizontally (more instances) and vertically (more CPU/MEM).
We vaguely plan to bake this in. The current autoscaler is actually not what most people want, so we’re starting by tackling that. What they usually want is for machines to start on demand when there are a certain number of connections.
Scaling with different strategies is something I think people might end up building with the machines API. You can run a machine that wakes up every so often, and then starts other machines for you. Or run one that continuously polls metrics and starts and stops other things.
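Something like this sketch, for instance: a tiny reconciler that maps a backlog metric to a desired worker count and works out which machines to start or stop. All names, pool sizes, and thresholds here are made up for illustration; the actual metric query and the Machines API calls are left as stubs.

```python
import math

# Hypothetical sizing rule: one worker per 50 queued items, clamped to a range.
def desired_workers(backlog, per_worker=50, lo=1, hi=5):
    return max(lo, min(hi, math.ceil(backlog / per_worker)))

def reconcile(backlog, running):
    """Return (to_start, to_stop) machine IDs, given the current backlog
    and the list of machine IDs currently running (the rest are stopped)."""
    pool = [f"worker-{i}" for i in range(5)]  # all machines this app owns
    want = desired_workers(backlog)
    running = sorted(running)
    to_start = [m for m in pool if m not in running][: max(0, want - len(running))]
    to_stop = running[want:]  # stop any surplus
    return to_start, to_stop
```

The polling machine would call `reconcile` on a timer and translate `to_start`/`to_stop` into Machines API start/stop requests.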
I experimented with a Machine that didn’t run a server, just a process (i.e., no [[services]]). I don’t remember being able to start it with the fly m start command; it was torn down every time. But if I understand you right, this use case is supported (even though it doesn’t / didn’t work…)?
Autoscale, believe it or not, is/was one of Fly’s most interesting features, yet it sadly isn’t getting the eng love it deserves.
In fact, the Fly proxy needs to observe and react in time to an app’s (or a process’s, or a machine’s) rising and falling request/connection counts relative to soft_limit and hard_limit, which in my experience it doesn’t quite do. I imagine it’s a hard problem (and relates to autoscaling in some sense), no doubt… but one that needs solving.
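For anyone following along, those limits are set per service in fly.toml under the concurrency section; the values below are just illustrative:

```toml
[[services]]
  internal_port = 8080
  protocol = "tcp"

  [services.concurrency]
    type = "connections"  # or "requests"
    soft_limit = 20       # proxy prefers other instances above this
    hard_limit = 25       # proxy stops sending traffic above this
```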
I was about to post a question regarding this. When we’re dealing with worker apps that have no public connection to the outside world, the requirements for scaling them have nothing to do with the number of connections or requests.
A stream-processing worker, for instance, might need to scale up if the number of unprocessed messages in the stream suddenly increases. It seems the only way to get close to this today is to write a worker app that periodically checks these conditions and uses the Machines API to scale up or down, though that seems like a lot of trouble.
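If anyone does go that route, one detail worth getting right is hysteresis: scale up at a much higher backlog than you scale down at, so the loop doesn’t flap. A sketch with made-up thresholds:

```python
SCALE_UP_AT = 1000   # queued messages before adding a worker
SCALE_DOWN_AT = 100  # queued messages before removing one

def decide(backlog, workers, min_workers=1, max_workers=5):
    """Return 'up', 'down', or 'hold'. The gap between the two
    thresholds keeps the loop from oscillating on a noisy metric."""
    if backlog > SCALE_UP_AT and workers < max_workers:
        return "up"
    if backlog < SCALE_DOWN_AT and workers > min_workers:
        return "down"
    return "hold"
```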
Since custom metrics are already baked in, I think that it would make sense to use them. Maybe being able to define thresholds based on custom metrics.
Yes, we hear you on this, I think the use case you described is very clear. Autoscaling based on prometheus metrics is something we’d like to eventually support in the platform.
Yeah, that’s the current do-it-yourself approach that works today. It’s not ideal, but if you want to go this direction here’s a pointer to a Bash-script example of this that could help you get started:
I was actually thinking that since I already have a worker app running with access to these metrics (since it produces them), I might try calling the Machines API directly: the app scaling itself.
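For what it’s worth, starting a machine through the public Machines API is just an authenticated POST, so the in-app call can be tiny. A stdlib-only sketch; the app name, machine ID, and token are placeholders:

```python
import urllib.request

API = "https://api.machines.dev/v1"  # public Machines API base URL

def start_machine_request(app, machine_id, token):
    """Build the start-machine request (POST with bearer auth)."""
    return urllib.request.Request(
        f"{API}/apps/{app}/machines/{machine_id}/start",
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )

# Actually sending it needs a real app, machine ID, and FLY_API_TOKEN:
# urllib.request.urlopen(start_machine_request("my-app", "148ed...", token))
```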
It’s not ideal, for sure, but until custom metrics are supported, it might be an option.