My understanding is that apps v2 backed by Machines doesn’t support autoscaling yet. Is there a way to implement my own scale up and scale down to zero based on the requests coming into the app’s load balancer in the meantime?
The new Machines still export metrics to the fly-managed Prometheus instance. Perhaps you could poll that and trigger scaling changes based on what it reports for the app?
Even if you did that, the proxy still has to load balance traffic. I am not sure whether Fly’s proxy is a capable load balancer for Machines; in our case it has been erratic since the beginning, and that hasn’t changed.
One of the more annoying issues we see: Machines are started up to serve just one request, taken down by our code after a predefined timeout, then immediately started up again and sent a single request once more! This has cost implications, but I’ve heard nothing from Fly despite complaining about it over email and on the forums. Over time I expect things to improve, as most of this is in preview.
Also, if you start up more than one Machine of a single app in a region, I am not sure what kind of load balancing to expect. It isn’t documented anywhere (that I know of).
Right now, the proxy is only built to handle the 0-to-1 scaling case. It’s not designed to autoscale across multiple Machines. It kinda works, but that’s an unsupported use case at the moment.
When a request or connection comes in, the proxy runs through its load balancing logic, ensures the Machine is started, then forwards the user on.
Scaling down is just a matter of exiting, though. If you can teach your process to exit when it’s likely to be idle, you’ll get “scale to zero”. But again, it’s not designed to work with multiple machines.
Is it possible to run our own reverse proxy, like Nginx, to handle autoscaling on each of the Fly apps?
The way I’m thinking of implementing this, just for scale-down (will tackle scale-up later), is:
- query Prometheus metrics for response count every 5 minutes to get apps which are actively receiving requests (however, this isn’t foolproof — I think it only works if the Machines are running HTTP servers)
- for “active” apps where the response count > 0, do not scale down. For all other apps, if the number of running Machines > 0, stop them
This will not work for some of our users, who will be using Machines to serve WebSockets instead of plain HTTP — I don’t think Fly emits any metrics we could use to build a hacky autoscaler for those.
I think for scale down, you can probably just do it from within the Machine. Just wire your app up to detect “idle” and exit with status code 0.
We have a tiny go proxy we use for demos that does this here: GitHub - superfly/tired-proxy: An http proxy that's just too tired and eventually shuts down
Yeah! Exiting the process from the application layer works well when I have access to the application logic and can detect when it is idle.
However, I also deploy some of my users’ containers and I don’t have access to their source code, which makes this a bit trickier.
Oh got it! An outside supervisor sounds reasonable. You could try injecting your own code into their Docker image. That tired-proxy project runs as a supervisor: you start it, pass your custom command in, and it exits when it’s idle.
I had considered using tired-proxy, but it seems to handle just HTTP requests. If a user is opening up, say, a WebSocket, or is running a gRPC or GraphQL server, tired-proxy would not work, right?
I am a little bit confused. I thought v2 supports autoscaling? as described in this link: Apps v2 Autoscaling. Did I miss something?
While you can automatically start and stop Machines for a V2 app, it’s not technically “autoscaling”, because we don’t create and destroy Machines. Auto start and stop will stop and start existing Machines based on traffic/load. You can learn all about this feature here: