Metrics-based Autoscaling

Previously, you could only autoscale your machines based on connections or requests. That’s great for applications that handle requests, but what about applications like background workers that need to scale on the number of pending work items or on queue depth?

Well, good news! We’ve released the fly-autoscaler to let you scale your machines based on any metric.

How it works

The autoscaler collects metrics from Prometheus and reconciles the queried metrics with the number of machines you wish to run. By default, it runs this reconciliation every 15 seconds.

  1. Collect metrics from external systems (e.g. Prometheus).

  2. Compute the target number of machines based on a user-provided expression.

  3. Fetch a list of all Fly Machines for your application.

  4. If the target number of machines differs from the number of running machines, start or stop machines as needed.
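To make the last two steps concrete, here is a minimal sketch of the start/stop decision in Python. It is not the autoscaler's actual code: the `Machine` class, the `"started"`/`"stopped"` state strings, and the `plan_reconcile` helper are all assumptions made for illustration. It only plans which existing machines to start or stop and never creates or destroys any.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    # Hypothetical stand-in for a Fly Machine record.
    id: str
    state: str  # assumed values: "started" or "stopped"


def plan_reconcile(machines, target):
    """Return (ids_to_start, ids_to_stop) to move toward `target` running machines."""
    started = [m for m in machines if m.state == "started"]
    stopped = [m for m in machines if m.state == "stopped"]
    diff = target - len(started)
    if diff > 0:
        # Scale up: only machines that already exist can be started,
        # so the plan is capped by the number of stopped machines.
        return [m.id for m in stopped[:diff]], []
    if diff < 0:
        # Scale down: stop as many running machines as we are over target.
        return [], [m.id for m in started[:-diff]]
    # Already at target: nothing to do.
    return [], []
```

Note that if the target is higher than the total number of existing machines, this sketch simply starts every stopped machine and leaves the rest of the gap unfilled.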

Getting started

You can find steps for setting up the autoscaler yourself in the Usage section of the README. You can also find an example deployment of the autoscaler with some generated metrics & mock workers in the fly-autoscaler-example repository.

The autoscaler supports Expr language expressions so you have a lot of flexibility in how you can set your scaling target. If that’s not enough, the autoscaler is open source so you can fork it and tweak as needed.
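As a rough illustration of the kind of rule a scaling expression might encode, here is a Python sketch (not Expr syntax) that turns a queue depth into a machine count. The metric name `queue_depth`, the ratio of 10 items per machine, and the min/max bounds are all hypothetical values picked for the example, not defaults of the autoscaler.

```python
import math


def target_machine_count(queue_depth, items_per_machine=10, min_count=0, max_count=5):
    """Illustrative scaling rule: one machine per `items_per_machine` queued items."""
    # Round up so a partially full batch still gets a machine.
    wanted = math.ceil(queue_depth / items_per_machine)
    # Clamp to the assumed bounds so a metric spike cannot request unlimited machines.
    return max(min_count, min(max_count, wanted))
```

With these assumed numbers, a queue depth of 25 asks for 3 machines, and anything above 50 is capped at 5.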


The autoscaler is new, but we’re working hard on improving it. Some areas we’re focusing on:

  • The autoscaler only starts & stops existing machines. We will be adding create/destroy soon.

  • Prometheus is the only metrics source currently but we’ll be adding support for Kafka, Temporal, GitHub Actions, Postgres, Redis, and more.

Please let us know if you have use cases that we don’t support! We’re actively working on improving the autoscaler and welcome feedback.


Added to GitHub - lubien/awesome-flyio-examples: Awesome list of tips and guides that just work ™️ hosting on
