Metrics-based Autoscaling

benbjohnson · February 26, 2024, 10:21pm

Previously, you could only autoscale your machines based on connections or requests. That’s great for applications that handle requests, but what about applications like background workers that need to scale on the number of pending work items or on queue depth?

Well, good news! We’ve released the fly-autoscaler to let you scale your machines based on any metric.

How it works

The autoscaler collects metrics from Prometheus and reconciles the queried metrics with the number of machines you wish to run. By default, it runs this reconciliation every 15 seconds.
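
To make the metrics step a bit more concrete, here is a rough Python sketch of pulling a single number out of a Prometheus instant-query response (the JSON returned by the `/api/v1/query` HTTP endpoint). This is not the autoscaler's own code. The `parse_instant_query` helper, the `SAMPLE` payload, and the `queue_depth` metric name are all made up for illustration.

```python
import json


def parse_instant_query(body: str) -> float:
    """Extract one numeric value from a Prometheus instant-query response body."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")

    data = payload["data"]
    if data["resultType"] == "scalar":
        # Scalar results are a single [timestamp, "value"] pair.
        return float(data["result"][1])
    if data["resultType"] == "vector":
        samples = data["result"]
        if len(samples) != 1:
            # Assumption: the query is written so it returns exactly one series.
            raise ValueError(f"expected exactly one series, got {len(samples)}")
        # Each sample carries its value as [timestamp, "value"].
        return float(samples[0]["value"][1])
    raise ValueError(f"unsupported result type: {data['resultType']}")


# Hypothetical response for a query that returns one queue_depth series.
SAMPLE = (
    '{"status":"success","data":{"resultType":"vector","result":'
    '[{"metric":{"__name__":"queue_depth"},"value":[1708985000.0,"42"]}]}}'
)

if __name__ == "__main__":
    print(parse_instant_query(SAMPLE))
```

Prometheus encodes sample values as strings, which is why the sketch converts with `float()` instead of reading a number directly.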

Each reconciliation pass runs these steps:
1. Collect metrics from external systems (e.g. Prometheus).
2. Compute the target number of machines based on a user-provided expression.
3. Fetch a list of all Fly Machines for your application.
4. If the target number of machines differs from the number of running machines, start or stop machines as needed.
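
The last step can be sketched roughly like this in Python. It is only an illustration of the idea, not how the autoscaler is actually implemented. The `Machine` class and `plan` function are hypothetical, and the sketch assumes machines are either "started" or "stopped".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    id: str
    state: str  # assumed to be "started" or "stopped" in this sketch


def plan(machines: list[Machine], target: int) -> tuple[list[str], list[str]]:
    """Return (ids_to_start, ids_to_stop) needed to reach the target count."""
    started = [m for m in machines if m.state == "started"]
    stopped = [m for m in machines if m.state == "stopped"]

    diff = target - len(started)
    if diff > 0:
        # Scale up: only existing stopped machines can be started,
        # so the plan is capped by how many of them exist.
        return [m.id for m in stopped[:diff]], []
    if diff < 0:
        # Scale down: stop the surplus running machines.
        return [], [m.id for m in started[:-diff]]
    return [], []


if __name__ == "__main__":
    fleet = [Machine("a", "started"), Machine("b", "stopped"), Machine("c", "stopped")]
    print(plan(fleet, 2))
```

Because this version only starts and stops existing machines, a target larger than the total machine count just starts everything that is available.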

Getting started

You can find steps for setting up the autoscaler yourself in the Usage section of the README. You can also find an example deployment of the autoscaler with some generated metrics & mock workers in the fly-autoscaler-example repository.

The autoscaler supports Expr language expressions, so you have a lot of flexibility in how you set your scaling target. If that’s not enough, the autoscaler is open source, so you can fork it and tweak it as needed.
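
As an example of the kind of rule a scaling expression might encode, here is the arithmetic written out in Python rather than Expr. The rule itself (one machine per 10 pending items, clamped to a range) and every name in it are assumptions for illustration. Check the README for the actual expression syntax and the variables available to it.

```python
import math


def target_machines(
    queue_depth: float,
    items_per_machine: int = 10,  # hypothetical: work items one machine can handle
    min_machines: int = 0,        # hypothetical lower bound
    max_machines: int = 20,       # hypothetical upper bound
) -> int:
    """Compute how many machines should be running for a given queue depth."""
    if items_per_machine <= 0:
        raise ValueError("items_per_machine must be positive")
    # Round up so a partially filled batch still gets a machine.
    wanted = math.ceil(queue_depth / items_per_machine)
    # Clamp to the allowed range.
    return max(min_machines, min(max_machines, wanted))


if __name__ == "__main__":
    for depth in (0, 1, 25, 1000):
        print(depth, target_machines(depth))
```

Rounding up means a single pending item is enough to keep one machine running, and the clamp keeps a sudden spike in the metric from requesting more machines than you have.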

Limitations

The autoscaler is new, but we’re working hard on improving it. Some areas we’re working on:

The autoscaler only starts & stops existing machines. We will be adding create/destroy soon.
Prometheus is the only metrics source currently but we’ll be adding support for Kafka, Temporal, GitHub Actions, Postgres, Redis, and more.

Please let us know if you have use cases that we don’t support! We’re actively working on improving the autoscaler and welcome feedback.

lubien · February 27, 2024, 8:30am

Added to GitHub - lubien/awesome-flyio-examples: Awesome list of tips and guides that just work ™️ hosting on Fly.io

empz · October 8, 2024, 11:55am

Any updates on the roadmap for supporting other metric sources?
I’d love to see some integration around queues/messaging platforms (BullMQ, Redis Queues?)
