Allowing instances to report their current load

After going through the docs, this is the biggest piece missing for me. Fly can already consume your Prometheus metrics. How about an option to say which metric is your “load counter”, reporting a value from 0 to 1000?
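To make the idea concrete, here's a rough sketch of what an instance could expose. Everything in it is an assumption for illustration: the metric name `app_load`, the `LoadGauge` class, and the `capacity` setting are hypothetical, and Fly has no option today that would read this metric as a load counter. It only uses the standard library and prints the Prometheus text exposition format for a single gauge.

```python
import threading
from contextlib import contextmanager


class LoadGauge:
    """Tracks in-flight expensive work and maps it to a 0-1000 load score."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        # Assumed: how many expensive jobs one instance can run at once.
        self._capacity = capacity
        self._active = 0
        self._lock = threading.Lock()

    @contextmanager
    def track(self):
        """Wrap only the expensive work, not cheap requests."""
        with self._lock:
            self._active += 1
        try:
            yield
        finally:
            with self._lock:
                self._active -= 1

    def value(self) -> int:
        """Current load, clamped to the 0-1000 range."""
        with self._lock:
            active = self._active
        return min(1000, active * 1000 // self._capacity)

    def render(self, name: str = "app_load") -> str:
        """Prometheus text exposition format for a single gauge."""
        return (
            f"# HELP {name} Current compute load, 0-1000.\n"
            f"# TYPE {name} gauge\n"
            f"{name} {self.value()}\n"
        )
```

The app would serve `render()` from its existing metrics endpoint and wrap its heavy code paths in `with gauge.track():`.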

The number of concurrent requests isn’t the best indicator of load. Let’s take an app of mine: diagram rendering. The first time a diagram is rendered, it’s (relatively) computationally expensive. But we can cache the result indefinitely, so I do. I even distribute the cache to other instances in the same region.

So after the first request, I can have thousands of RPS and be fine with one instance. What I really care about is how much compute is happening on an instance at a given time. Ideally this would influence which instance is picked within a region and, with a configurable saturation policy, trigger scaling up.
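Here's roughly the behavior I'm picturing on the platform side, as a sketch only. The `Instance` type, both function names, and the threshold of 800 are made up for illustration; this isn't how Fly's proxy or autoscaler works today, just one possible reading of "pick by load, scale on saturation".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    id: str
    load: int  # last reported load, 0-1000 (hypothetical metric)


def pick_instance(instances: list[Instance]) -> Instance:
    """Route to the least-loaded instance in the region."""
    if not instances:
        raise ValueError("no instances in region")
    return min(instances, key=lambda i: i.load)


def should_scale_up(instances: list[Instance], threshold: int = 800) -> bool:
    """Example saturation policy: scale up once every instance is at or above the threshold."""
    return bool(instances) and all(i.load >= threshold for i in instances)
```

With instances reporting 900 and 300, requests would go to the one at 300 and no scale-up would happen; once both report 800 or more, the policy would ask for another instance.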


Any interest here? :slight_smile: