Scaling speed

I’m trying to get a sense of how quickly autoscaling works without setting up a test environment myself. I expect to have very peaky loads (0 → thousands of requests in a few seconds). Can I expect the load balancer to observe the request limits/instance, queue the unhandled incoming requests, spin up new instances required to serve those requests, and serve those requests within a few seconds?

Sounds like the answer is that time to scale is on the order of minutes, not seconds (see “How to enable autoscaling? - #4 by eli”).

Nomad’s autoscaler is slow, yes. The Machines equivalent is much better, since it decides whether it needs more instances on every HTTP request (see “Automatically Stop and Start App V2 Fly Machines” in the Fly Docs).

So is it correct to say that with v1, it’s on the order of minutes, and with v2, it’s on the order of seconds?

Machines (apps v2) do almost exactly this:

“Can I expect the load balancer to observe the request limits/instance, queue the unhandled incoming requests, spin up new instances required to serve those requests, and serve those requests within a few seconds?”

When Machines are at their connection/request limits, the proxy will start other machines. This happens in <300ms, though your app process may take longer to boot.
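
To make the behaviour above concrete, here is a rough toy model in Python. It is only a sketch of the idea, not how the proxy is actually implemented: the `Machine` class, the `route` function, and the `hard_limit` parameter are hypothetical names made up for illustration, and starting a machine is treated as instantaneous.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    name: str
    started: bool = False
    in_flight: int = 0  # requests currently being handled


def route(machines, hard_limit):
    """Pick a machine for one incoming request (toy model).

    Returns the chosen machine, or None when every pre-created machine
    is already started and at its limit.
    """
    # Prefer a machine that is already started and still has headroom.
    for m in machines:
        if m.started and m.in_flight < hard_limit:
            m.in_flight += 1
            return m
    # Everyone is at the limit: start a stopped machine, if one exists.
    for m in machines:
        if not m.started:
            m.started = True
            m.in_flight = 1
            return m
    # No stopped machines left; nothing new is created on the fly.
    return None


# Two pre-created machines, each allowed 2 concurrent requests (made-up numbers).
pool = [Machine("m1"), Machine("m2")]
served = [route(pool, hard_limit=2) for _ in range(5)]
```

In this model the fifth request gets `None` because both machines are full and there is no third machine to start, which is the point made in the next paragraph.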

One important thing here: we don’t create new instances for you. Machines need to be created ahead of time. You control the max scale by creating as many machines as you want to have running at any given time. We charge you when they’re started. When they’re stopped, we only charge for the root image storage.
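
Since the max scale is whatever you create up front, a back-of-the-envelope sizing calculation can help. The sketch below assumes you know your expected peak number of concurrent requests and the per-machine limit you have configured; the function name and the numbers are hypothetical, not anything from Fly’s tooling.

```python
def machines_to_create(peak_concurrent_requests, per_machine_limit):
    """Smallest number of machines whose combined limit covers the peak."""
    if per_machine_limit <= 0:
        raise ValueError("per_machine_limit must be positive")
    if peak_concurrent_requests < 0:
        raise ValueError("peak_concurrent_requests must not be negative")
    # Integer ceiling division: round up so the last partial machine counts.
    return -(-peak_concurrent_requests // per_machine_limit)


# Made-up example: a 3000-request peak with 250 concurrent requests per machine.
needed = machines_to_create(3000, 250)
```

Machines beyond what is currently needed can stay stopped until the proxy starts them, so over-provisioning the count mostly costs root image storage rather than compute.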
