Scale to zero possible?

Is it possible to scale Fly apps to zero? What’s the startup time then?
Can you have a multi-region scale to zero?
That is, configure an app for multiple regions but scale to zero if there’s no traffic.
If scale to zero is available, how long will an app wait until it scales down?

From previous forum posts, scale to zero is not possible but you can manually pause the app (which causes it to have zero containers running) and you can manually resume the app.

As for multi-region, you can scale down to 1 container in total across all regions, then have balanced scaling which theoretically should scale up from zero in regions that have usage, but you always need at least 1 region with 1 container running.

Correct! You can manually scale to 0, but we won’t autoscale down that low. We have some ideas for this, though, and want to make sure idle apps cost as close to $0 as possible.

Maybe a “-suspendable” flag (or something like “suspendable: true” in the toml)?

Something like that would be handy for experimental/staging apps, which could then be suspended automatically after, say, 10 minutes of inactivity instead of having to be suspended manually.
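
To make the "suspend after 10 minutes of inactivity" idea concrete, here is a minimal sketch of the decision only. The names `IDLE_TIMEOUT` and `should_suspend` are made up for illustration and are not part of any Fly API; the actual suspend call is left out.

```python
from datetime import datetime, timedelta

# Hypothetical threshold, taken from the "10 minutes of inactivity" example.
IDLE_TIMEOUT = timedelta(minutes=10)


def should_suspend(last_request_at: datetime, now: datetime,
                   idle_timeout: timedelta = IDLE_TIMEOUT) -> bool:
    """Return True once the app has seen no traffic for at least idle_timeout."""
    return now - last_request_at >= idle_timeout


if __name__ == "__main__":
    last_seen = datetime(2021, 1, 1, 12, 0, 0)
    # 5 minutes idle: keep running.
    print(should_suspend(last_seen, datetime(2021, 1, 1, 12, 5, 0)))
    # 10 minutes idle: eligible for suspension.
    print(should_suspend(last_seen, datetime(2021, 1, 1, 12, 10, 0)))
```

Something has to record the time of the last request (a proxy or the app itself) and then trigger the suspend; this only shows the timeout check.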

The quick and easy way to solve this is to make VMs sleep/resume on HTTP requests, but we actually want it to work for TCP connections, UDP, and even Buildkite workers.

There are a ton of interesting apps that are useful for a few minutes or a few hours at a time, and then should basically sleep until the next time someone wants them.

I’m looking for a reasonable way to allow free tenants on my app with some limitations, e.g. auto shutdown, to keep costs manageable.

You can do this with the API right now, but it’ll take a little work. There is a PauseApp mutation you can use to shut apps down, and a corresponding resume call to turn them back on.

If all you’re trying to do is suspend apps that are inactive for a long period of time, and let people turn them back on when they’re ready, that could work pretty well.

We are probably going to tackle auto app “pausing” sometime next month.

Any updates on this?

The API is simpler now: there’s a setVmCount mutation that will let you scale to 0 instances. But it’s not magic yet. My “sometime next month” was maybe a little optimistic.

Is there any plan to implement such a feature?

Yes! It’s just taking a while to get through our priority queue. This is part of why we let you run 3 instances full time for free. It’s simpler than building auto-scale to 0.

Hey, it’s cool that we can scale to zero, that opens up a lot of options. How are IPs handled if we scale to zero - are they at risk of changing/being lost?

My use cases (each customer has at least one fly app):

  • Letting my customer pause their subscription, without losing their IP or fly app so that I can easily resume later
  • Lowering costs on customers with overdue payments, without losing their IP or fly app so that I can easily resume later

Side note: I’d be so happy to pay for IP addresses that I could attach to any app (one at a time), instead of having them strictly exist only for the lifetime of the app.

Right now, if you scale to 0 the app keeps its IP. We don’t currently bill for IPs on idle apps, but we will have to start at some point (probably $1/mo).

Reserving IPs to attach to apps later is a good idea and I expect we’ll do it at some point.

Works for me, and happy to pay for them any time. The anycast IPs are one of the best features you have and that’s not something you can easily get just anywhere.

How long will it take to bring an app back up if it is scaled to 0? I assume there will be a cold start, but will it still be able to respond to http requests?

If you’re doing it manually (fly scale count 1) it will take a few seconds, and then however long your app takes to boot. The autoscaling could take up to 15s to detect that it needs a new one.

We’re working on this; the goal is to be able to restart apps within a few hundred ms. It’ll be a few months though!
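
As a rough back-of-the-envelope for the timings mentioned above, here is a small sketch that adds up the pieces. The 15s figure is the "up to 15s" detection delay from the reply; the 3s VM start is only an assumed stand-in for "a few seconds", and the function name is made up for illustration.

```python
AUTOSCALE_DETECT_MAX_S = 15.0  # "up to 15s" for autoscaling to notice demand
VM_START_S = 3.0               # assumption: stands in for "a few seconds"


def cold_start_estimate(app_boot_s: float, autoscaled: bool = True) -> float:
    """Rough upper-bound wait in seconds before the app can answer a request."""
    detect = AUTOSCALE_DETECT_MAX_S if autoscaled else 0.0
    return detect + VM_START_S + app_boot_s


if __name__ == "__main__":
    # An app that takes 2s to boot, started manually vs. by the autoscaler.
    print(cold_start_estimate(2.0, autoscaled=False))  # 5.0
    print(cold_start_estimate(2.0, autoscaled=True))   # 20.0
```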

A few months may be well worth the wait if we can autoscale from 0 to 1 in 300ms (correct me if I’m interpreting this wrong).

I don’t know if I have any use case for manually scaling to zero. This would require knowing ahead of time when a client is about to send a request.

My goal is to host a multi-tenant application and, when an app is not in use, scale it down to zero to save on costs.

I scaled to zero:

flyctl status
App
Name = gedw99-image-service
Owner = personal
Version = 14
Status = dead
Hostname = gedw99-image-service.fly.dev

At the moment I have to manually wake it up with:
flyctl autoscale balanced min=1 max=1
Scale Mode: Balanced
Min Count: 1
Max Count: 1

flyctl info
App
Name = gedw99-image-service
Owner = personal
Version = 15
Status = running
Hostname = gedw99-image-service.fly.dev

It would be awesome if it could wake up when its endpoint is hit.

This is possible using various techniques. With NATS, for example, a call comes in, you check whether the instance is up, start it if it isn’t, and then forward the request once it has started.
This is just to illustrate the concept.

If the platform does not support it, you can also do this yourself and manage the Fly app via your own layer in front, of course.

@kurt
But it would be lovely if the platform supported this concept. So much easier.
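
The "check if the instance is up, start it, then forward" flow described above can be sketched roughly as below. This is only an illustration of the concept: `WakeOnRequestProxy` and `FakeBackend` are hypothetical names, the three callables are placeholders you would wire up yourself (for example to whatever scales the app back to 1), and nothing here is an actual Fly or NATS API.

```python
import time
from typing import Callable


class WakeOnRequestProxy:
    """Front layer: wake the backend if needed, then forward the request."""

    def __init__(self, is_up: Callable[[], bool], start: Callable[[], None],
                 forward: Callable[[str], str], poll_interval_s: float = 0.5,
                 timeout_s: float = 30.0,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.is_up = is_up
        self.start = start
        self.forward = forward
        self.poll_interval_s = poll_interval_s
        self.timeout_s = timeout_s
        self.sleep = sleep

    def handle(self, request: str) -> str:
        if not self.is_up():
            self.start()  # ask for the instance to be started
            waited = 0.0
            while not self.is_up():  # poll until it reports as running
                if waited >= self.timeout_s:
                    raise TimeoutError("backend did not start in time")
                self.sleep(self.poll_interval_s)
                waited += self.poll_interval_s
        return self.forward(request)


class FakeBackend:
    """Stand-in backend that comes up after a couple of status polls."""

    def __init__(self, polls_until_up: int = 2) -> None:
        self.running = False
        self.starting = False
        self.polls_left = polls_until_up
        self.start_calls = 0

    def is_up(self) -> bool:
        if self.starting and not self.running:
            self.polls_left -= 1
            if self.polls_left <= 0:
                self.running = True
        return self.running

    def start(self) -> None:
        self.start_calls += 1
        self.starting = True

    def forward(self, request: str) -> str:
        return f"handled {request}"


if __name__ == "__main__":
    backend = FakeBackend()
    proxy = WakeOnRequestProxy(backend.is_up, backend.start, backend.forward,
                               sleep=lambda s: None)  # no real waiting in the demo
    print(proxy.handle("GET /"))       # first request wakes the backend
    print(proxy.handle("GET /again"))  # second request is forwarded directly
    print(backend.start_calls)         # the backend was started only once
```

The first request pays the whole cold start, so the front layer needs a generous timeout and has to stay running itself.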

@gedw99 are you saying you scaled to zero with flyctl scale count 0 and needed to scale back to 1 by changing the autoscaling config?

Waking up suspended VMs when a request comes in is still something we’re working towards.

@michael yes, it was to work around a bug. See issue: app wont stop running according to CLI but dashboard says it is. · Issue #552 · superfly/flyctl · GitHub

It’s great to hear you and the team are considering this! If there is anything you want feedback on, feel free to ask.