Scaling limits on fly.io

I’m evaluating fly.io as an alternative to Google Cloud Run, and I wanted to ask whether these two features are available:

  1. Is it possible to have a concurrency of 1 per container? In other words, a single request would be handled per container; I need this because the task is not parallelizable in a single container.

  2. Is it possible to scale to a large number of containers (e.g. 1000 instances)?

Hi @Biswas,

The quick answer to each of your questions is “yes,” with a caveat on #2.

  1. Yes. Instances on fly.io are Firecracker VMs built from your container image. You can set a hard limit that restricts each VM to a single in-flight request; see the App Configuration (fly.toml) docs.
  2. Yes, you can scale to a large number of VMs. However, our autoscaling probably won’t do what you want: it runs every 15s, which is too slow if you need to add an instance in response to each request.
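
For the first point, a minimal fly.toml fragment might look like the sketch below. The port numbers are assumptions for illustration, and the App Configuration docs remain the authority on which keys apply to your setup.

```toml
[[services]]
  internal_port = 8080   # assumption: the port your app listens on
  protocol = "tcp"

  [services.concurrency]
    type = "requests"    # count in-flight HTTP requests rather than connections
    hard_limit = 1       # do not route more than one request to a VM at a time
    soft_limit = 1

  [[services.ports]]
    handlers = ["http"]  # request-based limits need the proxy to handle HTTP
    port = 80
```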

We do have an open-source demo proxy that could be adapted for per-request scaling: superfly/machine-proxy on GitHub, a PoC HTTP proxy for scale-to-zero apps via the Fly machines API. It requires you to implement the logic to start and stop machines based on requests, though.
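
To give a rough idea of that logic, here is a minimal Python sketch of the bookkeeping such a proxy would need so that each machine handles one request at a time. It is not taken from machine-proxy; `MachinePool` and `start_machine` are hypothetical names, and the callable that actually boots a machine (for example through the machines API) is left for you to supply.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional, Set


@dataclass
class MachinePool:
    """Tracks idle and busy machines so each serves one request at a time."""

    # Hypothetical hook: boots a machine and returns its ID.
    start_machine: Callable[[], str]
    # Upper bound on machines; 1000 mirrors the number in the question.
    max_machines: int = 1000
    idle: Set[str] = field(default_factory=set)
    busy: Set[str] = field(default_factory=set)

    def acquire(self) -> Optional[str]:
        """Reserve a machine for one request, or return None at capacity."""
        if self.idle:
            # Reuse a machine that has finished its previous request.
            machine_id = self.idle.pop()
        elif len(self.busy) < self.max_machines:
            # No idle machine, so boot a new one for this request.
            machine_id = self.start_machine()
        else:
            return None
        self.busy.add(machine_id)
        return machine_id

    def release(self, machine_id: str) -> None:
        """Mark a machine as idle again once its request has completed."""
        self.busy.remove(machine_id)
        self.idle.add(machine_id)
```

The proxy would call `acquire()` before forwarding a request and `release()` once the response is sent. Stopping machines that stay idle for too long is left out of this sketch.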

Thanks for the response!

So if I understand correctly, I can work around the autoscaling delay by writing a proxy that takes requests and spawns containers on fly.io? When using the API, what delay should I expect before a new container is ready? I could probably work with a delay of ~5s.