Is it possible to have a concurrency of 1 per container? In other words, each container would handle only one request at a time; I need this because the task cannot be parallelized within a single container.
Is it possible to scale to a large number of containers (e.g. 1000 instances)?
The quick answer to each of your questions is “yes,” with a caveat on #2.
- Yes. Instances on fly.io are Firecracker VMs built from your container image. You can set a hard concurrency limit to restrict each VM to a single in-flight request; see the App Configuration (fly.toml) docs.
- Yes, you can scale to a large number of VMs. However, our autoscaling probably won’t do what you want; it only runs every 15s, which is very slow if you need to add an instance in response to each request.
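For the hard limit in the first point, a fly.toml fragment along these lines should work. Treat it as a sketch: the internal port of 8080 is an assumption about your app, and the rest of your services section is omitted.

```toml
[[services]]
  internal_port = 8080   # assumption: the port your app listens on
  protocol = "tcp"

  [services.concurrency]
    type = "requests"    # count in-flight HTTP requests rather than connections
    hard_limit = 1       # never route more than one request to a VM at a time
    soft_limit = 1
```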
We do have an open source demo proxy that could be adapted for this: superfly/machine-proxy on GitHub, a PoC HTTP proxy for scale-to-zero apps built on the Fly machines API. You would still need to implement the logic to start and stop machines based on requests, though.
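To give a rough idea of the start/stop logic you would have to write, here is a minimal Python sketch of a dispatcher that starts one machine per request and stops it afterwards. It is not taken from machine-proxy. The names `OneRequestPerMachine`, `start_machine`, `stop_machine`, and `forward` are all hypothetical; you would supply the callables yourself, presumably as wrappers around the machines API and your HTTP client.

```python
import asyncio
from typing import Any, Awaitable, Callable


class OneRequestPerMachine:
    """Runs every request on its own machine, capped at max_machines at once."""

    def __init__(
        self,
        start_machine: Callable[[], Awaitable[str]],     # hypothetical: returns a machine id
        stop_machine: Callable[[str], Awaitable[None]],  # hypothetical: stops that machine
        max_machines: int,
    ) -> None:
        self._start = start_machine
        self._stop = stop_machine
        # Upper bound on machines running at the same time (e.g. 1000).
        self._slots = asyncio.Semaphore(max_machines)

    async def handle(
        self,
        request: Any,
        forward: Callable[[str, Any], Awaitable[Any]],   # hypothetical: sends request to machine
    ) -> Any:
        async with self._slots:
            machine_id = await self._start()
            try:
                # The machine is private to this request, so concurrency is 1.
                return await forward(machine_id, request)
            finally:
                # Always stop the machine, even if forwarding failed.
                await self._stop(machine_id)
```

Requests beyond `max_machines` simply wait for a free slot. Starting a fresh machine per request is the simplest option; keeping a pool of stopped machines around and restarting them is a possible variation if start-up delay matters.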
Thanks for the response!
So if I understand correctly, I can work around the autoscaling delay by writing a proxy that takes requests and spawns containers on fly.io? When using the APIs, what would be the expected delay for a new container? I could probably work with a delay of ~5s.