You can run 5 VMs with a concurrency hard limit of 1, and it should do what you want. We'll only send one request to each of those VMs at any given time.
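To make the "one request per VM" behavior concrete, here's a rough Python model of it. This is only an illustration, not how our proxy is implemented: `HardLimitRouter`, `acquire`, and `release` are made-up names, and what happens when every VM is busy (queueing, retrying, rejecting) isn't modeled at all.

```python
from typing import Dict, Iterable, Optional


class HardLimitRouter:
    """Toy model: never hand a VM more than `hard_limit` concurrent requests."""

    def __init__(self, vm_ids: Iterable[str], hard_limit: int = 1) -> None:
        self.hard_limit = hard_limit
        # In-flight request count per VM (hypothetical bookkeeping).
        self.in_flight: Dict[str, int] = {vm: 0 for vm in vm_ids}

    def acquire(self) -> Optional[str]:
        """Pick the first VM below its limit, or None if all are busy."""
        for vm, count in self.in_flight.items():
            if count < self.hard_limit:
                self.in_flight[vm] = count + 1
                return vm
        return None

    def release(self, vm: str) -> None:
        """Mark one request on `vm` as finished."""
        if self.in_flight[vm] > 0:
            self.in_flight[vm] -= 1


router = HardLimitRouter([f"vm-{i}" for i in range(5)], hard_limit=1)
assigned = [router.acquire() for _ in range(5)]  # each request lands on a different VM
overflow = router.acquire()                      # all 5 VMs are busy at this point
```

With 5 VMs and a limit of 1, the first five requests each get their own VM and a sixth has nowhere to go until one of them finishes.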
We have some better options for this coming soon. There aren't many docs yet, but you can use the Fly Machines plumbing to build a lot of FaaS-type setups.
Here's a proof-of-concept proxy that runs on Fly and starts machines when requests come in. These VMs are responsible for stopping themselves when they're idle. It works really well: superfly/machine-proxy on GitHub (PoC HTTP proxy for scale-to-zero apps via the Fly machines API).
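The "stop themselves when idle" part can be sketched roughly like this in Python. It's a guess at the general shape, not code from machine-proxy: `IdleTracker` and its methods are hypothetical names, the timeout is arbitrary, and it assumes your app stops the VM by exiting its main process once `should_stop()` returns true.

```python
import threading
import time
from typing import Callable


class IdleTracker:
    """Tracks in-flight requests and decides when the VM has been idle long enough."""

    def __init__(self, idle_timeout: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.idle_timeout = idle_timeout  # seconds of inactivity before stopping
        self._clock = clock               # injectable so the logic can be tested
        self._lock = threading.Lock()
        self._in_flight = 0
        self._last_activity = clock()

    def request_started(self) -> None:
        with self._lock:
            self._in_flight += 1
            self._last_activity = self._clock()

    def request_finished(self) -> None:
        with self._lock:
            self._in_flight = max(0, self._in_flight - 1)
            self._last_activity = self._clock()

    def should_stop(self) -> bool:
        """True only when nothing is in flight and the timeout has elapsed."""
        with self._lock:
            if self._in_flight > 0:
                return False
            return self._clock() - self._last_activity >= self.idle_timeout
```

You'd call `request_started()` and `request_finished()` around each request handler, poll `should_stop()` from a background thread, and exit the process when it returns true. Never stopping while a request is in flight matters here, since a slow request would otherwise look like idleness.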