Running into capacity issues

I’m in the process of migrating a high-traffic service from AWS Lambda to Fly, which requires me to spin up 500-1000 machines of the shared-cpu-4x type with 2-4 GB of memory.

AWS Lambda uses a 1 request = 1 container model, and the nature of the task means each request consumes a lot of resources, so I have no choice but to keep this model and use a high number of machines.

I’d love to migrate this app off AWS Lambda and onto Fly, but while testing this setup I encountered some issues, and I’d like to ask a few questions:

  1. When scaling VMs in the iad and atl regions, I could not request sufficient capacity. For example, I could only request ~230 of the shared-cpu-4x/4GB machines in atl.

    a. Is there something I need to do to acquire more capacity, such as upgrading to a paid plan?

    b. Alternatively, if it is not possible to acquire, say, 500 machines in a single region, is it recommended to spread the required capacity over multiple regions? In that case, will Fly automatically route requests to a region where capacity is available?

  2. When deploying new versions, I’ve also observed that the update can fail midway, being applied only to some containers even with a rolling strategy. For example, I tried a deployment in the atl region with shared-cpu-4x/2GB memory/440 instances, but it failed midway with the update applied to only ~420 containers.

    I’m concerned that I may not be able to deploy updates once I productionize my application, so what can I do to ensure that I can update it reliably? Should I try fewer machines in each region, as I mentioned in 1.b?
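For what it’s worth, the multi-region split from 1.b can be sketched with flyctl; the regions and counts below are purely illustrative, not a recommendation for your workload:

```shell
# Spread the fleet over several regions instead of requesting
# all capacity in one (counts and regions are hypothetical).
fly scale count 250 --region iad
fly scale count 250 --region atl
fly scale count 250 --region ord

# Roll updates out machine-by-machine rather than all at once.
fly deploy --strategy rolling
```

With counts set per region, a failed deploy in one region at least leaves the other regions serving traffic.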

Thanks in advance for your help.

I have a few questions -

Are these separate apps, i.e. do you have 500-1000 unique Lambdas on AWS?

Or is this a single Lambda that scales in parallel to 500-1000 invocations?

Few thoughts -

  1. Lambda is not designed to be multi-threaded, so I’m curious why this can’t be done with fewer instances, with multiple processes or threads on each machine handling different requests. Go instances I run can handle 10k requests, though those aren’t heavy workloads.

Fly is more of a traditional VM than a container, so you have access to more compute than on Lambda, and I’d be surprised if you need that many machines, unless you are trying to avoid modifying code.

  2. Yes, requests will route across regions. The soft_limit, as well as latency, is used to choose which region to route to. If no machine is available but you still have extra provisioned, Fly will start machines up to the hard_limit. Check out the concurrency settings.
  3. Where are the requests coming from? Fly puts the VMs at the edge of the cloud, so the reason to use multiple regions is to be closer to end users, but also for high availability.
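The soft_limit/hard_limit knobs mentioned in point 2 live in fly.toml. A minimal sketch of the 1 request = 1 machine model — app name, port, and region here are assumptions, not from this thread:

```toml
app = "report-renderer"        # hypothetical app name
primary_region = "iad"

[http_service]
  internal_port = 8080         # assumed port your server listens on
  auto_start_machines = true   # start stopped machines when limits are hit
  auto_stop_machines = true

  [http_service.concurrency]
    type = "requests"
    soft_limit = 1             # proxy prefers an idle machine for each request
    hard_limit = 1             # a busy machine never gets a second request
```

Raising soft_limit/hard_limit above 1 is how you would pack multiple requests per machine instead, if the memory headroom allows it.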

If you don’t mind sharing, what are you doing that requires that type of compute, and that many instances?

It is a single Lambda function that scales up to 500-1000 parallel invocations.

My use case involves running headless browsers to generate reports, plus some image manipulation. While most web pages don’t take anywhere near this amount of memory, some do, which is why I have to provision each VM with enough headroom, and the 1 request = 1 container model is helpful in terms of performance.

I’m afraid routing many requests onto one container would just exhaust its resources even faster.

Interesting issue. That’s a tough problem. How long do the Lambdas take now to process a heavy request? Is it headless Chrome or something more efficient?

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.