New Concurrency Hard Limit Default

What should I do?

  1. If your app cares about specific hard limits, you probably already have them set in your fly.toml’s services or http_service concurrency section (or via API calls). Great! Do nothing.
  2. If you don’t know or don’t care, leave the hard limit unset (or unset it) in your fly.toml (or API calls). This means you trust us to do the “right thing*”. And if you find out that you need specific limits later, you can always add them! (Either way, you will need to run a new fly deploy to take advantage of the new “right thing”.)
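For reference, here’s roughly where those limits live in a fly.toml. This is a sketch, not a recommendation: the port and limit values are placeholders for whatever your app actually needs.

```toml
# Illustrative fly.toml fragment with explicit per-instance limits.
[http_service]
  internal_port = 8080

  [http_service.concurrency]
    type = "requests"   # measure in-flight requests ("connections" also works)
    soft_limit = 200    # past this, the proxy prefers other instances
    hard_limit = 250    # past this, the proxy stops sending to this instance
```

Apps using `[[services]]` blocks instead of `http_service` set the same keys under `[services.concurrency]`.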

*What changes now is the “right thing”. Before, the “right thing” was a hard limit of 25. Now it’s basically unlimited, and in the future we are exploring better ways to measure “load” than just counting how many requests there are.

Some background: services have a per-instance soft_limit (“please stop sending me requests”) and hard_limit (“I cannot take any more”).

When they are unset, they default to 20 and 25 respectively. This is fine for some apps, but it turns out that for most it’s unnecessarily conservative. It can sometimes cause confusion (“why is my instance suddenly not taking traffic?”), or in the worst case break apps that receive a traffic spike (“oops, 50 people looked at my app and now no one else can while they do”).

The usual suggestion is just “bump your limits”, which is also fine, but it kinda just moves the goalposts. How do you know what to set them to? What can your app handle? Is the number of requests or connections even a good proxy for the “load” on an instance? (Not always, it turns out!)

Recap: starting today, there are two big changes to how we handle concurrency limits (provided you run a fly deploy, create a new app, or create/update Machines via the API):

  1. hard_limit is now optional (actually optional this time). If you don’t set it, there is no hard limit.
  2. For apps with neither limit set, the defaults are now a soft_limit of 20 and no hard_limit.
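So under the new behavior, a concurrency section can simply omit hard_limit. A minimal sketch (the soft_limit value here is just an example):

```toml
# No hard_limit: the instance is never cut off by a request cap.
# soft_limit still tells the proxy when to prefer other instances.
[http_service]
  internal_port = 8080

  [http_service.concurrency]
    type = "requests"
    soft_limit = 20
```

Run fly deploy after editing your fly.toml for this to take effect.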