What should I do?
- If your app cares about specific hard limits, you probably already have them set in your fly.toml’s `services` or `http_service` `concurrency` section (or in your API calls). Great! Do nothing.
- If you don’t know or don’t care, leave the hard limit unset (or unset it) in your fly.toml (or API calls). This means you trust us to do the “right thing*”. And if you find out that you need specific limits later, you can always add them! (Either way you will need to do a new `fly deploy` to take advantage of the new “right thing”.) There’s a sketch of both options below the footnote.
*What changes now is the “right thing”. Before, the “right thing” was a hard limit of 25. Now it’s basically unlimited, and in the future we’re exploring better analogues for measuring “load” than just counting how many requests there are.
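For concreteness, here’s a rough sketch of the first option in fly.toml, using the `http_service` form; the port and limit values are placeholders, not recommendations:

```toml
[http_service]
  internal_port = 8080      # whatever port your app listens on
  force_https = true

  [http_service.concurrency]
    type = "requests"       # count in-flight requests ("connections" also works)
    soft_limit = 200        # "please prefer other instances past this point"
    hard_limit = 250        # "do not send me more than this"

# For the second option, simply leave hard_limit out (or drop the whole
# concurrency block) and redeploy.
```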
Fly.io services have a per-instance `soft_limit` (“please stop sending requests”) and `hard_limit` (“I cannot take any more”). Until now, when left unset they defaulted to 20 and 25 respectively. That’s fine for some apps, but it turns out to be unnecessarily conservative for most. It can cause confusion (“why is my instance suddenly not working?”), or in the worst case break apps that get a traffic spike (“oops, 50 people looked at my app and now no one else can while they do”).
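Put differently, leaving concurrency out of your config has so far been roughly equivalent to writing something like this (a sketch; the exact section name depends on whether you use `services` or `http_service`):

```toml
[http_service.concurrency]
  type = "requests"   # or "connections", depending on your setup
  soft_limit = 20     # the old implicit default
  hard_limit = 25     # the old implicit default
```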
The usual suggestion is just “bump your limits”, which is also fine, but it kinda just shifts the goalposts. How do you know what to set them to? What can your app handle? Is the number of requests or connections even a good analogue for the “load” on an instance? (Not always, it turns out!)
Recap: starting today, there are two big changes to how we handle concurrency limits on Fly.io (provided you do a `fly deploy`, create a new app, or create/update machines with the API):
- `hard_limit` is now optional, like actually optional. If you don’t set it, there isn’t a hard limit (see the sketch below).
- The default for apps with neither limit set is now 20/unlimited (soft/hard) respectively.
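In fly.toml terms, that means a concurrency section like this one is now a perfectly reasonable end state (the soft_limit value is illustrative):

```toml
[http_service.concurrency]
  type = "requests"
  soft_limit = 200
  # no hard_limit: the proxy will prefer other instances once soft_limit is
  # crossed, but it won’t flat-out refuse to send this one more requests
```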