I deployed a testing website to Fly. I left hard_limit at 25 (the default value). However, when I visit the site, I get the warning Instance reached connections hard limit of 25. This surprised me a bit.
[services.concurrency]
hard_limit = 25
soft_limit = 20
type = "connections"
Are connections the same as visitors? Like 1 visitor = 1 connection, so 20 concurrent connections = 20 visitors at the same time?
How do I know how many connections one visitor could produce, and how do I properly set the value?
When using the connections type of concurrency (the default), every HTTP request proxied to your app instance creates a new connection. So 1 HTTP request generally maps to 1 TCP connection to your app.
If you want to pool connections, you can use the requests type of concurrency. This doesn't work perfectly with all backends: our pool idle timeout is very low (4s), and if your own timeout is equal to or lower than that, race conditions can happen. This is rare, but it does happen.
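As a sketch, switching to requests-type concurrency is a one-line change in fly.toml (the limit values below are illustrative placeholders, not recommendations):

```toml
[services.concurrency]
  type = "requests"
  soft_limit = 200
  hard_limit = 250
```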
In any case, if your app can handle more than that, you can bring those limits up!
connections concurrency
1 HTTP request at the edge == 1 TCP connection to your app
The connection to your app is closed as the HTTP response finishes
requests concurrency
1 HTTP request at the edge == 1 HTTP request to your app
Connections are pooled and reused across multiple requests
That chart always shows the concurrency, regardless of the “type” of concurrency.
Depends on the size of your instance and the kind of traffic you're getting. You can "play" with the concurrency limits. Try soft: 75 w/ hard: 100; then, if your app reaches those levels of traffic, you can look at metrics like response times and resource usage (CPU, memory, etc.) to tweak further.
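For example, the suggested starting point above would look like this in fly.toml:

```toml
[services.concurrency]
  type = "connections"
  soft_limit = 75
  hard_limit = 100
```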
That chart you posted shows 1 concurrent connection. You’re seeing “hard limit reached” errors? That would suggest you reached 25 concurrent connections at some point.
Does every visit to your app make a lot of requests at the same time? In that case you’ll definitely need a higher limit.
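One visitor can easily account for many concurrent connections, because browsers fetch page assets in parallel. A rough Python sketch of this effect (not Fly-specific; a toy local server stands in for an app instance, and the asset count of 10 is an assumption):

```python
# Sketch: one "visitor" whose browser fetches 10 assets in parallel,
# counting how many connections the server sees at the same moment.
import threading
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

active = 0  # connections currently being served
peak = 0    # highest simultaneous connection count observed
lock = threading.Lock()

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        global active, peak
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.2)  # pretend the asset takes a moment to serve
        with lock:
            active -= 1
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, *args):  # keep the demo output quiet
        pass

server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_address[1]}/"

# One visitor, ten parallel asset requests == ten connections at once
# (urllib opens a fresh connection per request, like an unpooled proxy).
with ThreadPoolExecutor(max_workers=10) as pool:
    list(pool.map(lambda _: urllib.request.urlopen(url).read(), range(10)))

server.shutdown()
print(f"peak concurrent connections from one visitor: {peak}")
```

With a hard_limit of 25, just a few visitors (or a few rapid refreshes) generating this kind of parallel fetching could plausibly hit the limit.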
I’m having a hard time knowing what these limits are and how I should interpret them.
I’m testing my staging and prod deployments, and I alone seem to reach this hard limit after a few refreshes here and there. How could that be? Is that expected?
I would also be interested in a bit more clarity around concurrency and how it’s handled: what to really look for when it happens.
I have a dev server on a shared-cpu-2x:512MB that hit the concurrency limit of 25 after about 3 to 4 visits.
It’s running a super lean Remix app with Prisma / Postgres, but I’ve had other sites with similar setups never hit limits or issues on shared-cpu-1x:256MB.
I feel like the connection and request concurrency naming should be swapped, but that’s probably just me not fully understanding it.
We switched to request-based concurrency because I understood that each concurrent request would count toward that limit. Now reading that type = "requests" actually looks at pooled connections (which may transport multiple requests) feels weird.