How many long-lived websocket connections can I have?

iffy · January 24, 2023, 4:45am

I’m getting ready to beta-test a service that relies on many long-lived websocket connections. Long-lived here means hours (or even days). It’s okay if they disconnect, since they’ll reconnect. I’d just like to avoid an expensive reconnection every minute.

During testing, we’ll likely have fewer than a hundred at a time, but with our user base, it could get up to 10k-100k simultaneous connections.

I want to be a good citizen and not blow things up, so:

How many simultaneous connections can a single Firecracker VM support?
Am I okay adding a sub-minute heartbeat to keep the websocket from closing? Or is there a way to configure Fly not to close connections after a minute? I’d rather not have a heartbeat: it seems like a waste of data, and I’d rather not make mobile devices or Fly constantly send and receive heartbeats.

I’ve read a few other topics, like “Long-lived TCP connections are dropped” and “Is it possible to increase the timeout to 120 sec” (#4 by ignoramous), which make me think a heartbeat is the best option for keeping it alive. But I haven’t seen any indication of how many open connections is too many.

iffy · January 24, 2023, 3:41pm

Reading “Metrics on Fly.io” in the Fly Docs and then looking at the File Descriptors metric chart in Grafana leads me to believe I can use up to about 20K connections (well, file descriptors, which may not be exactly 1:1).

If this isn’t right, I’d rather be told now than exceed a limit after going live. Yes, I’m asking for permission rather than forgiveness.

lillian · January 24, 2023, 3:47pm

20k is a good number of connections per VM. Adding a heartbeat every ~50 seconds is fine; that’s what I’d recommend doing to keep connections open.
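A minimal sketch of such a heartbeat loop, assuming an idle timeout of about 60 seconds (the "close connections after a minute" behaviour described earlier in the thread). `send_ping` is a hypothetical coroutine standing in for whatever ping call your websocket library provides; it is not a Fly or library API.

```python
import asyncio
from typing import Awaitable, Callable

IDLE_TIMEOUT_S = 60.0  # assumption: idle connections are closed after about a minute
HEARTBEAT_S = 50.0     # interval suggested in this thread


async def heartbeat(send_ping: Callable[[], Awaitable[None]],
                    interval: float = HEARTBEAT_S,
                    idle_timeout: float = IDLE_TIMEOUT_S) -> None:
    """Call send_ping every `interval` seconds until cancelled or send_ping raises."""
    if interval >= idle_timeout:
        # A heartbeat slower than the idle timeout cannot keep the connection open.
        raise ValueError("interval must be shorter than idle_timeout")
    while True:
        await asyncio.sleep(interval)
        await send_ping()
```

One way to use it is to start the loop with `asyncio.create_task(heartbeat(...))` when a connection opens and cancel that task when the connection closes.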

jerome · January 24, 2023, 3:52pm

We recommend scaling horizontally instead of trying to shove too many connections on a single VM. Keeping it under ~30K is probably a good idea.

The number each VM can support depends largely on your application and the size of the VM (CPU and RAM).
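As a rough sizing sketch using only the figures mentioned in this thread (10k-100k simultaneous connections, 20k-30k per VM): the per-VM figure is an assumption that depends on your application and VM size, and `vms_needed` is an illustrative helper, not a Fly tool.

```python
def vms_needed(peak_connections: int, per_vm: int) -> int:
    """Return the smallest number of VMs that keeps each at or under per_vm connections."""
    if peak_connections < 0 or per_vm <= 0:
        raise ValueError("peak_connections must be >= 0 and per_vm must be > 0")
    # Ceiling division without floating point.
    return -(-peak_connections // per_vm)


print(vms_needed(100_000, 20_000))  # 5
print(vms_needed(100_000, 30_000))  # 4
print(vms_needed(10_000, 20_000))   # 1
```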

That metric comes “from within” the VM. You have root access to your VM and can change the max open fds (ulimit and all that).
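A small sketch of inspecting and raising that limit from inside the process, using Python's standard `resource` module on Linux. The helper names are illustrative, and raising the hard limit itself still requires root (for example via `ulimit -n` before the app starts).

```python
import os
import resource


def fd_limits() -> tuple[int, int]:
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_limit(target: int) -> int:
    """Raise the soft limit toward target, capped at the hard limit; return the new soft limit."""
    soft, hard = fd_limits()
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if soft != resource.RLIM_INFINITY and target > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return fd_limits()[0]


def open_fd_count() -> int:
    """Count descriptors currently open in this process (Linux only: reads /proc/self/fd)."""
    return len(os.listdir("/proc/self/fd"))
```

Sockets are not the only consumers of file descriptors (log files, the SQLite database, and pipes count too), so it is worth leaving headroom above the expected connection count.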

iffy · January 24, 2023, 4:31pm

Alright, thank you! I’ll aim for 20k for now.

The app isn’t yet built to be able to scale horizontally (single SQLite database). Our current plan is exactly to “shove too many connections on a single VM” and see how far it can go with just one. I’m trying to delay horizontal scaling until LiteFS (“LiteFS - Distributed SQLite” in the Fly Docs) is production-ready, because that will simplify the design.

mwcampbell · March 29, 2024, 12:52pm

I think what we really need to know from Fly is how many simultaneous connections the proxy can handle and will allow, for the organization as a whole, for the application, and per machine. It would be good to explicitly document such limits. That’s one thing I appreciate about AWS documentation. Thanks.
