Is there a limit on concurrent websocket connections?

Hi there. I have an app that is built with the goal of handling a lot of concurrent websocket connections (over a cluster, with no RDB). Unfortunately I’m not ready to share it, but it’s an Elixir Phoenix LiveView app with a main page that anyone can edit and receive updates from.

As an example, imagine having a single chat room that anyone can enter and post comments.

Are there limits to how many concurrent connections a single server can handle? Does Fly throttle the number of connections in any way? If I recall correctly, Heroku sets a limit of 50, which is why I ask.

If there are no limits, then is the bottleneck simply available app memory?

I sent a similar email to Fly support, but it probably makes more sense to ask here. Thanks!

The concurrency of your service is configurable here: App Configuration (fly.toml). Fly will try to respect those numbers as much as possible.
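
For reference, it lives under the services concurrency section of fly.toml and looks roughly like this (the numbers below are placeholders, not recommendations; the docs linked above have the details):

[services.concurrency]
  type = "connections"   # or "requests"
  soft_limit = 800       # placeholder values; tune for your app
  hard_limit = 1000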

That said, it’s obviously not a good idea to set it to infinity or billions. But if you’re running a server that’s capable of handling more connections and you’re pushing against Fly’s internal limits, you can always start a conversation here.

Do you plan to run many small servers or a few large servers in this case?

Hey @sudhir.j, thanks for the reply. I’m planning on having servers scattered around the globe. I haven’t made a decision on size yet, but is there a good way to estimate or calculate that? This is perhaps a question better suited for the Elixir forums.

I’m wondering whether, in the off chance that I get 100 concurrent connections, the app would crash. But it sounds like a soft/hard limit of 50 would be reasonable?

I did a bit of sleuthing (maybe the results will be useful for the next person who asks this question) and:

  1. I’m using the default instance size (I’d be okay with temporarily upgrading):
#  flyctl scale show

VM Resources for gems
        VM Size: shared-cpu-1x
      VM Memory: 256 MB
          Count: 4
 Max Per Region: Not set
  2. Looking at Observer (Elixir) I was able to see that:
  • the base memory consumption when the app is running with no connections is around 60 MB.
  • with 15 tabs open (i.e. 15 connections) roughly 70 MB of memory is consumed (screenshot below).

Although there is no guarantee that prod would run the same as a local instance, if we do the math: (70 MB − 60 MB) / 15 connections ≈ 0.67 MB per connection. This is when there are no active clicks and the page is sitting idle.

When triggering clicks in the app as fast as I could (meaning a Phoenix LiveView click event), memory bumped up to 80 MB. I’m not sure how that would scale with the number of connections (but each connection would receive an update event).
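
For context, the handler I’m hammering is roughly this shape (module, topic, and PubSub names are made up, since I can’t share the real app yet): every click is broadcast over PubSub, so each connected LiveView gets a message and re-renders.

defmodule DemoWeb.PageLive do
  # Hypothetical stand-in for the real LiveView (needs Phoenix.LiveView 0.16+ for ~H).
  use Phoenix.LiveView

  @topic "page:lobby"

  def mount(_params, _session, socket) do
    if connected?(socket), do: Phoenix.PubSub.subscribe(Demo.PubSub, @topic)
    {:ok, assign(socket, clicks: 0)}
  end

  def handle_event("click", _params, socket) do
    # Broadcast so every connected client (i.e. every websocket) receives an update event.
    Phoenix.PubSub.broadcast(Demo.PubSub, @topic, {:clicked, socket.assigns.clicks + 1})
    {:noreply, socket}
  end

  def handle_info({:clicked, clicks}, socket) do
    {:noreply, assign(socket, clicks: clicks)}
  end

  def render(assigns) do
    ~H"""
    <button phx-click="click">Clicks: <%= @clicks %></button>
    """
  end
end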

I’m inclined to believe that a VM instance size of 256 MB would handle 50 connections easily (roughly 60 MB base + 50 × 0.67 MB ≈ 94 MB).

What else am I missing or not accounting for in my (very approximate) calculation?

That sounds about right, although I’d suggest running local tests with a few more connections and running the same tests on Fly as well.

If your app pushes past the allowed memory, it’ll be terminated, yes. So it might make sense to run a couple of tests against an actual instance on Fly (on the actual VM, so the OS is taken into account as well) and then set your hard limit a little lower than what the app can manage.
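
One quick way to sanity-check that on the actual VM is to attach an IEx session to the running node, take a memory snapshot with no clients connected, take another with N tabs open, and divide the difference by N. Just a sketch:

# Rough BEAM memory snapshot; run in a remote IEx session on the instance.
mb = fn bytes -> Float.round(bytes / 1_048_576, 1) end
IO.puts("BEAM total:    #{mb.(:erlang.memory(:total))} MB")
IO.puts("Processes:     #{mb.(:erlang.memory(:processes))} MB")
IO.puts("Binaries:      #{mb.(:erlang.memory(:binary))} MB")
IO.puts("Process count: #{length(Process.list())}")

Keep in mind the memory cap applies to the whole VM, so the OS and anything else running alongside the BEAM count against it too.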

The numbers you’re looking at should be well within the default Fly limits, but if you do push against them you can start a conversation here.

We do have some ideas on paid plans coming up that let you set custom limits as well (Coming soon: paid plans and support, oh my), but in this case I doubt you’ll need that.

I have a question regarding testing on Fly.

The service concurrency docs you linked to say that “the system will bring up another instance.” However, by default “auto scaling” is disabled (auto-scaling docs). Is it also necessary to turn auto-scaling ON (to standard, for example)?

When testing the autoscaling, what’s the best way to verify that a new instance has been started? Just flyctl status?

Most people have had better success leaving autoscaling off and running one VM per region. When a VM in one region hits the hard limit, the next connection in that region goes to the nearest available VM.
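
If you go that route, something like the following should pin it to one VM per region. I believe scale count takes a --max-per-region flag (it’s what the “Max Per Region” line in your flyctl scale show output reflects), but double-check with flyctl scale count --help; region codes here are just examples:

flyctl regions set ams sjc syd
flyctl scale count 3 --max-per-region 1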

Those small VMs can definitely support 50 concurrent websocket connections, especially if they’re somewhat idle. You could potentially handle 1000 concurrent connections on them.

Autoscaling isn’t fast enough to bring up new instances for Elixir apps. We have some big plans for making this nicer but it’s going to take a while.
