I want to host a phoenix liveview application on fly.io, but I can’t really find guidelines for the expected server sizing. I also can’t find good tools for load testing with liveview.
The app needs to handle 10k concurrent LiveView users, each with about 2 websocket interactions per minute. But it is a quiz-like application, so the interactions are clustered in time: one interaction to push the question to the 10k clients (tightly clustered), and one interaction per client to answer (more spread out).
Which server sizing would you recommend?
When working with long-running LiveView state, the resource question mostly comes down to RAM: the VM size you need depends on how much memory your LiveView processes consume.
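One rough way to gauge that is to attach an IEx shell to the running node under a realistic load and look at per-process memory; multiplying a typical LiveView process's footprint by your expected concurrency gives a ballpark RAM floor. A sketch (how you single out the LiveView pids depends on your app; here we just look at the biggest processes):

```elixir
# From an IEx session attached to the running node: list the ten
# largest processes by heap size. Under load, connected LiveViews
# will usually dominate this list.
Process.list()
|> Enum.map(fn pid -> Process.info(pid, :memory) || {:memory, 0} end)
|> Enum.map(fn {:memory, bytes} -> bytes end)
|> Enum.sort(:desc)
|> Enum.take(10)
```

Then budget roughly: typical per-process bytes × 10_000 connections, plus headroom for the socket transport processes and bursts.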
I know that you can reach 32K concurrent websocket connections per machine as things stand today. That limit results from a combination of infrastructure pieces: limits around TLS connections per time frame, per network edge, and per app, some of which are part of built-in DoS prevention. There is an effort underway to lift that limit as well.
For load testing, I know someone who used Artillery (https://www.artillery.io/). They also used its serverless "Blitz it with Lambda" mode to distribute the load geographically, which helped spread it across multiple network edges.
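For a sense of the shape, here is a minimal Artillery websocket scenario (target URL, rates, and message frames are placeholders; a real LiveView session additionally needs the full Phoenix channel join handshake, so treat this as a starting sketch, not a drop-in test):

```yaml
# artillery.yml -- illustrative websocket load sketch, placeholder values
config:
  target: "wss://your-app.fly.dev/live/websocket?vsn=2.0.0"
  phases:
    - duration: 60
      arrivalRate: 170   # ~10k arrivals over a minute, mimicking the quiz start
scenarios:
  - engine: ws
    flow:
      - send: '["1","1","lv:dummy","phx_join",{}]'  # Phoenix v2 frame shape: [join_ref, ref, topic, event, payload]
      - think: 30                                   # idle like a quiz client between questions
      - send: '["1","2","lv:dummy","event",{}]'     # an answer-like message
```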
If you run multiple instances, you can easily scale beyond 32K connections. With Elixir, the apps can be clustered, and PubSub works across nodes as well.
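That matters for your "push one question to 10k clients" moment: a single broadcast reaches subscribers on every node in the cluster. A sketch, assuming the generated Phoenix default of a PubSub named `MyApp.PubSub` and a hypothetical `"quiz:42"` topic:

```elixir
# In each LiveView's mount/3, subscribe once the websocket is connected:
if connected?(socket), do: Phoenix.PubSub.subscribe(MyApp.PubSub, "quiz:42")

# From wherever the quiz is driven (any node in the cluster),
# push the next question to every subscriber, cluster-wide:
Phoenix.PubSub.broadcast(MyApp.PubSub, "quiz:42", {:question, question})

# Each LiveView handles the message and re-renders:
# def handle_info({:question, q}, socket), do: {:noreply, assign(socket, question: q)}
```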
Hope that helps.
Thanks for your response. I did some tests earlier and noticed RAM was an issue; I saw major improvements by hibernating the Erlang processes immediately (with negligible CPU overhead). The 32K websocket limit is interesting, because I was planning to exceed it, so clustering seems a must as well.
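For anyone reading along, the general OTP mechanism behind "instant hibernation" is the `hibernate_after` start option: with a value of `0`, the process hibernates (a full garbage collection down to a minimal heap) as soon as it goes idle, trading a little CPU on each wake-up for a much smaller resident heap. A minimal sketch with a plain GenServer (hypothetical module; how you apply this to your LiveView processes depends on your LiveView version's configuration options):

```elixir
defmodule Quiz.IdleServer do
  use GenServer

  # hibernate_after: 0 makes the process hibernate whenever its
  # mailbox is empty, shrinking its heap between interactions.
  def start_link(state) do
    GenServer.start_link(__MODULE__, state, hibernate_after: 0)
  end

  @impl true
  def init(state), do: {:ok, state}

  @impl true
  def handle_call(:state, _from, state), do: {:reply, state, state}
end
```

With a quiz workload, idle-heavy processes like these are exactly where hibernation pays off, since clients sit quietly between questions.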