LiveView app unavailable for 2+ minutes after deployment

Regarding the slow propagation issue, my understanding based on this is that scaling to 3 instances is the current workaround. We’ve done this already, as mentioned in my initial post.

Do 100% of new connections / reconnections fail within those couple of minutes?

That seemed to be the case as of a couple of days ago, as you can see in the screen recording linked from my initial post. Each time I reloaded the page, the static page was served successfully and the websocket connection failed.

Also, how long does it typically take for your server to start and become ready to accept websocket connections? Are they reporting a “healthy” state before they are ready since you are using TCP checks instead of HTTP checks?

It’s a LiveView application, and static pages are being served successfully. In case you’re not familiar with the LiveView lifecycle, from the docs:

LiveView is first rendered statically as part of regular HTTP requests, which provides quick times for “First Meaningful Paint”, in addition to helping search and indexing engines. Then a persistent connection is established between client and server.

As far as I know, once the application is ready to serve static pages, it’s also ready to accept websocket connections. This is the behavior I’ve seen locally and when I’ve deployed on other infrastructure.
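For anyone who wants to reproduce this outside the browser, here is a rough sketch of a probe that attempts only the websocket upgrade, so it can be run in a loop during a deploy. It is not something I've run against this app. The path `/live/websocket?vsn=2.0.0` assumes the default Phoenix endpoint configuration, and the function names and the host in the usage note are made up for illustration. A non-101 response could also come from an origin check or the proxy rather than from the app not being ready.

```python
import base64
import hashlib
import os
import socket
import ssl

# Fixed GUID from RFC 6455, used to derive Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def expected_accept(key: str) -> str:
    """Return the Sec-WebSocket-Accept value a server should send for this key."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def build_upgrade_request(host: str, path: str, key: str) -> bytes:
    """Build a minimal HTTP/1.1 websocket upgrade request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        f"Origin: https://{host}",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def upgrade_succeeded(response_head: str, key: str) -> bool:
    """True if the response is a 101 with the matching accept header."""
    status_line, _, rest = response_head.partition("\r\n")
    parts = status_line.split(" ", 2)
    if len(parts) < 2 or parts[1] != "101":
        return False
    headers = {}
    for line in rest.split("\r\n"):
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers.get("sec-websocket-accept") == expected_accept(key)


def probe(host: str, path: str = "/live/websocket?vsn=2.0.0",
          port: int = 443, timeout: float = 5.0) -> bool:
    """Open a TLS connection and attempt one websocket upgrade.

    The default path is an assumption (stock Phoenix LiveView socket).
    """
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    ctx = ssl.create_default_context()
    data = b""
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(build_upgrade_request(host, path, key))
            # Read until the end of the response headers.
            while b"\r\n\r\n" not in data:
                chunk = tls.recv(4096)
                if not chunk:
                    break
                data += chunk
    head = data.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    return upgrade_succeeded(head, key)
```

Calling `probe("my-app.example")` (a placeholder hostname) every few seconds right after a deploy, alongside a plain HTTP GET of the page, would show whether the upgrade fails while static pages succeed, which is what the screen recording shows in the browser.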

Any chance Chris McCord could take a look at this?