proxy error.message=["Undocumented"]

lanny.bose · September 8, 2021, 5:41pm

Hello Fly!

I’m getting some (concerning?) error messages rolling through my fly logs:

2021-09-08T17:29:59.342616396Z proxy[628be8a1] chi [error] error.code=1 error.message="Undocumented" request.method="GET" request.url="/socket/websocket?[PARAMS REDACTED]" response.status=502

My app lives in ewr right now but I’m getting the errors in chi and dfw, which is where most of my customers are located.

Is this a config issue? Websockets? Thanks for the helping hand.

Lanny

lanny.bose · September 8, 2021, 6:03pm

Hmm… so I’ll just rubber duck myself on this one.

My fix (for the moment) appears to be adjusting the concurrency limit in fly.toml. My Fly dashboard showed me maxed out at the copy-pasted hard limit of 25 because of persistent websocket connections, and I think it’s possible the Fly routing layer wasn’t letting anyone else through to my app server at that limit.

For the Fly folks, if that’s true I think it would be helpful to document the potential consequences of this (especially given how much Phoenix folks love their websockets) in the fly.toml docs.

kurt · September 8, 2021, 6:38pm

Normally if you’re hitting concurrency limits, you’ll see a message saying “Hard limit reached” in the logs. The undocumented error might be related to it being a websocket connection, though. If it errors after the upgrade, that could be confusing our error handling.

We should definitely expand the Phoenix docs with notes about websockets, that’s a good idea.

lanny.bose · September 8, 2021, 6:44pm

Kurt, thanks for replying!

My production app is a pretty small business, so I feel like I’m unlikely to really run into (non-self-imposed) ceilings here. And, I don’t want to foot-gun myself again on this.

I can see some potential outs here:

1. Actually set up autoscaling so if I do hit the limit that traffic will have somewhere to go
2. Remove the limits altogether and let a container being out of memory (??) be my limit
3. Set a limit that’s kind of preposterously high (for my customer base at the moment) and have autoscaling be my last resort in case Elon mentions me on Twitter or something

Do you have a recommendation there? Or at least some metrics to watch as I plot next moves?

kurt · September 8, 2021, 6:48pm

I’d go for option 3. Even small Elixir VMs can handle ~500 concurrent connections, so setting the hard limit to 500 or so seems safe. Preposterously high would be like 2500.

Autoscaling is slow to start Elixir nodes so it probably won’t help much right now.

lanny.bose · September 8, 2021, 7:43pm

What’s the difference in behavior between soft cap and hard cap? Is it something like soft cap = spin up a new instance, hard cap = start rejecting requests?

kurt · September 8, 2021, 8:36pm

Yes that’s it exactly. Scaling is metrics based, so it’s a little lagged, but if your VMs average more than the soft limit it’ll scale up. So you could set a lowish soft cap, then a high hard cap to give it a buffer while it scales.
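
For anyone landing here later, a minimal sketch of what that could look like in fly.toml, assuming the `[services.concurrency]` section from the standard generated config. The hard limit of 500 comes from the suggestion above; the soft limit of 200 and the internal port are illustrative assumptions, not values given in this thread.

```toml
[[services]]
  internal_port = 4000      # assumption: Phoenix's default port; use your app's port
  protocol = "tcp"

  [services.concurrency]
    type = "connections"    # count open connections, which includes long-lived websockets
    soft_limit = 200        # assumption: a lowish value for scaling to react to
    hard_limit = 500        # suggested above as safe for a small Elixir VM
```

The gap between the two values is the buffer: connections keep being accepted between the soft and hard limits while scaling catches up.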

jerome · September 8, 2021, 8:43pm

“undocumented” errors are just “undocumented”. There are many scenarios where this can happen.

Can you provide the request.id of that log line you pasted? I might be able to dig into it and find out what the actual error was and perhaps document it

lanny.bose · September 8, 2021, 9:02pm

Replied via DM

kurt · September 9, 2021, 12:37am

For future reference, request IDs are (intentionally) safe to share.

lanny.bose · September 9, 2021, 2:09am

Hah, fair. Thanks again, Kurt!
