Error 2003 in a low usage application (paid)

I’ve noticed an uptick in errors when manually probing my production instance.

The logs state:

Error 2003: App connection closed before request/response completed

I can not see this error reported in the metrics panel under HTTP errors, but it is in the logs. I came across this issue in another thread OOM errors on free tier on a bare application. - Questions / Help - and attempted the suggested fix.

I am not on the free tier. Following @\kurt’s advice, I upped the memory and the issue persists. I have reviewed my application memory usage and it has historically been less than half (it appears) of the 256MB it was allocated. I am now at 512MB and the issue persists and the app does not feel any faster to respond. What steps can be taken on the fly side to help better understand how to prevent this issue?

Also, if this is not clear, it is not every request. I am just trying to understand why this might happen and if its just ‘the weather’ or something that requires metrics or other monitoring solutions.

If your app is running a single instance crashing (an OOM is one reason to crash), it could be responsible for that kind of error. That serves a 503 or 504 status code, I think.

You can see crashing VMs by running:

fly status --all

Look for VMs in the failed state or with more than 0 restarts. If you see one, run:

fly vm status <id>

This will tell you why it crashed. The exit code will typically give you some information, and you may see a message about an OOM.

You can also run this command to see logs specific to that instance:

fly logs -i <id>

If VMs aren’t crashing, I would debug that endpoint directly. That could mean the app’s HTTP server is timing out and closing the connection before it sends a response, or closing the connection before it reads the whole request.

Thanks for these details, will add them to the runbook. I notice these errors tend to occur after long delays in requests.

I followed the instructions you shared @kurt, thank you.

Here are my findings. First I can see the errors from something called proxy I can also see my application logs interleaved.

I assume this proxy sits in front of my application and is attempting to serve this request and failing? When the application receives it no errors or panics are reported.

error.message="App connection closed before request/response completed" 2022-03-25T15:52:24Z proxy[aa23b9a1] lax [error]error.code=2003 request.method="GET" request.url="/ms/pull-request/brewed-on-my-machine""01FZ0WEKMMP14N1YXPJ2QJZGXC-lax" response.status=502
2022-03-25T16:56:59Z app[aa23b9a1] mia [info]{"level":"debug","service":"enterpriseweb","ts":1648227419,"caller":"","message":"scope=pull request order_name=brewed on my machine"}
error.message="App connection closed before request/response completed" 2022-03-25T16:57:04Z proxy[aa23b9a1] dfw [error]error.code=2003 request.method="GET" request.url="/ms/pull-request/brewed-on-my-machine/enter""01FZ1050V9KSDHP4RKEWKPD9YS-dfw" response.status=502
2022-03-25T16:57:12Z app[aa23b9a1] mia [info]{"level":"debug","service":"enterpriseweb","ts":1648227432,"caller":"","message":"scope=pull request order_name=brewed on my machine"}

I believe it can be reproduced by hitting this public URL: → 502 → works

Yes that’s right, those errors are from our proxy layer, and the others are from your app. We set a Fly-Request-Id header you can use to match application logs to proxy logs. You’ll just need to grab it and output it to see.

Those are the same URLs, they both give a 502 for me. Given that it’s relatively quick, my best guess is that the application is closing the connection before it sends a response. If you can make your app log both request start and request finished it might expose the issue.

Awesome at least it is somewhat consistent. I’ll pepper it with logs since its on my side. Thanks.

1 Like