Logfile spammed: error=fly-proxy-p2p/tls/tcp-backhaul: unexpected end of file

Since the last deployment, and even after a rollback, we’re seeing the log spammed with:

ams [error]
error.message="could not proxy TCP data to/from instance: failed to copy (direction=client->server, op=read, error=fly-proxy-p2p/tls/tcp-backhaul: unexpected end of file)"

Machine is e784961b297338 in ams. The application itself seems to be running (it produces log output) but is not functional.

Update

According to the Grafana logs, the issue started on 2026-04-09 02:00:00.290 and has been present ever since.

Restoring an older config that used to work does not fix the issue. Any support would be greatly appreciated.

/cc @Lilian


Just a couple of suggestions: scale up your app to add a new instance in the same region; do the logs appear on the new instance? If that does not help, scale up your app to add a new instance in a different region. These are guesses, I admit, but it’s better to try something while waiting for bug-tracing support.

:wave: Is your app still down? From my side it seems like your public URLs are working now, but I don’t know if any specific functionality is broken that’s relevant here.

These logs you have posted can mean a variety of things, ranging from “very bad” to “not a concern at all”. This is because these logs are from the raw TCP (or TLS without HTTP) handler, and with TCP connections we don’t really have much insight into what’s actually going on inside of them. For example, if you are serving HTTP, many clients do not shut down their TCP connections properly (with shutdown()) and instead just close() them; when these connections are served without our http handler, we would report them as TCP EOF errors like shown above.
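To make the shutdown() vs. close() distinction concrete, here is a minimal Python sketch of what a clean client-side close could look like. The helper name close_gracefully and the timeout value are made up for illustration; this is not how our proxy or your app is implemented, just the general socket pattern.

```python
import socket


def close_gracefully(sock: socket.socket, timeout: float = 5.0) -> bytes:
    """Half-close the write side, drain whatever the peer still sends, then close.

    Hypothetical helper for illustration only.
    """
    sock.settimeout(timeout)
    # Signal "no more data from me" to the peer before tearing the socket down.
    sock.shutdown(socket.SHUT_WR)
    leftover = bytearray()
    try:
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # empty read: the peer has finished sending too
                break
            leftover += chunk
    finally:
        sock.close()
    return bytes(leftover)
```

A client that skips the shutdown() and drain steps and calls close() directly can leave the other side with an abrupt end of stream, which a raw TCP handler has no way to tell apart from a real failure.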

Looks like we have a zombie machine:

fly m stop e784961b297338
Sending kill signal to machine e784961b297338...
e784961b297338 has been successfully stopped

fly m list
1 machines have been retrieved from app evcc.
View them in the UI here (https://fly.io/apps/evcc/machines/)


evcc
 ID             │ NAME             │ STATE 
 e784961b297338 │ divine-dawn-3243 │ started

The app is working, but the log is still swamped.

Since the stop “succeeds”, it seems like there’s an issue with the internal Fly control plane?

@PeterCxy appreciate your answer, however: we hadn’t made any application changes for months when we started seeing this in the logs.

This doesn’t need a configuration change to happen. All you need is a random client trying to connect to your published TCP port which doesn’t do shutdown()s cleanly. Your app does have a TCP + TLS port exposed without using our http handler, so that does check out.

About the machine: your app has only one machine, so any incoming request or connection will automatically start it. That is why your machine comes back almost immediately even after you stop it manually. If you don’t want autostart, you need to set auto_start_machines = false for the service in your fly.toml.
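As a rough sketch, the autostart setting lives in the service section of fly.toml, where the key is spelled auto_start_machines. The port numbers below are placeholders, not your app's actual values, so adapt them to your existing [[services]] block rather than copying this as-is.

```toml
# Hypothetical fragment: port values are placeholders for illustration.
[[services]]
  internal_port = 8080          # assumed; use the port your app listens on
  protocol = "tcp"
  auto_start_machines = false   # do not wake a stopped machine on incoming connections

  [[services.ports]]
    port = 443                  # assumed public port
    handlers = ["tls"]          # TLS without the http handler, as in this thread
```

Keep in mind that with autostart off and only one machine, incoming connections will fail while that machine is stopped.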
