Is the `sin` region of Fly experiencing networking problems? I'm getting random TLS handshake timeouts

vasallius · November 29, 2025, 4:37pm

health checks suddenly started getting TLS handshake timeouts and then suddenly were working again

this was one of the most perplexing bugs for me ever

vasallius · November 29, 2025, 4:57pm

random tls handshake timing out, what is happening?

vasallius · November 29, 2025, 4:59pm

this has happened in the past, yet it's not reported on the status page?

PeterCxy · November 29, 2025, 5:02pm

Is it possible to run an mtr or traceroute from your health checks when they detect a handshake timeout? This kind of issue tends to be isolated to single ISPs and we can’t really catch all of them on our side. We do know that our platform itself seems to be okay right now in sin.
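A minimal sketch of what that could look like, assuming the health check is a Python script and that `mtr` or `traceroute` is installed on the machine running it. The function names (`pick_trace_command`, `tls_handshake_ok`, `check_and_trace`) are hypothetical and not part of any Fly tooling; the idea is only to capture a route trace at the moment the handshake fails.

```python
import shutil
import socket
import ssl
import subprocess


def pick_trace_command(host, which=shutil.which):
    """Return an mtr or traceroute command for host, or None if neither is installed."""
    if which("mtr"):
        # --report runs a fixed number of cycles and prints a summary instead of a live view
        return ["mtr", "--report", "--report-cycles", "5", host]
    if which("traceroute"):
        return ["traceroute", host]
    return None


def tls_handshake_ok(host, port=443, timeout=5.0):
    """Return (True, None) if the TLS handshake completes, else (False, error)."""
    context = ssl.create_default_context()
    try:
        # The socket timeout set here also applies to the handshake in wrap_socket
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return True, None
    except OSError as exc:  # timeouts, ssl.SSLError and connection errors are all OSError
        return False, exc


def check_and_trace(host):
    """Return None when the handshake succeeds, otherwise a report that includes a trace."""
    ok, err = tls_handshake_ok(host)
    if ok:
        return None
    cmd = pick_trace_command(host)
    if cmd is None:
        return f"handshake failed ({err!r}); no mtr/traceroute installed"
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=120)
    return f"handshake failed ({err!r})\n{result.stdout}"
```

Calling `check_and_trace("5m-vm.fly.dev")` from the check loop and logging any non-`None` result would give a trace taken at the time of the failure, which is easier to act on than one collected afterwards.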

vasallius · November 29, 2025, 5:03pm

what confuses me is that I'm able to access the endpoint in my browser across different devices now, and even using curl, but a Fly machine somehow cannot access a different Fly machine

vasallius · November 29, 2025, 5:03pm

I’ve also seen this endpoint be inaccessible from device A but accessible from device B

same browser, same internet connection

vasallius · November 29, 2025, 5:04pm

and for me it's just really, really perplexing for the health checks to behave like that

PeterCxy · November 29, 2025, 5:04pm

Ok, then that sounds like a different problem. I had the impression that your health checks were running from outside Fly.

In this case it could be a single-host issue affecting the host running your check machine. Can you share the name of your app doing these checks?

vasallius · November 29, 2025, 5:04pm

and it’s frustrating because I know this has happened in the past and somehow it’s back again

ive also tried deploying on multiple

vasallius · November 29, 2025, 5:05pm

the name of the app is control-vm, which is hitting example machines: 1d-vm, 4h-vm, 1h-vm, etc

I've configured it to hit different endpoints from a different provider for now because nothing is happening

vasallius · November 29, 2025, 5:06pm

the initial image is from an app called fk-me (sorry); those are logs from health checks being performed by Fly

vasallius · November 29, 2025, 5:08pm

please let me know how else i can help debug, i really wanna deploy my platform on fly

vasallius · November 29, 2025, 5:15pm

this is also confusing because it’s saying not reachable, but as you can see green dot + last health check is passing

vasallius · November 29, 2025, 5:21pm

last concrete example

(control-vm.fly.dev) pinging → 5m-vm.fly.dev

and getting TLS TIMEOUT but i am perfectly able to hit that endpoint

something must be wrong in fly networking (i think) @PeterCxy

pic1: log of control-vm pinging 5m-vm.fly.dev and timing out

pic2: i am able to hit that endpoint perfectly fine

vasallius · November 29, 2025, 5:24pm

Finally reported in status haha

I guess I am 3 for 3 in reporting issues not yet on the status page

I'm glad I'm not crazy, though it really concerns me to run production apps on Fly

PeterCxy · November 29, 2025, 5:38pm

For more context, this only affects IPv6 addresses newly assigned in the past little while, which is also why it didn’t get caught on our side. We do have alerts for networking problems in general, but this one is a little bit weird. We’ll definitely need to include newly-assigned IPs in alerts going forward.
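Since the problem was tied to specific IPv6 addresses, one way to tell this kind of failure apart from a general outage is to attempt the TLS handshake against each resolved address separately instead of letting the client pick one. This is a rough sketch under that assumption; the helper names (`group_by_family`, `handshake`, `probe_both_families`) are made up for illustration, and the thread does not confirm which address family the failing checks actually used.

```python
import socket
import ssl


def group_by_family(addrinfos):
    """Split getaddrinfo() results into {'ipv4': [...], 'ipv6': [...]} address lists."""
    groups = {"ipv4": [], "ipv6": []}
    for family, _type, _proto, _canon, sockaddr in addrinfos:
        if family == socket.AF_INET:
            key = "ipv4"
        elif family == socket.AF_INET6:
            key = "ipv6"
        else:
            continue  # ignore any other address family
        if sockaddr[0] not in groups[key]:
            groups[key].append(sockaddr[0])
    return groups


def handshake(address, server_name, port=443, timeout=5.0):
    """Try a TLS handshake against one specific IP address, using server_name for SNI."""
    context = ssl.create_default_context()
    try:
        with socket.create_connection((address, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=server_name):
                return "ok"
    except OSError as exc:  # timeouts, TLS errors, unreachable networks
        return f"failed: {exc!r}"


def probe_both_families(host, port=443):
    """Return {ip_address: result} for every IPv4 and IPv6 address the host resolves to."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    groups = group_by_family(infos)
    return {addr: handshake(addr, host, port)
            for addrs in groups.values() for addr in addrs}
```

If the IPv4 addresses report `ok` while an IPv6 address times out, that would be consistent with the explanation above and would also explain why one client can reach the endpoint while another cannot.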

vasallius · November 29, 2025, 5:40pm

all good, stubbornly waiting for the fix even though it's already ~2am here

i still greatly love fly - just a bit more concerned with instability / unreliability

vasallius · November 29, 2025, 6:00pm

looks like it is fixed!!!