Slow response times since yesterday

Hi,

I have been getting consistently slow response times from both my apps since yesterday. Nothing changed on my end, so I am puzzled about why this suddenly started happening. I checked that my app and the DB are still in the same region (and I am requesting from the same region), and I put server timings on a route to check whether it was the application, but it reports an 18ms response time. So I am unsure where the extra 800ms is coming from.

My setup is NodeJS to Postgres and the route above is just a simple DB call to fetch some data.

I've noticed that even my NextJS frontend calls (which is a different app) are suffering from the slowdown too. So something has happened to both apps.

Anything else I can debug or check?
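
One way to narrow down where the extra time goes is to compare the total round-trip time seen by the client with the time the server reports for itself. The following is only a minimal sketch: it assumes the route exposes its timings in a standard `Server-Timing` response header, that the reported metrics do not overlap (so their `dur` values can be summed), and the header value `db;dur=18` used in the example is hypothetical.

```python
import re
import time
import urllib.request


def server_timing_ms(header: str) -> float:
    """Sum the dur= values (milliseconds) found in a Server-Timing header value."""
    total = 0.0
    for metric in header.split(","):
        match = re.search(r";\s*dur=([0-9.]+)", metric)
        if match:
            total += float(match.group(1))
    return total


def network_overhead_ms(total_ms: float, header: str) -> float:
    """Time the server's own timings do not account for (network, proxy, TLS, ...)."""
    return total_ms - server_timing_ms(header)


def measure(url: str):
    """Fetch url once; return (total round-trip ms, ms not explained by the server)."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=10) as resp:
        resp.read()
        header = resp.headers.get("Server-Timing", "")
    total_ms = (time.perf_counter() - start) * 1000
    return total_ms, network_overhead_ms(total_ms, header)
```

With the numbers from this post, `network_overhead_ms(818.0, "db;dur=18")` gives `800.0`: the application accounts for 18ms, and the remaining 800ms is spent somewhere between the client and the app.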

I have been suffering from the same all day.

At one point today I was getting near-constant

2023-10-17T16:02:50.830 app[****************] ord [info] 16:02:50.829 [error] Postgrex.Protocol (#PID<0.1998.0>) failed to connect: ** (DBConnection.ConnectionError) tcp recv (idle): closed

2023-10-17T16:02:51.929 app[****************] ord [info] 16:02:51.928 [error] Postgrex.Protocol (#PID<0.1993.0>) timed out because it was handshaking for longer than 15000ms

errors returned from the app (I was able to connect to the db from outside the app just fine throughout). I restarted the DB machine and that went away but the app is still very slow.

I also have the can't-deploy problem as detailed here: Unable to deploy due to builder 500 error

Like you, no changes at our end to the app prior to this starting.

I did a bit of digging on this and I am not sure if it is a Fly.io issue specifically.

First off, using my cellular connection I did not have the lagging responses.

I noticed that even my connection (on my ISP) to fly.io was slow, that is, the loading of the actual website and assets.

Out of curiosity I did a traceroute to my app and found the packets bounced around Singapore and Australia before making it back to me! (I am in North America, along with my app.)

The traceroute to fly.io did exactly the same thing. I think that would explain the long response times I'm getting on my apps. It looks like my ISP thinks the quickest way of getting to any fly subdomain is by going all around the world.

Even firing a simple ping to fly.io gives me 300ms latency whereas any other site is <10ms.
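
The ping comparison above can be roughly reproduced without ICMP by timing TCP connections instead. This is a sketch under stated assumptions, not an exact substitute for `ping`: a TCP connect takes about one network round trip, the measurement also includes DNS resolution, and the helper names `tcp_connect_ms` and `median_connect_ms` are made up for illustration.

```python
import socket
import time


def tcp_connect_ms(host: str, port: int = 443, timeout: float = 5.0) -> float:
    """Time one TCP connection to host:port in milliseconds (includes DNS lookup)."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection established; close it immediately
    return (time.perf_counter() - start) * 1000


def median_connect_ms(host: str, port: int = 443, samples: int = 5) -> float:
    """Median of several connect times, to smooth out one-off spikes."""
    times = sorted(tcp_connect_ms(host, port) for _ in range(samples))
    return times[len(times) // 2]
```

Running `median_connect_ms("fly.io")` next to the same call for an unrelated site on the same connection shows whether the extra latency is specific to the route toward Fly.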

I'm not enough of a network engineer to understand why the route suddenly changed, but I'm guessing my ISP messed up.

I'm so glad I'm not the only one experiencing this; I thought I'd been going crazy seeing these large latency spikes! All the exact same symptoms as described here (>200ms pings to my Fly-hosted services, as well as fly.io itself), with really weird routes in traceroute. At one point I was seeing packets routed through China. I first started seeing this behaviour about a week ago.

One of my temporary solutions has just been to allocate new IP addresses until I get one with a normal latency (i.e. <10ms), but even after doing that a couple of times, they've all ended up back on these cross-continental network routes after a couple of days.

Best solution thus far has definitely just been hopping on a VPN or cellular connection. That usually gets me a good enough route (<50ms), but it's still not optimal.

I originally thought it would have to be something at my ISP's level if I was the only one experiencing this, but if others are seeing it too, then... who knows. I'm no network guru.

Though, it does kind of defeat the purpose of using Fly to "launch apps near my users" if the network trip is halfway around the world :joy:

What Fly-Region is returned if you visit:

https://debug.fly.dev/

Mine is currently being returned as 'mel', which is Melbourne, Australia, although I'm in Orlando, FL. I am currently debugging this with support, and we haven't ruled out my ISP, which is Spectrum.
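
For anyone who wants to check this repeatedly, the lookup can be scripted. This is a rough sketch that assumes the debug page returns plain text containing the request headers as `Name: value` lines, including a `Fly-Region` line; that format is an assumption and is not confirmed in this thread. The function names are made up for illustration.

```python
import re
import urllib.request


def find_fly_region(body: str):
    """Return the value of a 'Fly-Region: xxx' line in body, or None if absent."""
    match = re.search(
        r"^\s*fly-region\s*:\s*(\S+)", body, re.IGNORECASE | re.MULTILINE
    )
    return match.group(1).lower() if match else None


def fetch_fly_region(url: str = "https://debug.fly.dev/"):
    """Fetch the debug page and extract the region that handled the request."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        body = resp.read().decode("utf-8", errors="replace")
    return find_fly_region(body)
```

If `fetch_fly_region()` returns a region far from where you are (such as `mel` from North America), the request is entering Fly's network on another continent, which matches the traceroutes reported above.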

Also getting 'mel' at https://debug.fly.dev/

I am in Ontario with Bell.

Thanks for debugging this with support! I tried to get in contact with my ISP but did not get far.

Thank y'all for the reports! We've started an incident internally and will post updates to Fly.io Status - Requests incorrectly routing to mel region.

With the resolution of Fly.io Status - Requests incorrectly routing to mel region, it does indeed seem fixed for me. Getting the proper yul region for https://debug.fly.dev/ and latency/traceroutes to my services look normal again.

Coincidentally, I'm also in Ontario with Bell :thinking:

Thanks @JP_Phillips and the Fly team for fixing this so quickly!
