Traceroute hops to Seattle for users in South Korea and fly region Narita

echoi · December 26, 2024, 8:59am

Hi,

I noticed that a simple POST request to my server in NRT (I’m in South Korea) is taking significantly longer than I expected, so I ran a traceroute.

traceroute to my-app.fly.dev, 64 hops max, 52 byte packets
[first 7 hops redacted - local network]

 8  112.174.80.174  163.839 ms  162.803 ms  163.766 ms
 9  te-0-11-0-3-6-pe01.seattle.wa.ibone.comcast.net (66.208.228.45)  169.728 ms  151.755 ms  166.945 ms
10  be-2301-cs03.seattle.wa.ibone.comcast.net (96.110.39.225)  164.305 ms
    be-2201-cs02.seattle.wa.ibone.comcast.net (96.110.39.205)  145.163 ms
    be-2401-cs04.seattle.wa.ibone.comcast.net (96.110.39.229)  145.101 ms
11  be-2413-pe13.seattle.wa.ibone.comcast.net (96.110.44.94)  143.716 ms
    be-2313-pe13.seattle.wa.ibone.comcast.net (96.110.44.90)  119.933 ms
    be-2113-pe13.seattle.wa.ibone.comcast.net (96.110.44.82)  141.594 ms
12  96-87-9-102-static.hfc.comcastbusiness.net (96.87.9.102)  136.545 ms  162.770 ms  158.224 ms

I’m not an expert at this, but it seems it’s hopping to Seattle for some reason.

The domain is registered on Cloudflare and is used for a single-region Fly app in NRT.

Is there something I can do on fly.io to improve the route?

anuragbhatia · December 31, 2024, 11:51pm

This route didn’t make much sense routing-wise. I tried to cross-check, and the latest trace I see from Korea Telecom (AS4766) is:

traceroute to 77.83.140.34 (77.83.140.34), 20 hops max, 60 byte packets
1 192.168.0.1 (192.168.0.1) 0.425 ms 0.448 ms
2 * *
3 112.188.61.105 (112.188.61.105) 2.290 ms 2.308 ms
4 112.188.53.29 (112.188.53.29) 2.231 ms 2.249 ms
5 112.174.47.49 (112.174.47.49) 9.314 ms 9.332 ms
6 112.174.86.154 (112.174.86.154) 9.571 ms 9.587 ms
7 63-222-57-229.static.as3491.net (63.222.57.229) 34.190 ms 34.187 ms
8 Hu0-0-1-0.br06.tok02.as3491.net (63.218.250.22) 33.734 ms 33.787 ms
9 * *
10 * *
11 103.84.154.10 (103.84.154.10) 38.008 ms 37.968 ms
12 77.83.140.34 (77.83.140.34) 36.826 ms *

It seems fine now. Can you please re-check?
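
For anyone comparing traces like the two in this thread, a small script can pull out the best round-trip time per hop and show where the latency steps up. This is only a minimal sketch: the function names `best_rtt_per_hop` and `largest_jump` are made up for illustration, the sample strings are abbreviated copies of the traces posted above, and the parsing assumes the common `N  host (ip)  x ms` traceroute layout.

```python
import re

# A hop line starts with the hop number followed by whitespace;
# continuation lines (extra paths reported for the same hop) do not.
HOP_LINE = re.compile(r"^\s*(\d+)\s+\S")
# A round-trip time is a number followed by whitespace and "ms".
RTT = re.compile(r"(\d+(?:\.\d+)?)\s+ms\b")


def best_rtt_per_hop(trace):
    """Map hop number -> lowest RTT (ms) reported for that hop."""
    best = {}
    hop = None
    for line in trace.splitlines():
        match = HOP_LINE.match(line)
        if match:
            hop = int(match.group(1))
        if hop is None:
            continue  # header or note before the first hop
        for value in RTT.findall(line):
            rtt = float(value)
            best[hop] = min(rtt, best.get(hop, rtt))
    return best


def largest_jump(best):
    """Return (previous_hop, hop) with the biggest RTT increase.

    Needs at least two hops that answered; silent hops ("* *") are skipped.
    """
    hops = sorted(best)
    return max(zip(hops, hops[1:]), key=lambda pair: best[pair[1]] - best[pair[0]])


# Abbreviated copy of the slow trace (first post).
SLOW_TRACE = """\
 8  112.174.80.174  163.839 ms  162.803 ms  163.766 ms
 9  te-0-11-0-3-6-pe01.seattle.wa.ibone.comcast.net (66.208.228.45)  169.728 ms  151.755 ms  166.945 ms
10  be-2301-cs03.seattle.wa.ibone.comcast.net (96.110.39.225)  164.305 ms
    be-2201-cs02.seattle.wa.ibone.comcast.net (96.110.39.205)  145.163 ms
"""

# Abbreviated copy of the later, healthy trace.
FAST_TRACE = """\
5 112.174.47.49 (112.174.47.49) 9.314 ms 9.332 ms
6 112.174.86.154 (112.174.86.154) 9.571 ms 9.587 ms
7 63-222-57-229.static.as3491.net (63.222.57.229) 34.190 ms 34.187 ms
8 Hu0-0-1-0.br06.tok02.as3491.net (63.218.250.22) 33.734 ms 33.787 ms
9 * *
"""

print(best_rtt_per_hop(SLOW_TRACE))
print(largest_jump(best_rtt_per_hop(FAST_TRACE)))
```

On the healthy trace the biggest step is between hops 6 and 7, where the path enters as3491.net. On the slow trace the first visible hop is already above 160 ms, so the long haul happened at or before that point.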

Maybe you tested when PCCW (AS3491) had broken connectivity to either side: Korea Telecom (AS4766) or NETACTUATE (AS36236, the upstream in Japan for fly.io).

Let’s cross-check AS3491, since they feed route collectors (RIPE RIS RRC01/19/23, with their full table). This shows the prefix has been stable, with not much change on the AS3491 side. So my only guess is that something inside AS4766 triggered it. I cannot be sure, since its full routes are not visible at a collector.

echoi · January 2, 2025, 2:12am

Yeah, it seems it was a temporary issue.

This came up while I was investigating why 0.2-0.5% of my users’ requests have either been timing out or have had unusually long response times, despite my server-side logs being fairly consistent, without spikes.
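
One way to check whether slow requests like these are network-side rather than server-side is to compare the time the client measured with the handler time the server logged for the same request. A minimal sketch, assuming the two can be joined on some request ID (not shown here). The `Sample` type, its field names, and the 100 ms threshold are all hypothetical, and the data in the usage lines is synthetic, not real measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    client_ms: float  # total request time as measured on the client
    server_ms: float  # handler time for the same request from server logs


def network_slow_fraction(samples, threshold_ms=100.0):
    """Fraction of requests whose time outside the server exceeds threshold_ms.

    client_ms - server_ms approximates everything the server log cannot see:
    DNS, TLS, queuing, and the network path in both directions.
    """
    if not samples:
        raise ValueError("need at least one sample")
    slow = sum(1 for s in samples if s.client_ms - s.server_ms > threshold_ms)
    return slow / len(samples)


# Synthetic example: 997 normal requests and 3 that spent ~170 ms off-server.
samples = [Sample(60.0, 20.0)] * 997 + [Sample(190.0, 21.0)] * 3
print(f"{network_slow_fraction(samples):.1%} of requests were slow outside the server")
```

If the slow fraction is non-zero while `server_ms` stays flat, the delay is being added somewhere between the user and the app, which is consistent with a routing detour.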

I’m fairly new to this topic. Is there anything I can do on my side to better point the domain at Fly, by any chance? Currently my app is only available in East Asia and is noticeably affected by delays over 100-150 ms.

anuragbhatia · January 2, 2025, 5:28pm

Routing angle

The Internet at large can be wild. These issues can occur anywhere, though they are more common in Asia, because in the US and EU you will generally find more small to mid-sized networks peered with each other. In case they are not, traffic goes via their upstream (often a tier 1 network). With one known exception at this point, all transit-free networks peer with each other, and hence indirect paths are not that long. In Asia, however, many large backbones are known not to be connected to each other, and hence any failure on a primary path can cause traffic to go all the way to the US.
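
To put a rough number on what "all the way to the US" costs, here is a back-of-the-envelope sketch. It assumes great-circle distances between city centres and light in fiber at about 200 km per millisecond (roughly two thirds of c). Real cable routes are longer and routers add delay, so these are lower bounds only. The city coordinates are approximate, and `min_rtt_ms` is just an illustrative helper name.

```python
import math

EARTH_RADIUS_KM = 6371.0
FIBER_KM_PER_MS = 200.0  # assumed propagation speed in fiber, about 2/3 of c

# Approximate (latitude, longitude) in degrees.
CITIES = {
    "Seoul": (37.57, 126.98),
    "Tokyo": (35.68, 139.65),
    "Seattle": (47.61, -122.33),
}


def great_circle_km(a, b):
    """Haversine distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def min_rtt_ms(*stops):
    """Lower bound on round-trip time along a path of named cities."""
    one_way_km = sum(
        great_circle_km(CITIES[x], CITIES[y]) for x, y in zip(stops, stops[1:])
    )
    return 2 * one_way_km / FIBER_KM_PER_MS


print(f"Seoul -> Tokyo direct:      {min_rtt_ms('Seoul', 'Tokyo'):.0f} ms")
print(f"Seoul -> Seattle:           {min_rtt_ms('Seoul', 'Seattle'):.0f} ms")
print(f"Seoul -> Seattle -> Tokyo:  {min_rtt_ms('Seoul', 'Seattle', 'Tokyo'):.0f} ms")
```

Even as a lower bound, the direct path comes out on the order of 10 ms, while any path that touches the US west coast costs many tens of milliseconds before a single router is counted.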

I’m fairly new to this topic. Is there anything I can do on my side to better point the domain at Fly, by any chance? Currently my app is only available in East Asia and is noticeably affected by delays over 100-150 ms.

Not much, unfortunately. You can steer your own traffic a bit by using other networks/overlays to reach a destination (when you know routing is bad), but for end users hitting your app, the app itself has to be reachable with decent latency. Your best case would be Fly launching a local region near you, which would have less chance of bad routing.

echoi · January 3, 2025, 2:28am

That… makes sense… as unfortunate as it is for me.

Thank you for the insight!
