I just deployed to the GRU region (Brazil) and found that when I access the app from within the region (I am in São Paulo, and GRU is within the São Paulo metropolitan area), I get routed via the US, resulting in access times of over 250 ms. It seems to be an IP peering issue between Fly.io (AS40509) and my access provider, Telefonica Brasil / Vivo (AS26599). As Telefonica Brasil is the largest access provider in Brazil (with tens of millions of residential and mobile subscribers), this should affect many end users, who then miss out on the Fly.io value proposition of low latency.
I am not a network specialist, but I would recommend that Fly.io consider peering at the major Brazilian internet exchange point, IX.br. It has many regional peering locations, but peering at the São Paulo location, and maybe also at Fortaleza, may instantly improve peering in Brazil with literally thousands of small, medium and large autonomous systems (AS).
Unfortunately, my access provider Telefonica Brasil / Vivo (AS26599) does not participate in the internet exchange point, so a private peering agreement would still be necessary for a good connection to the largest Brazilian internet provider.
The issue may not seem core to Fly.io, but without good IP peering, Brazilian end users do not benefit from the GRU region, as they get routed via the US and back to GRU.
Part of my traceroute follows. I would also be happy to help if any local assistance is required.
traceroute to billowing-resonance-7230.fly.dev (109.105.216.97), 64 hops max, 52 byte packets
1 192.168.0.1 (192.168.0.1) 4.939 ms 0.942 ms 0.994 ms
2 * * *
3 201-1-224-0.dsl.telesp.net.br (201.1.224.0) 17.956 ms 6.768 ms 12.084 ms
4 152-255-171-223.user.vivozap.com.br (152.255.171.223) 456.744 ms *
152-255-158-41.user.vivozap.com.br (152.255.158.41) 3.948 ms
5 * * *
6 ge-3-0-2-3606-gralimli4.net.telefonicaglobalsolutions.com (213.140.50.198) 7.491 ms 8.687 ms 5.998 ms
7 5.53.3.143 (5.53.3.143) 117.451 ms 119.090 ms
94.142.98.157 (94.142.98.157) 116.845 ms
8 94.142.118.184 (94.142.118.184) 118.040 ms 119.063 ms 117.173 ms
9 ae-12.sayonara-dorian.r04.miamfl02.us.bb.gin.ntt.net (129.250.9.85) 120.356 ms 120.457 ms 120.408 ms
…
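One way to read a trace like the one above is to look for the hop where the round-trip time jumps. The following is a minimal Python sketch, not a definitive tool: it assumes the classic text format of `traceroute` shown in this thread, takes the lowest RTT per hop to dampen outliers, and reports the largest increase between consecutive responding hops. The function names `min_rtt_per_hop` and `largest_jump` are made up for illustration.

```python
import re

# A hop line starts with the hop number followed by whitespace.
# Continuation lines (extra routers for the same hop) have no hop number.
HOP_RE = re.compile(r"^\s*(\d+)\s+(.*)$")
# An RTT sample looks like " 117.451 ms"; the leading whitespace keeps
# digits inside hostnames from being picked up.
RTT_RE = re.compile(r"\s(\d+(?:\.\d+)?) ms")


def min_rtt_per_hop(traceroute_text):
    """Map hop number -> lowest RTT in ms, skipping hops that never answered."""
    hops = {}
    current = None
    for line in traceroute_text.splitlines():
        match = HOP_RE.match(line)
        if match:
            current = int(match.group(1))
            rest = " " + match.group(2)
        else:
            rest = " " + line
        if current is None:
            continue  # header line before the first hop
        samples = [float(value) for value in RTT_RE.findall(rest)]
        if not samples:
            continue  # "* * *" or a line without timings
        if current in hops:
            samples.append(hops[current])
        hops[current] = min(samples)
    return hops


def largest_jump(hops):
    """Return (from_hop, to_hop, increase_ms) for the biggest RTT increase."""
    ordered = sorted(hops.items())
    best = None
    for (hop_a, rtt_a), (hop_b, rtt_b) in zip(ordered, ordered[1:]):
        increase = rtt_b - rtt_a
        if best is None or increase > best[2]:
            best = (hop_a, hop_b, increase)
    return best
```

Applied to hops 1 to 9 of the trace above, this sketch points at the step from hop 6 to hop 7, where the lowest RTT goes from about 6 ms to about 117 ms. That is where the path appears to leave Brazil.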
I’m having the exact same problem. The app was deployed to the GRU region, and the app’s dashboard on Fly.io states that it is there, but ping is quite high, around 250 ms. When accessing https://debug.fly.dev/ I get served from region MIA. Extremely disappointing, since the main point of Fly.io is deploying close to users.
Sure, no problem. I redacted the IP address and URL of my server for security reasons. Do you need them? If you do, can I share them with you privately in some way?
traceroute to [REDACTED for SECURITY] ( [REDACTED for SECURITY] ), 30 hops max, 60 byte packets
1 192.168.0.1 (192.168.0.1) 4.334 ms 5.371 ms 5.353 ms
2 10.63.96.1 (10.63.96.1) 16.682 ms 23.559 ms 25.728 ms
3 c91108f9.rjo.static.virtua.com.br (201.17.8.249) 27.447 ms 28.686 ms 28.639 ms
4 c911054e.virtua.com.br (201.17.5.78) 28.622 ms 27.348 ms 30.890 ms
5 embratel-H0-2-0-1-agg01.rjonbf.embratel.net.br (200.179.69.121) 27.772 ms 27.747 ms 26.529 ms
6 200.244.19.151 (200.244.19.151) 136.723 ms 131.553 ms 133.960 ms
7 ebt-B12151-intl01.atl.embratel.net.br (200.230.220.226) 143.832 ms 144.035 ms 136.172 ms
8 ix-hge-0-0-0-11.ecore1.a56-atlanta.as6453.net (64.86.9.93) 243.855 ms * *
9 * * *
10 * ae-7.r23.atlnga05.us.bb.gin.ntt.net (129.250.4.192) 231.671 ms *
11 * * *
12 * * ae-0.a02.asbnva02.us.bb.gin.ntt.net (129.250.5.190) 208.833 ms
13 * * *
14 * * *
15 * * *
16 * * *
17 * * *
18 * * *
19 * * *
20 * * *
21 * * *
22 * * *
23 * * *
24 * * *
25 * * *
26 * * *
27 * * *
28 * * *
29 * * *
30 * * *
I’ve just checked: on my phone’s mobile connection, which uses another ISP, I get the proper region, but through the office network I still get routed through MIA.
traceroute to debug.fly.dev (2a09:8280:1:763f:8bdd:34d1:c624:78cd), 30 hops max, 80 byte packets
1 2804:14d:5c54:5d34:6802:b8ff:fef7:a3de (2804:14d:5c54:5d34:6802:b8ff:fef7:a3de) 6.367 ms 7.966 ms 7.950 ms
2 * * *
3 2804:14d:5c00:65::1 (2804:14d:5c00:65::1) 30.171 ms 30.155 ms 30.138 ms
4 2804:a8:2:b0::1aba (2804:a8:2:b0::1aba) 27.742 ms 27.727 ms 25.555 ms
5 2804:a8:2:b0::1ab9 (2804:a8:2:b0::1ab9) 28.796 ms 28.781 ms *
6 * * *
7 * * *
8 2001:550:2:19::67:1 (2001:550:2:19::67:1) 216.844 ms 216.822 ms 216.800 ms
9 2001:550:2:4a::10:3 (2001:550:2:4a::10:3) 216.779 ms * 2001:550:2:19::67:1 (2001:550:2:19::67:1) 216.736 ms
10 2001:550:2:4a::10:3 (2001:550:2:4a::10:3) 135.861 ms 137.062 ms 2607:f740:57::5 (2607:f740:57::5) 136.854 ms
11 2607:f740:57::5 (2607:f740:57::5) 139.334 ms 139.314 ms *
12 * * *
13 * * *
14 * * *
15 * * *
16 * * *
17 * * *
18 * * *
19 * * *
20 * * *
21 * * *
22 * * *
23 * * *
24 * * *
25 * * *
26 * * *
27 * * *
28 * * *
29 * * *
30 * * *
This solved it for me. Both my app, deployed to GRU, and debug.fly.dev now have latencies of 10 to 25 ms, mostly around 15 ms. This is great. Thank you for your quick action.
Just for context: I am within the GRU metropolitan area, and my internet provider is Telefonica Brasil / Vivo (AS26599), over IPv4.
It occurred to me that it might be interesting for Fly.io users to learn how to integrate RUM (real user monitoring) into their apps to check whether all users are served from the closest region. I guess that with Anycast routing things can change dynamically, so users served from the closest region today might be served from a distant region tomorrow. Monitoring this would be valuable, and with RUM app owners can verify for themselves that all users are served with low latency.
You could compare the FLY_REGION environment variable from your instance with the fly-request-id header, which contains (at the end) the edge region your user reached.
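As a rough illustration of that comparison, here is a minimal Python sketch. It assumes the edge region is the part of the fly-request-id value after the last hyphen (the post above only says it is "at the end"), and the helper names `edge_region` and `compare_regions` are hypothetical. The request ID used in the usage note is a made-up value, not a real one.

```python
import os


def edge_region(fly_request_id):
    """Return the text after the last hyphen of the header value, or None.

    Assumption: the edge region is the final hyphen-separated part.
    """
    if not fly_request_id or "-" not in fly_request_id:
        return None
    suffix = fly_request_id.rsplit("-", 1)[1].strip().lower()
    return suffix or None


def compare_regions(headers, environ=None):
    """Compare the instance region (FLY_REGION) with the edge region.

    `headers` is any mapping of request header names to values; names are
    matched case-insensitively. `environ` defaults to os.environ.
    """
    environ = os.environ if environ is None else environ
    lowered = {name.lower(): value for name, value in headers.items()}
    app = (environ.get("FLY_REGION") or "").lower() or None
    edge = edge_region(lowered.get("fly-request-id"))
    return {
        "app_region": app,
        "edge_region": edge,
        "same_region": app is not None and app == edge,
    }
```

Calling `compare_regions({"Fly-Request-Id": "01HYPOTHETICALID-mia"}, {"FLY_REGION": "gru"})` would report a mismatch between `gru` and `mia`. Logging that result per request, or sending it to a RUM backend, is one way to notice when users of a given ISP start landing on a distant edge.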
Thanks, I hope it works! Do you think Fly can do something so that every user gets routed to the proper region regardless of the client’s ISP, or is this something that will always have to be fixed on an ISP-by-ISP basis?
Hey @jerome, do you have any news to share about this problem? The ISP in question still routes to Miami, even though we’re in Brazil and should be routed to GRU. Latencies are around 250ms, and as this ISP is one of the largest in Brazil, a large number of our users would have a terrible user experience. Any update or information would be very welcome, as we’re currently unable to serve decent latencies to any user who uses this ISP.
Since the issue only affects IPv6, the easiest way around this would be to disable IPv6 on your end. For your app specifically, you can remove AAAA and CNAME records and only add an A record pointed at your allocated IPv4 address.
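After changing the records, it can help to confirm which address families the hostname still resolves to. The following Python sketch uses the standard library's `socket.getaddrinfo`. The function name `resolved_families` and the hostname in the usage note are placeholders, and the result reflects only what the local resolver returns at that moment (DNS caching may delay the change).

```python
import socket


def resolved_families(hostname, port=443, resolver=socket.getaddrinfo):
    """Return the set of address families ("IPv4", "IPv6") for a hostname.

    `resolver` is injectable so the function can be exercised without
    network access; by default it is socket.getaddrinfo.
    """
    families = set()
    for family, _type, _proto, _canonname, _sockaddr in resolver(
        hostname, port, type=socket.SOCK_STREAM
    ):
        if family == socket.AF_INET:
            families.add("IPv4")
        elif family == socket.AF_INET6:
            families.add("IPv6")
    return families
```

For example, `resolved_families("app.example.com")` returning a set without `"IPv6"` would suggest that clients no longer receive an AAAA answer for that name.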
We’re waiting on a peering request with Claro’s upstream AS (Autonomous System) which was forwarded to their local Brazil contact, but we haven’t had any news yet from them.
There is nothing we can do to make this routing better right now. As far as I understand, a lot of Claro customers’ IPv6 connections are getting routed to the wrong locations, not just those of Fly app users.