IP Routing for Brazilian region

Hello Fly.io team.

I just deployed to the GRU region (Brazil) and found that when I access the app from within the region (I am in São Paulo, and GRU is within the São Paulo metropolitan area), I get routed via the US, resulting in access times over 250 ms. It looks like an IP peering issue between Fly.io (AS40509) and my access provider, Telefonica Brasil / Vivo (AS26599). As Telefonica Brasil is the largest access provider in Brazil (with tens of millions of residential and mobile subscribers), this should affect many end users, who then miss out on the Fly.io value proposition of low latency.

I am not a network specialist, but I would recommend that Fly.io consider peering at the major Brazilian internet exchange point, IX.br. It has many regional peering locations, but peering at the São Paulo location, and maybe also at Fortaleza, could instantly improve connectivity in Brazil with literally thousands of small, medium, and large autonomous systems (AS).

Unfortunately, my access provider Telefonica Brasil / Vivo (AS26599) does not participate in the internet exchange point, so a private peering agreement would still be necessary for a good connection to the largest Brazilian internet provider.

This issue may not seem core to Fly.io, but without good IP peering, Brazilian end users do not benefit from the GRU region, because their traffic is routed to the US and back to GRU.

Part of my traceroute follows. I would also be happy to help if any local assistance is required.

traceroute to billowing-resonance-7230.fly.dev (109.105.216.97), 64 hops max, 52 byte packets
1 192.168.0.1 (192.168.0.1) 4.939 ms 0.942 ms 0.994 ms
2 * * *
3 201-1-224-0.dsl.telesp.net.br (201.1.224.0) 17.956 ms 6.768 ms 12.084 ms
4 152-255-171-223.user.vivozap.com.br (152.255.171.223) 456.744 ms *
152-255-158-41.user.vivozap.com.br (152.255.158.41) 3.948 ms
5 * * *
6 ge-3-0-2-3606-gralimli4.net.telefonicaglobalsolutions.com (213.140.50.198) 7.491 ms 8.687 ms 5.998 ms
7 5.53.3.143 (5.53.3.143) 117.451 ms 119.090 ms
94.142.98.157 (94.142.98.157) 116.845 ms
8 94.142.118.184 (94.142.118.184) 118.040 ms 119.063 ms 117.173 ms
9 ae-12.sayonara-dorian.r04.miamfl02.us.bb.gin.ntt.net (129.250.9.85) 120.356 ms 120.457 ms 120.408 ms
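
A jump like the one between hops 6 and 7 above can also be located programmatically. The following is a minimal Python sketch, not a standard tool: `hop_min_rtts` and `largest_jump` are hypothetical helper names, and taking the minimum RTT per hop is an assumed heuristic to damp outliers such as the 456 ms sample on hop 4. `SAMPLE` is an excerpt of the traceroute above.

```python
import re

# Matches RTT samples such as "117.451 ms" in a traceroute line.
RTT_PATTERN = re.compile(r"(\d+(?:\.\d+)?) ms")

# Excerpt of the traceroute posted above (hops 3 to 8).
SAMPLE = """\
3 201-1-224-0.dsl.telesp.net.br (201.1.224.0) 17.956 ms 6.768 ms 12.084 ms
4 152-255-171-223.user.vivozap.com.br (152.255.171.223) 456.744 ms *
152-255-158-41.user.vivozap.com.br (152.255.158.41) 3.948 ms
5 * * *
6 ge-3-0-2-3606-gralimli4.net.telefonicaglobalsolutions.com (213.140.50.198) 7.491 ms 8.687 ms 5.998 ms
7 5.53.3.143 (5.53.3.143) 117.451 ms 119.090 ms
94.142.98.157 (94.142.98.157) 116.845 ms
8 94.142.118.184 (94.142.118.184) 118.040 ms 119.063 ms 117.173 ms
"""


def hop_min_rtts(traceroute_text):
    """Return sorted (hop_number, min_rtt_ms) pairs for hops that reported RTTs."""
    hops = {}
    current_hop = None
    for line in traceroute_text.splitlines():
        line = line.strip()
        if not line or line.startswith("traceroute to"):
            continue
        first = line.split()[0]
        # A line starting with a number opens a new hop; any other line is
        # treated as a continuation of the previous hop.
        if first.isdigit():
            current_hop = int(first)
        if current_hop is None:
            continue
        rtts = [float(value) for value in RTT_PATTERN.findall(line)]
        if rtts:
            previous = hops.get(current_hop, float("inf"))
            hops[current_hop] = min(rtts + [previous])
    return sorted(hops.items())


def largest_jump(hops):
    """Return (prev_hop, next_hop, delta_ms) for the biggest RTT increase."""
    best = None
    for (hop_a, rtt_a), (hop_b, rtt_b) in zip(hops, hops[1:]):
        delta = rtt_b - rtt_a
        if best is None or delta > best[2]:
            best = (hop_a, hop_b, delta)
    return best


prev_hop, next_hop, delta = largest_jump(hop_min_rtts(SAMPLE))
print(prev_hop, next_hop, round(delta, 3))  # prints: 6 7 110.847
```

Hops that only report `* * *` are skipped, so the jump is measured between the nearest hops that actually answered.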

Accessing https://debug.fly.dev/ from São Paulo, Brazil, I was served from Miami, with a latency of 130 ms.

I’m having the exact same problem. The app was deployed to the GRU region, and the app’s dashboard on fly.io confirms that it is there, but ping is quite high, around 250 ms. When accessing https://debug.fly.dev/ I get served from region MIA. Extremely disappointing, since the main point of Fly is deploying close to users.

@Zacour Can you provide a traceroute please? We can likely fix this.

Sure, no problem. I redacted the IP address and URL of my server for security reasons. Do you need them? If so, can I share them with you privately in some way?

traceroute to [REDACTED for SECURITY] ( [REDACTED for SECURITY] ), 30 hops max, 60 byte packets
 1  192.168.0.1 (192.168.0.1)  4.334 ms  5.371 ms  5.353 ms
 2  10.63.96.1 (10.63.96.1)  16.682 ms  23.559 ms  25.728 ms
 3  c91108f9.rjo.static.virtua.com.br (201.17.8.249)  27.447 ms  28.686 ms  28.639 ms
 4  c911054e.virtua.com.br (201.17.5.78)  28.622 ms  27.348 ms  30.890 ms
 5  embratel-H0-2-0-1-agg01.rjonbf.embratel.net.br (200.179.69.121)  27.772 ms  27.747 ms  26.529 ms
 6  200.244.19.151 (200.244.19.151)  136.723 ms  131.553 ms  133.960 ms
 7  ebt-B12151-intl01.atl.embratel.net.br (200.230.220.226)  143.832 ms  144.035 ms  136.172 ms
 8  ix-hge-0-0-0-11.ecore1.a56-atlanta.as6453.net (64.86.9.93)  243.855 ms * *
 9  * * *
10  * ae-7.r23.atlnga05.us.bb.gin.ntt.net (129.250.4.192)  231.671 ms *
11  * * *
12  * * ae-0.a02.asbnva02.us.bb.gin.ntt.net (129.250.5.190)  208.833 ms
13  * * *
14  * * *
15  * * *
16  * * *
17  * * *
18  * * *
19  * * *
20  * * *
21  * * *
22  * * *
23  * * *
24  * * *
25  * * *
26  * * *
27  * * *
28  * * *
29  * * *
30  * * *

Thanks, I’ll reach out privately if we need your IP. Asking our network folks to look into it now.

@Zacour @Holger can you try this again? We’ve made some routing changes.

@jerome thanks for the support. The result is still the same, though.
I’m getting pings above 200 ms, and debug.fly.dev still shows mia as the region.

traceroute to *** (***), 30 hops max, 60 byte packets
 1  192.168.0.1 (192.168.0.1)  3.785 ms  3.728 ms  3.708 ms
 2  10.63.96.1 (10.63.96.1)  14.317 ms  21.869 ms  24.523 ms
 3  c91108f9.rjo.static.virtua.com.br (201.17.8.249)  26.539 ms  26.522 ms  26.500 ms
 4  c911054e.virtua.com.br (201.17.5.78)  25.122 ms  26.459 ms  26.440 ms
 5  embratel-H0-2-0-1-agg01.rjonbf.embratel.net.br (200.179.69.121)  25.545 ms  26.633 ms  26.582 ms
 6  200.244.19.141 (200.244.19.141)  29.023 ms  22.262 ms *
 7  ebt-B10-tcore01.rjoen.embratel.net.br (200.230.252.157)  36.454 ms  22.459 ms *
 8  * ebt-H0-1-0-0-agg04.rjo.embratel.net.br (200.244.18.22)  25.559 ms *
 9  peer-B59-agg04.rjo.embratel.net.br (200.211.219.42)  61.135 ms * *
10  * * *
11  * * *
12  * * *
13  * 66.219.163.148.ptr.anycast.net (148.163.219.66)  120.114 ms *
14  * * *
15  * * *
16  * * *
17  * * *
18  * * *
19  * * *
20  * * *
21  * * *
22  * * *
23  * * *
24  * * *
25  * * *
26  * * *
27  * * *
28  * * *
29  * * *
30  * * *

debug.fly.dev:

=== Headers ===
Host: debug.fly.dev
Fly-Request-Id: 01FXR2MFM6CXX705K61NFKW3KR-mia
Via: 2 fly.io
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.36 Safari/537.36
Sec-Fetch-Dest: document
Fly-Client-Ip: 2804:14d:5c54:5d34::a60
X-Forwarded-For: 2804:14d:5c54:5d34::a60, 2a09:8280:1:763f:8bdd:34d1:c624:78cd
Fly-Forwarded-Port: 443
X-Forwarded-Ssl: on
Fly-Region: mia
Sec-Gpc: 1
Sec-Fetch-Site: none
Sec-Fetch-User: ?1
Accept-Encoding: gzip, deflate, br
X-Forwarded-Proto: https
X-Forwarded-Port: 443
Fly-Dispatch-Start: t=1646854291078569;instance=f2bfe7cb
Cache-Control: max-age=0
X-Request-Start: t=1646854291078107
Sec-Fetch-Mode: navigate
Fly-Forwarded-Ssl: on
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Accept-Language: en-US,en;q=0.9
Fly-Forwarded-Proto: https

=== ENV ===
FLY_ALLOC_ID=f2bfe7cb-2c9e-e31d-f656-17214c989cd5
FLY_APP_NAME=debug
FLY_PUBLIC_IP=2605:4c40:243:8a59:0:f2bf:e7cb:1
FLY_REGION=mia
FLY_VM_MEMORY_MB=128
GPG_KEY=A035C8C19219BA821ECEA86B64E628F8D684696D
HOME=/root
LANG=C.UTF-8
PATH=/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PYTHON_GET_PIP_SHA256=01249aa3e58ffb3e1686b7141b4e9aac4d398ef4ac3012ed9dff8dd9f685ffe0
PYTHON_GET_PIP_URL=https://github.com/pypa/get-pip/raw/d781367b97acf0ece7e9e304bf281e99b618bf10/public/get-pip.py
PYTHON_PIP_VERSION=21.2.4
PYTHON_SETUPTOOLS_VERSION=57.5.0
PYTHON_VERSION=3.10.0
TERM=linux
WS=this
is
a
test
cgroup_enable=memory

2022-03-09 19:31:31.079539468 +0000 UTC m=+7683371.977518365

Oh, you’re hitting our debug app using IPv6 but the traceroute was for IPv4.

Can you provide a traceroute -6 debug.fly.dev?
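
The mismatch is visible in the debug output itself: the `Fly-Client-Ip` header there holds an IPv6 address, while the earlier traceroute targeted IPv4. As a minimal sketch, assuming the request headers are available as a plain dictionary, the address family could be checked like this (`client_ip_family` is a hypothetical helper name, not part of any Fly.io API):

```python
import ipaddress


def client_ip_family(headers):
    """Return 'IPv4' or 'IPv6' for the Fly-Client-Ip header, or None if absent."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("fly-client-ip")
    if raw is None:
        return None
    # ip_address() raises ValueError if the value is not a valid address.
    return f"IPv{ipaddress.ip_address(raw.strip()).version}"


# Header value copied from the debug output above.
print(client_ip_family({"Fly-Client-Ip": "2804:14d:5c54:5d34::a60"}))  # prints: IPv6
```

Knowing the family tells you whether to run `traceroute` or `traceroute -6` when reproducing the path a browser actually took.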

I’ve just checked: through my phone’s mobile connection, which uses another ISP, I get the proper region, but through the office network I still get routed through mia.

Sure, here it goes:

traceroute to debug.fly.dev (2a09:8280:1:763f:8bdd:34d1:c624:78cd), 30 hops max, 80 byte packets
 1  2804:14d:5c54:5d34:6802:b8ff:fef7:a3de (2804:14d:5c54:5d34:6802:b8ff:fef7:a3de)  6.367 ms  7.966 ms  7.950 ms
 2  * * *
 3  2804:14d:5c00:65::1 (2804:14d:5c00:65::1)  30.171 ms  30.155 ms  30.138 ms
 4  2804:a8:2:b0::1aba (2804:a8:2:b0::1aba)  27.742 ms  27.727 ms  25.555 ms
 5  2804:a8:2:b0::1ab9 (2804:a8:2:b0::1ab9)  28.796 ms  28.781 ms *
 6  * * *
 7  * * *
 8  2001:550:2:19::67:1 (2001:550:2:19::67:1)  216.844 ms  216.822 ms  216.800 ms
 9  2001:550:2:4a::10:3 (2001:550:2:4a::10:3)  216.779 ms * 2001:550:2:19::67:1 (2001:550:2:19::67:1)  216.736 ms
10  2001:550:2:4a::10:3 (2001:550:2:4a::10:3)  135.861 ms  137.062 ms 2607:f740:57::5 (2607:f740:57::5)  136.854 ms
11  2607:f740:57::5 (2607:f740:57::5)  139.334 ms  139.314 ms *
12  * * *
13  * * *
14  * * *
15  * * *
16  * * *
17  * * *
18  * * *
19  * * *
20  * * *
21  * * *
22  * * *
23  * * *
24  * * *
25  * * *
26  * * *
27  * * *
28  * * *
29  * * *
30  * * *

This solved it for me. Both my app, deployed to GRU, and debug.fly.dev now have latencies of 10 to 25 ms, mostly around 15 ms. This is great. Thank you for your quick action.

Just for context: I am within the GRU metropolitan area, and my internet is provided by Telefonica Brasil / Vivo (AS26599) over IPv4.

I just had the idea that it might be interesting for Fly.io users to learn how to integrate RUM (real user monitoring) into their apps, in order to check whether all users are served from the closest region. I guess that with Anycast routing things can always change dynamically, and users served from the closest region today might be served from a distant region tomorrow. Monitoring this would be valuable, and with RUM app owners could verify for themselves that all users are served with low latency.

You could compare the FLY_REGION environment variable from your instance with the fly-request-id header, which contains (at the end) the edge region your user reached.
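
The comparison described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the header format (`<id>-<region>`) is inferred from the `Fly-Request-Id: 01FXR2MFM6CXX705K61NFKW3KR-mia` line in the debug output earlier in this thread, and `edge_region` and `served_locally` are hypothetical helper names.

```python
import os


def edge_region(request_id):
    """Return the text after the last '-' in a Fly-Request-Id value, or None."""
    _, separator, region = request_id.rpartition("-")
    return region if separator and region else None


def served_locally(request_id, app_region=None):
    """Return True if the edge region matches the app region, None if unknown."""
    # Fall back to the FLY_REGION environment variable of the running instance.
    app_region = app_region or os.environ.get("FLY_REGION")
    edge = edge_region(request_id)
    if edge is None or app_region is None:
        return None
    return edge == app_region


# Request ID copied from the debug output above; "gru" is the app's region.
print(served_locally("01FXR2MFM6CXX705K61NFKW3KR-mia", app_region="gru"))  # prints: False
```

Logging the result per request would give a rough picture of how many users enter the network at a distant edge, which is the monitoring idea raised above.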

It appears this might be a problem on the ISP’s side. We’re contacting them to see if they care to fix this.

Thanks, I hope it works! Do you think Fly can do something so that every user gets routed to the proper region regardless of the client’s ISP, or is this something that will always have to be fixed on an ISP-by-ISP basis?

Hey @jerome, any news on this? I’m still being routed to Miami when accessing from my ISP, which is Claro S.A. (AS28573).

This is probably going to take more time than a day :slight_smile:. I will keep you updated!

Ok, thanks for that :+1:

Hey @jerome, do you have any news to share about this problem? The ISP in question still routes to Miami, even though we’re in Brazil and should be routed to GRU. Latencies are around 250ms, and as this ISP is one of the largest in Brazil, a large number of our users would have a terrible user experience. Any update or information would be very welcome, as we’re currently unable to serve decent latencies to any user who uses this ISP.

Since the issue only affects IPv6, the easiest way around this would be to disable IPv6 on your end. For your app specifically, you can remove AAAA and CNAME records and only add an A record pointed at your allocated IPv4 address.
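
After changing the DNS records, one way to confirm that a hostname no longer resolves over IPv6 is to ask the system resolver which address families it returns. The following is a minimal sketch using Python’s standard `socket.getaddrinfo`; `resolved_families` is a hypothetical helper name, the `resolver` parameter exists only so the function can be exercised without network access, and results depend on the local resolver and on DNS caching.

```python
import socket


def resolved_families(hostname, resolver=socket.getaddrinfo):
    """Return the set of address families ('IPv4', 'IPv6') a hostname resolves to."""
    names = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}
    # Restrict the lookup to TCP so each address is reported once per family.
    infos = resolver(hostname, 443, proto=socket.IPPROTO_TCP)
    return {names[family] for family, *_ in infos if family in names}


if __name__ == "__main__":
    # Replace with your own hostname; an A-record-only name should show only IPv4.
    print(resolved_families("debug.fly.dev"))
```

If the result still contains `IPv6`, an AAAA record (or a CNAME pointing at a name that has one) is probably still in place.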

We’re waiting on a peering request with Claro’s upstream AS (Autonomous System) which was forwarded to their local Brazil contact, but we haven’t had any news yet from them.

There is nothing we can do to make this routing better right now. As I understand it, many Claro customers’ IPv6 connections are being routed to the wrong locations, not just those of Fly app users.