Can't access Fly-hosted sites from specific computers

Super lame I have to expose personal info here, since Fly won’t respond to technical support-related issues over email despite having spent $$$$, but here we are…

I’m having the weirdest problem accessing Fly-hosted sites with a small number of computers. (I can reproduce the issue on one machine, and have had the same report from a handful of customers as well.) This issue has gone on for months, and after a ton of debugging, I’ve narrowed the cause down to Fly, for some reason.

Here’s an example:

  • Can’t access https://watilo.foliohd.com - it fails to load (ERR_NAME_NOT_RESOLVED)
  • Tested by trying to hit the root Fly subdomain: https://damp-morning-5422.fly.dev/
    • You and I will see a 404 page on FolioHD saying there’s no site found. This is the expected behavior, but what the users in question get is the browser ERR_NAME_NOT_RESOLVED screen.
  • I thought it was specific to FolioHD, but then I tested from the problematic machine on a Posthaven URL (which we also run on Fly): https://posthaven-prod.fly.dev/
    • Again, you and I will get a 404, but the problematic machine doesn’t even load that.

I have previously tried every troubleshooting step imaginable: checked hosts file, disabled firewall, tried incognito, enabled a VPN, tried multiple internet connections - all the same result. I’m even on the same network as other machines that can access the sites - it’s just this specific computer (macOS, latest).

Is there some sort of blacklist that Fly maintains, where this particular machine is getting blocked for some reason?


Definitely puzzling! Happy to help troubleshoot, and thank you for frontloading all that context.

As an aside, if you’re looking for email support, you might take a look at our paid plans – could be a good option if your anticipated spend aligns with the correspondingly expanded usage quota.

That said, we definitely don’t want anyone to have a Fly-related problem for months! So if you do feel uncomfortable sharing certain info in the forum, you can redact your problem description. If it’s essential info that you absolutely cannot share publicly, we’d be happy to receive that over email.

This does have an important drawback: other users will be less able to help out. Of course, as with the better part of issues on our end, that’s somewhat less of a disadvantage :slight_smile:

Anyway, on to the actual problem! Being able to reproduce this is a huge help, thank you! You mentioned that you have unaffected machines on the same network as an affected one, which is even better. A few things I’m curious about that might help us narrow it down:

  • Can you curl those sites with the flyio-debug: doit header? This will, among other things, give us an idea where traffic is coming in from that network.
  • You’ve probably already done this, but what does dig say about those subdomains from the affected machines? Is it returning the same answers that you get on unaffected ones? How about if you use a large public resolver like 8.8.8.8?
  • I’m guessing the answer is “no, so far” but are you able to resolve any fly.dev domains from the problem clients? Are they able to hit debug.fly.dev?
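The dig comparison in the second bullet can be sketched as a small shell function. This is only an illustration, not a Fly tool: the function name compare_dns is made up here, and it assumes dig is installed. It compares the answer from the machine’s default DNS server with the answer from a public resolver.

```shell
# Hypothetical helper (not part of any Fly tooling): compare the answer from
# the default DNS server with the answer from a public resolver.
# Usage: compare_dns <hostname> [public-resolver-ip]
compare_dns() {
  host="$1"
  public="${2:-8.8.8.8}"
  # Sort both answers so record order does not cause a false mismatch.
  local_answer=$(dig +short "$host" | sort)
  public_answer=$(dig +short "$host" "@$public" | sort)
  if [ -z "$local_answer" ]; then
    echo "no answer from default resolver for $host"
    return 2
  fi
  if [ "$local_answer" = "$public_answer" ]; then
    echo "match: $host"
  else
    echo "mismatch: $host"
    return 1
  fi
}
```

Running compare_dns debug.fly.dev on both an affected and an unaffected machine gives a quick side-by-side. A mismatch is a hint rather than proof, since some hosts legitimately return different answers to different resolvers.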

Agree with all of the above from @eli. In addition to those suggestions, personally I’d also try temporarily turning off any/all browser plugins you may have on that particular machine (adblock, umatrix etc). Those can block requests.

Also clear any DNS cache (given that error message, it looks like a DNS issue): visit chrome://net-internals/#dns, clear the host cache, then restart the browser.

Yeah I’ve already tried 8.8.8.8, no luck.

Result of curl (on the debug URL)

$ curl https://debug.fly.dev -H "flyio-debug: doit"
curl: (6) Could not resolve host: debug.fly.dev

Result of dig on debug.fly.dev

$ dig debug.fly.dev

; <<>> DiG 9.10.6 <<>> debug.fly.dev
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 10319
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;debug.fly.dev.			IN	A

;; ANSWER SECTION:
debug.fly.dev.		3600	IN	A	77.83.140.164

;; Query time: 55 msec
;; SERVER: 192.168.4.1#53(192.168.4.1)
;; WHEN: Fri Jul 08 10:25:21 EDT 2022
;; MSG SIZE  rcvd: 58

Result of dig on watilo.foliohd.com

$ dig watilo.foliohd.com

; <<>> DiG 9.10.6 <<>> watilo.foliohd.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 56314
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;watilo.foliohd.com.		IN	A

;; ANSWER SECTION:
watilo.foliohd.com.	3600	IN	CNAME	damp-morning-5422.fly.dev.
damp-morning-5422.fly.dev. 3600	IN	A	213.188.213.51

;; Query time: 166 msec
;; SERVER: 192.168.4.1#53(192.168.4.1)
;; WHEN: Fri Jul 08 10:25:07 EDT 2022
;; MSG SIZE  rcvd: 102

Can’t access debug.fly.dev in the browser from the affected machine

Re: DNS cache, I tried that, but also it doesn’t seem to be browser-specific. (Same result in Chrome, Safari, Firefox, and also in incognito with no extensions enabled.)

From the machine where I can access debug.fly.dev

=== Headers ===
Host: debug.fly.dev
Sec-Fetch-User: ?1
Sec-Fetch-Dest: document
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
Fly-Forwarded-Ssl: on
X-Forwarded-Ssl: on
Sec-Ch-Ua: " Not A;Brand";v="99", "Chromium";v="102", "Google Chrome";v="102"
Sec-Ch-Ua-Platform: "macOS"
X-Forwarded-Port: 443
Fly-Client-Ip: 47.200.195.8
Fly-Forwarded-Proto: https
X-Forwarded-Proto: https
Fly-Request-Id: 01G7F3209M0B5DD2M2VTYDQFQ3-mia
Sec-Ch-Ua-Mobile: ?0
X-Request-Start: t=1657290162484361
Sec-Fetch-Site: none
X-Forwarded-For: 47.200.195.8, 77.83.140.164
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.0.0 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Fly-Region: mia
Via: 2 fly.io
Sec-Fetch-Mode: navigate
Fly-Forwarded-Port: 443

=== ENV ===
FLY_ALLOC_ID=e969b3b0-fc44-f2e7-bed9-9a380c13d226
FLY_APP_NAME=debug
FLY_PUBLIC_IP=2605:4c40:243:8a59:0:e969:b3b0:1
FLY_REGION=mia
FLY_VM_MEMORY_MB=128
GPG_KEY=A035C8C19219BA821ECEA86B64E628F8D684696D
HOME=/root
LANG=C.UTF-8
PATH=/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PYTHON_GET_PIP_SHA256=01249aa3e58ffb3e1686b7141b4e9aac4d398ef4ac3012ed9dff8dd9f685ffe0
PYTHON_GET_PIP_URL=https://github.com/pypa/get-pip/raw/d781367b97acf0ece7e9e304bf281e99b618bf10/public/get-pip.py
PYTHON_PIP_VERSION=21.2.4
PYTHON_SETUPTOOLS_VERSION=57.5.0
PYTHON_VERSION=3.10.0
TERM=linux
WS=this
is
a
test
cgroup_enable=memory

2022-07-08 14:22:42.488083237 +0000 UTC m=+1659318.267707183

Can you provide a traceroute to debug.fly.dev from both computers?

This is very weird. If your dig works, this should resolve. What if you run it like this?

curl https://debug.fly.dev --resolve debug.fly.dev:443:77.83.140.164
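The --resolve flag tells curl to skip its own name lookup and use the given address for that host and port. As a rough sketch, the same idea can be scripted so the address comes from dig instead of being typed by hand. The function name curl_pinned is invented for this example, and it assumes both dig and curl are available.

```shell
# Hypothetical helper: fetch https://<host>/ while pinning the address to
# whatever dig returns, so curl does not do its own name lookup.
curl_pinned() {
  host="$1"
  # With a CNAME chain, dig +short prints the CNAME target first and the
  # address records after it, so take the last line.
  ip=$(dig +short "$host" A | tail -n 1)
  if [ -z "$ip" ]; then
    echo "dig returned no address for $host" >&2
    return 1
  fi
  curl -sS --resolve "$host:443:$ip" "https://$host/"
}
```

If curl_pinned debug.fly.dev succeeds while a plain curl https://debug.fly.dev fails with "Could not resolve host", the network path is probably fine and the problem is likely in how the machine resolves the name.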

From problematic machine

traceroute

$ traceroute debug.fly.dev
traceroute: unknown host debug.fly.dev

curl

$ curl https://debug.fly.dev --resolve debug.fly.dev:443:77.83.140.164
=== Headers ===
Host: debug.fly.dev
User-Agent: curl/7.77.0
Fly-Forwarded-Ssl: on
Via: 2 fly.io
Fly-Request-Id: 01G7F5AX78NXKVQMRXJB2E86SM-mia
Accept: */*
Fly-Forwarded-Proto: https
X-Forwarded-Proto: https
X-Forwarded-Ssl: on
Fly-Forwarded-Port: 443
X-Forwarded-Port: 443
X-Forwarded-For: 47.200.195.8, 77.83.140.164
Fly-Region: mia
X-Request-Start: t=1657292551400352
Fly-Client-Ip: 47.200.195.8

=== ENV ===
FLY_ALLOC_ID=c1c01bfc-a58e-6820-8905-be040261b893
FLY_APP_NAME=debug
FLY_PUBLIC_IP=2605:4c40:243:16df:0:c1c0:1bfc:1
FLY_REGION=mia
FLY_VM_MEMORY_MB=128
GPG_KEY=A035C8C19219BA821ECEA86B64E628F8D684696D
HOME=/root
LANG=C.UTF-8
PATH=/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PYTHON_GET_PIP_SHA256=01249aa3e58ffb3e1686b7141b4e9aac4d398ef4ac3012ed9dff8dd9f685ffe0
PYTHON_GET_PIP_URL=https://github.com/pypa/get-pip/raw/d781367b97acf0ece7e9e304bf281e99b618bf10/public/get-pip.py
PYTHON_PIP_VERSION=21.2.4
PYTHON_SETUPTOOLS_VERSION=57.5.0
PYTHON_VERSION=3.10.0
TERM=linux
WS=this
is
a
test
cgroup_enable=memory

2022-07-08 15:02:31.408433998 +0000 UTC m=+1661761.636986083

From working machine

traceroute

% traceroute debug.fly.dev
traceroute to debug.fly.dev (77.83.140.164), 64 hops max, 52 byte packets
 1  192.168.4.1 (192.168.4.1)  2.804 ms  2.146 ms  2.106 ms
 2  * * *
 3  172.99.47.16 (172.99.47.16)  7.830 ms
    172.99.44.38 (172.99.44.38)  6.275 ms
    172.99.47.18 (172.99.47.18)  5.301 ms
 4  * ae8---0.scr02.mias.fl.frontiernet.net (74.40.3.73)  12.667 ms  12.352 ms
 5  ae1---0.cbr05.mias.fl.frontiernet.net (45.52.201.155)  10.155 ms
    ae0---0.cbr05.mias.fl.frontiernet.net (45.52.201.153)  11.338 ms
    ae1---0.cbr05.mias.fl.frontiernet.net (45.52.201.155)  11.101 ms
 6  * * *
 7  * * *
 8  * * *

curl

% curl https://debug.fly.dev --resolve debug.fly.dev:443:77.83.140.164
=== Headers ===
Host: debug.fly.dev
X-Forwarded-For: 47.200.195.8, 77.83.140.164
Fly-Forwarded-Proto: https
X-Forwarded-Port: 443
Fly-Request-Id: 01G7F5FXBAQDNGACBA8Q580P4N-mia
Accept: */*
Fly-Client-Ip: 47.200.195.8
X-Forwarded-Ssl: on
Fly-Region: mia
X-Forwarded-Proto: https
Fly-Forwarded-Ssl: on
Via: 2 fly.io
User-Agent: curl/7.79.1
X-Request-Start: t=1657292715370593
Fly-Forwarded-Port: 443

=== ENV ===
FLY_ALLOC_ID=e969b3b0-fc44-f2e7-bed9-9a380c13d226
FLY_APP_NAME=debug
FLY_PUBLIC_IP=2605:4c40:243:8a59:0:e969:b3b0:1
FLY_REGION=mia
FLY_VM_MEMORY_MB=128
GPG_KEY=A035C8C19219BA821ECEA86B64E628F8D684696D
HOME=/root
LANG=C.UTF-8
PATH=/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PYTHON_GET_PIP_SHA256=01249aa3e58ffb3e1686b7141b4e9aac4d398ef4ac3012ed9dff8dd9f685ffe0
PYTHON_GET_PIP_URL=https://github.com/pypa/get-pip/raw/d781367b97acf0ece7e9e304bf281e99b618bf10/public/get-pip.py
PYTHON_PIP_VERSION=21.2.4
PYTHON_SETUPTOOLS_VERSION=57.5.0
PYTHON_VERSION=3.10.0
TERM=linux
WS=this
is
a
test
cgroup_enable=memory

2022-07-08 15:05:15.374159483 +0000 UTC m=+1661871.077660492

This looks like a DNS issue resolving any .fly.dev hostname from the problematic machine.

Our DNS setup for <app-name>.fly.dev isn’t special in any way.

What does this return on the problematic computer?

curl -v --dns-servers 8.8.8.8 https://debug.fly.dev

What does cat /etc/resolv.conf return on both machines?
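On macOS, dig sends its query straight to a DNS server, while browsers and curl typically go through the system resolver, so the two can disagree. A minimal sketch for seeing both side by side follows. It is macOS-only, the function name lookup_both is made up for illustration, and it assumes dscacheutil prints an ip_address: line for IPv4 results.

```shell
# Hypothetical helper (macOS only): show what a direct DNS query returns
# next to what the system resolver returns for the same name.
lookup_both() {
  host="$1"
  # dig talks to the DNS server directly.
  dig_ip=$(dig +short "$host" A | tail -n 1)
  # dscacheutil goes through the system resolver, as most applications do.
  sys_ip=$(dscacheutil -q host -a name "$host" | awk '/^ip_address:/ {print $2; exit}')
  printf 'dig:    %s\n' "$dig_ip"
  printf 'system: %s\n' "$sys_ip"
}
```

If the two lines differ for lookup_both debug.fly.dev, or the system line is empty, something on the machine is probably intercepting lookups before they reach the configured DNS server. Running scutil --dns shows the full resolver configuration the system actually uses.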

Problematic machine

$ curl -v --dns-servers 8.8.8.8 https://debug.fly.dev
* Could not resolve host: debug.fly.dev
* Closing connection 0
curl: (6) Could not resolve host: debug.fly.dev
$ cat /etc/resolv.conf
#
# macOS Notice
#
# This file is not consulted for DNS hostname resolution, address
# resolution, or the DNS query routing mechanism used by most
# processes on this system.
#
# To view the DNS configuration used by this system, use:
#   scutil --dns
#
# SEE ALSO
#   dns-sd(1), scutil(8)
#
# This file is automatically generated.
#
nameserver 192.168.4.1

Working machine (different network now)

% cat /etc/resolv.conf
#
# macOS Notice
#
# This file is not consulted for DNS hostname resolution, address
# resolution, or the DNS query routing mechanism used by most
# processes on this system.
#
# To view the DNS configuration used by this system, use:
#   scutil --dns
#
# SEE ALSO
#   dns-sd(1), scutil(8)
#
# This file is automatically generated.
#
nameserver fe80::f1:4fff:feab:4fe4%en0
nameserver 192.168.1.1

Thanks.

This definitely looks like a DNS issue.

Does the problematic machine use something like Pow? I found this: dns - Why all *.dev domains target to my localhost? - Stack Overflow
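One way to check for this kind of override: macOS reads per-domain resolver settings from files in /etc/resolver, where the file name is the domain the settings apply to. Tools like Pow are reported to place a file for the dev domain there, which would explain why only .dev names fail. The sketch below simply lists whatever is in that directory. The function name list_resolver_overrides is invented here, and the directory can be passed as an argument.

```shell
# Hypothetical helper: list per-domain resolver overrides on macOS.
# Usage: list_resolver_overrides [directory]   (defaults to /etc/resolver)
list_resolver_overrides() {
  dir="${1:-/etc/resolver}"
  if [ ! -d "$dir" ]; then
    echo "no overrides: $dir does not exist"
    return 0
  fi
  for f in "$dir"/*; do
    [ -f "$f" ] || continue
    # The file name is the domain the override applies to, for example "dev".
    settings=$(grep -E '^(nameserver|port)' "$f" | paste -sd ' ' -)
    echo "$(basename "$f"): $settings"
  done
}
```

A line starting with "dev:" that points at a local nameserver would mean every .dev lookup through the system resolver is being redirected, while dig, which bypasses that mechanism, still gets the real answer.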

Geez wow you’re right, it was Pow. :skull: Incredible fact-finding there.

I used a 2013 MacBook Pro, then handed it down to my wife. Later on, I bought her a new computer and I transferred her profile from the 2013 computer. Even though I had Pow in my own profile, it was probably installed system-wide and transferred over with it.

Thanks for bearing with me here. Please send a sizable bill to the ~maintainers~ of Pow. :pray:
