flyctl crashes network on M1

I’m not sure what the hell is happening with the CLI, but as soon as I do anything with flyctl that requires the network, within 5-10 seconds it completely crashes my network. By that I mean I completely lose connectivity, I can’t open the network tab, and after a couple of minutes the spinning wheel of death starts showing up at random.

$ flyctl version
flyctl v0.0.310 darwin/arm64 Commit: 51d0e48 BuildDate: 2022-03-29T11:51:46Z

Video of what’s happening: bug flyctl - YouTube

I’m on Monterey 12.3. The last time I used flyctl was 9 hours ago and it worked fine, so I’m not sure if it automatically updated itself or what else could have happened.
I also manually removed it and reinstalled with homebrew.

I live in London and I have IPv6 connectivity, so maybe it’s something related to LHR having issues? Still, I don’t understand how the CLI can completely crash the network stack of the OS unless it’s a serious software bug.

Any idea?

Wow. That is an incredible bug.

If you get it to happen again, can you try running flyctl agent stop in another terminal? Or ps aux | grep flyctl and see what all’s running?
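
In case it helps anyone else hitting this, those two checks could be sketched roughly like this. The function name list_flyctl_procs is just made up for illustration; ps aux, grep and flyctl agent stop are the commands mentioned above.

```shell
# List any running flyctl processes (including the background agent).
# The [f] bracket trick keeps grep from matching its own command line.
# list_flyctl_procs is a hypothetical helper name, not part of flyctl.
list_flyctl_procs() {
  ps aux | grep '[f]lyctl' || echo "no flyctl processes found"
}

list_flyctl_procs

# Ask the background agent to stop, but only if flyctl is on the PATH.
if command -v flyctl >/dev/null 2>&1; then
  flyctl agent stop
fi
```

Run it from a second terminal while the problem is happening, so the process list reflects the broken state.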

Are you running a VPN service?

I don’t have any VPN, I’m just using Little Snitch as a firewall.
I’m currently in bed so using wi-fi but before I was at the desk and it was happening on ethernet as well.
As mentioned, I have IPv6 connectivity and I live in London. I saw you are having issues with IPv6 in the LHR datacenter, so my very wild guess is that it has something to do with either OS X Monterey or the way Golang opens TCP connections over IPv6.
I can’t run flyctl agent stop as I’m not logged in and I can’t log in anymore :laughing:
New video: bug flyctl 2 - YouTube

Going to bed now, happy to provide more info tomorrow!

I don’t think it was related to the IPv6 outage. That didn’t affect anything your local system talks to, just apps running on the platform.

Is it possible to disable Little Snitch and see if it still behaves this way? flyctl doesn’t have enough permissions to mess with your network (our life would be easier if it did). Little Snitch does, though, and it’s possible our weird UDP traffic is conflicting with it somehow.

I tried to disable the network filtering by Little Snitch but the problem is still there.
I also noticed that if my iPhone is connected with the USB cable, it gets affected too, since it uses the same localhost: as soon as I open Safari I can’t browse anywhere and it’s stuck :joy:

A colleague suggested it might have something to do with the routing table?

Anyway, is there another way to deploy an app other than using the CLI?

Okay, now all of a sudden it started working again… The only thing I did was manually set the IPv6 DNS, while before I only had IPv4 (1.1.1.1 and 1.0.0.1). No idea if it’s related, but it’s the only change I’ve made…

flyctl doesn’t have permission to touch the routing table. It actually doesn’t have permission to do anything with the network!

It’s aggressive, but you might try uninstalling Little Snitch entirely to see what happens. There are a bunch of vaguely similar sounding issues people have had, here’s an example: Internet Access Not Working after … | Apple Developer Forums

You can deploy without using our wireguard agent. You can either:

  1. Build your Docker image locally (make sure you build an x86 version on the Mac), run flyctl auth docker, push to registry.fly.io/<app>:tag, then run fly deploy -i registry.fly.io/<app>:tag
  2. Set up system WireGuard, then run FLY_REMOTE_BUILDER_HOST_WG=1 fly deploy
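
As a rough sketch, the two options might look like this in a script. The app name my-app, the tag v1 and both function names are placeholders invented for the example, and --platform linux/amd64 is one way (an assumption here) to get the x86 build on an M1; substitute your own values.

```shell
APP="my-app"   # hypothetical app name, replace with yours
TAG="v1"       # hypothetical tag
IMAGE="registry.fly.io/${APP}:${TAG}"

# Option 1: build locally, push to the Fly registry, deploy the pushed image.
deploy_local_image() {
  # Force an x86-64 image even when building on an arm64 Mac.
  docker build --platform linux/amd64 -t "$IMAGE" . &&
    flyctl auth docker &&
    docker push "$IMAGE" &&
    fly deploy -i "$IMAGE"
}

# Option 2: deploy through system WireGuard instead of the flyctl agent.
# Assumes system WireGuard is already set up.
deploy_via_system_wireguard() {
  FLY_REMOTE_BUILDER_HOST_WG=1 fly deploy
}
```

Call deploy_local_image or deploy_via_system_wireguard from the directory that holds your Dockerfile and fly.toml.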

I replied a minute before you posted. I managed to make it work, or at least it seems stable right now: I was able to log in (flyctl auth login), deploy a project (flyctl deploy), and run flyctl ping a few times.

$ flyctl ping
35 bytes from fdaa:0:3bc5::3 (gateway), seq=0 time=4.1ms
35 bytes from fdaa:0:3bc5::3 (gateway), seq=1 time=4ms
35 bytes from fdaa:0:3bc5::3 (gateway), seq=2 time=3.9ms
35 bytes from fdaa:0:3bc5::3 (gateway), seq=3 time=4.3ms
35 bytes from fdaa:0:3bc5::3 (gateway), seq=4 time=3.9ms

I think it has something to do with the IPv6 DNS resolver at this point :grimacing:

There is almost no chance it’s the IPv6 resolver. The ping 1.1.1.1 errors don’t go through DNS. If that stops working, the whole network stack is f’d.
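
If you want to tell the two failure modes apart yourself, here is a minimal sketch. It assumes ping and nslookup are installed and that nslookup exits non-zero when resolution fails; the function name diagnose and the test hostname fly.io are arbitrary choices for the example.

```shell
# Distinguish "the network is down" from "only DNS is broken".
# Pinging a literal IP address never involves the resolver, so if that
# fails the problem is below DNS.
diagnose() {
  if ! ping -c 3 1.1.1.1 >/dev/null 2>&1; then
    echo "network"   # raw IP unreachable: not a DNS problem
  elif ! nslookup fly.io >/dev/null 2>&1; then
    echo "dns"       # IP traffic works, name resolution does not
  else
    echo "ok"
  fi
}
```

Running diagnose while the problem is happening should print network if the whole stack is down, which is what the ping 1.1.1.1 errors point to.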

flyctl ping uses the same agent process as the build, for what it’s worth. My guess is that Little Snitch isn’t happy with the UDP traffic between our agent and our WireGuard gateway. It’s kind of weird network traffic that didn’t exist when Little Snitch was created.

Yes, I think the entire network stack implodes at that point, but it could possibly be a bug in OS X where, if some software asks the system to use the IPv6 DNS resolver and that’s not set, it somehow hangs…

Otherwise why did it start working all of a sudden? I don’t believe in magic :laughing:
Nothing apart from the IPv6 DNS settings has changed :thinking:

Anyway now it works, fingers crossed it will stay that way :smile:

Thank you, I appreciate your help and suggestions :smile:

We’ve found out a little more information. If you uninstall Little Snitch these problems will go away. Disabling it doesn’t work.

The working theory is that Monterey’s new Network Extension system is painful. Multiple network extensions can conflict. Are you running Tailscale or WireGuard or something else that might be installing a network extension?
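
One way to check what is installed, sketched under the assumption that you are on a macOS version that ships systemextensionsctl (the wrapper name show_system_extensions is made up for the example):

```shell
# Print the system extensions macOS knows about; network extensions such
# as firewalls and VPN clients should show up in this list.
show_system_extensions() {
  if command -v systemextensionsctl >/dev/null 2>&1; then
    systemextensionsctl list
  else
    echo "systemextensionsctl not found (macOS only)"
  fi
}

show_system_extensions
```

If more than one network extension is listed, that would fit the conflict theory.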

It’s unclear why traffic from flyctl triggers this, but the root problem seems to be Little Snitch, possibly in combination with another system-level network extension.

That’s interesting but how do you explain the fact that now it works fine for me?

Just to be 100% clear: I have Little Snitch installed and enabled and it works fine for me right now. I’ve just deployed an app again successfully.

The only change I’ve made was to add the IPv6 DNS servers that were missing :eyes: