So I think this might have something to do with local network issues, which I find quite odd as most things are working fine, with the exception of docker. I landed up deploying via CI, and all went smoothly. I do notice that the app is a touch slow (hosted in the jnb region).
@wjordan – thanks, I thought the issue was resolved (per the status page). I guess this explains the slowness of the app itself, but I don’t think it explains the deployment issues I’ve been having, at least I don’t have a reason to think so, especially considering it deploys from CI without issue. I wonder if my fibre provider is also being impacted by the undersea cable breaks – routings on this end could be causing issues, perhaps? The line has gone down several times in the past 48 hours, but has been up and running for the most part.
On our network, we haven’t seen significant ping loss to jnb (above ~10%) since the incident was resolved, and no ping loss between jnb and iad (where the fly.io dashboard and parts of our deployment-monitoring control plane are hosted).
If some apt operations are timing out on your local docker build, that suggests your local network ISP is having trouble connecting to other remote locations.
It’s possible that your current wireguard peer is located in a region that can’t connect to your local network due to ISP issues. You might have some success creating a new wireguard peer in a different region. (We don’t currently have a wireguard gateway in jnb, but perhaps you might have better luck with some other region if it routes differently.)
Thanks for the details – it’s starting to make some sense. ISP issues seem strange, as the line is mostly fine – it must just be connections over specific routes.
I tried resetting wireguard (I believe it is using maa), but couldn’t do that either. My intention was to wipe out all the wireguards on all orgs and start afresh. It would be nice if there was a way to do a full reset, removing everything (unless that’s what the reset command does – I can’t tell though, because it doesn’t work).
Given that I can deploy from CI, I’m going to reduce my stress levels over this and use that for now.
I’ll do a deploy from my mac from time to time to see what happens – hopefully things improve in the coming days/weeks.
I checked some more metrics and it does look like maa is one of the 5-10% of regions where we’ve been seeing some ongoing intermittent connectivity loss to jnb since the cable cut. I wouldn’t be surprised if your own local ISP was seeing similar connection issues to the maa gateway, so switching your wireguard peer to a different region could help.
fly wireguard reset is the command to connect your local agent to a new wireguard peer, but it defaults to the nearest region. I just learned that you can override this by restarting the agent with the (undocumented) FLYCTL_WG_REGION environment variable set to the region you want, and then resetting the wireguard peer.
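A rough sketch of that sequence, in case it helps. Treat it as a guess at the steps rather than a documented procedure: FLYCTL_WG_REGION is undocumented and could change, and I'm not certain whether the agent or the reset command is what reads it, so the sketch sets it for both. The reset_wg_region function and the FLY_BIN override are hypothetical names made up for this example (FLY_BIN just lets you dry-run the steps with echo).

```shell
# Sketch only: relies on the undocumented FLYCTL_WG_REGION variable.
# reset_wg_region and FLY_BIN are hypothetical helpers, not part of flyctl.
reset_wg_region() {
  region="$1"                     # region code to use instead of the default
  fly_bin="${FLY_BIN:-fly}"       # set FLY_BIN=echo to print the steps only

  # Restart the local agent with the region override in its environment,
  # then request a new wireguard peer with the same override set.
  FLYCTL_WG_REGION="$region" "$fly_bin" agent restart &&
    FLYCTL_WG_REGION="$region" "$fly_bin" wireguard reset
}
```

For example, reset_wg_region ams would try the ams gateway (ams is only an illustration; pick whichever region routes well from your ISP). fly wireguard reset may still prompt you for the org.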
Awesome, thanks – seems to be able to create a new one. VPN also helps to force it.
How do I remove the MAA ones (there are a few handfuls of them)? When I try, I get “upstream service is unavailable”. Or should I not worry as it won’t be used?
fly wireguard remove is the command, but maybe it won’t work if you’re unable to connect to that region’s gateway. In that case, I wouldn’t worry about it since it won’t be used once your agent is connecting to the new one.
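If you do want to clean up once the gateway is reachable again, something like this loop could save some typing. It is only a sketch: it assumes fly wireguard remove accepts the org and peer name as arguments, and you would need to copy the peer names from fly wireguard list yourself. remove_wg_peers and FLY_BIN are hypothetical names for this example.

```shell
# Sketch only: remove_wg_peers and FLY_BIN are hypothetical helpers.
# Assumes `fly wireguard remove <org> <name>` takes both as arguments.
remove_wg_peers() {
  org="$1"; shift                 # first argument: org slug
  fly_bin="${FLY_BIN:-fly}"       # set FLY_BIN=echo to print the steps only

  # Remaining arguments are peer names, e.g. copied from `fly wireguard list`.
  for peer in "$@"; do
    # Keep going if one removal fails (e.g. the gateway is unreachable).
    "$fly_bin" wireguard remove "$org" "$peer" ||
      echo "could not remove $peer, skipping" >&2
  done
}
```

Usage would look like remove_wg_peers my-org peer-name-1 peer-name-2, with your own org and peer names in place of those placeholders.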