I haven’t been able to SSH into my app instance for half a day now. What’s up with that?
It just hangs like this:
➜ levadia.gg git:(master) fly ssh console
Connecting to top1.nearest.of.levadia-gg.internal... complete
Sometimes I even get a timeout like this:
Error error connecting to SSH server: connect tcp [fdaa:0:5aea:a7b:23c6:b368:dfc6:2]:22: operation timed out
Things I’ve tried:
- Restarting the agent multiple times.
- Clearing the wire_guard_state in my ~/.fly/config.yml.
- Connecting via fly ssh console -s.
- Re-deploying and restarting the app.
When I created the app ~6 hours ago, I destroyed the previous instance with the same name. It was an empty instance with no public IPs; I was unable to deploy to it due to some outage, and I read that re-creating the app had helped some people, so I tried that. However, I was able to SSH into the current instance ~4 hours ago. It stopped working some time after that.
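For anyone trying to tell the two symptoms above apart (a session that hangs after "complete" versus an outright TCP timeout), a direct probe of port 22 may help. This is only a minimal sketch: probe_tcp is a hypothetical helper name, not part of flyctl, and it assumes your machine has a WireGuard tunnel into the private network that the operating system itself can route over. The tunnel that fly ssh console uses lives inside the flyctl agent, so without a separately configured WireGuard peer the probe will most likely just report an unreachable network.

```python
import socket


def probe_tcp(host: str, port: int = 22, timeout: float = 5.0) -> str:
    """Attempt one TCP connection and classify the outcome.

    Returns "open", "refused", "timeout", or "error: <details>".
    """
    try:
        # create_connection resolves the host and handles IPv4 and IPv6 literals.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer within `timeout` seconds: packets are likely being dropped
        # or sent to an address that no longer belongs to the instance.
        return "timeout"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except OSError as exc:
        # Anything else, e.g. "network is unreachable" when no tunnel is up.
        return f"error: {exc}"
```

Calling probe_tcp("fdaa:0:5aea:a7b:23c6:b368:dfc6:2") with the address from the error message would then return one of those strings. A "timeout" points at routing or a wrong address, while "open" followed by a hanging SSH session points at something past the TCP handshake.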
I’m having a similar issue. SSH access today seems very flaky.
We had a couple different issues today. We’re in the process of moving from Consul to an internal service discovery system called Corrosion; the first place Corrosion is getting rolled out is DNS for our WireGuard gateways. Some of our gateways were very small, relative to the rest of our hosts, and Corrosion’s DNS server overwhelmed them; we’ve done a fleetwide upgrade of our gateways.
If you were seeing pretty much everything work, to the point where you were getting an fdaa:: address for the instance you were connecting to in your output, then what was probably happening is that you were getting a stale IPv6 address from
Nice. Just in time, as I’m contemplating using 6pn for some experiments. The only thing that kept me away from it was the numerous DNS mis-reconciliation reports (even though my use for it isn’t for anything critical).
As someone who followed Route53’s design evolution with a keen interest while employed at AWS, I am uber interested in learning more about Fly’s design of this new system. Hope it makes it to the Fly blog (: (don’t leave us hanging…).
Corrosion will extremely definitely be a blog post. It’s Jerome’s story to tell, though, not mine.