This might have been mentioned before, but it’s really annoying.
I think this is also why fly ssh console itself often doesn’t work: it tries to connect to one of the VMs from that list, which can be a dead VM.
I can imagine how that would get in the way, thanks for bringing it up.
Do you see the dead VMs when you run fly status -a staxcloud-staging, or if you dig vms.staxcloud-staging.internal @fdaa:0:8efd::3 from your org’s 6PN?
Do you see the dead VMs when you run
fly status -a staxcloud-staging
Nope. That’s what I use now: I run fly status, pick one of the 3 IDs, and then match that to the instances that show up in fly ssh console -s
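The manual matching described above could be sketched roughly like this. It is only an illustration: the function name and the IDs are made up, and it assumes you have already copied the IDs out of fly status and out of the fly ssh console -s list by hand.

```python
def split_live_and_stale(live_ids, listed_ids):
    """Partition the instances offered by the SSH picker into live and stale.

    live_ids:   IDs reported by `fly status` (treated as the source of truth).
    listed_ids: IDs shown in the `fly ssh console -s` list.
    """
    live = set(live_ids)
    reachable = [vm_id for vm_id in listed_ids if vm_id in live]
    stale = [vm_id for vm_id in listed_ids if vm_id not in live]
    return reachable, stale


# Hypothetical IDs, for illustration only.
status_ids = ["aaaa1111", "bbbb2222", "cccc3333"]
ssh_list_ids = ["aaaa1111", "dead0000", "bbbb2222", "cccc3333"]

reachable, stale = split_live_and_stale(status_ids, ssh_list_ids)
print("safe to pick:", reachable)
print("likely dead:", stale)
```

Anything that lands in the second list is an entry the picker still offers even though fly status no longer reports it.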
Ah, my mistake, I think you’d want to dig txt here instead.
You’ll need both TXT and @fdaa:0:33::3 (different IP), or: fly dig -a staxcloud-staging TXT staxcloud-staging.internal
Oh sorry:
fly dig -a staxcloud-staging TXT vms.staxcloud-staging.internal
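If you want to compare that TXT answer against fly status in a script, a small parser might look like the sketch below. It assumes the record value is a comma-separated list of "vm-id region" pairs, which is how I understand the vms.<app>.internal record to be laid out; the sample value and the function name are invented for the example.

```python
def parse_vms_txt(txt_value):
    """Parse 'id region,id region' into a list of (vm_id, region) tuples.

    Assumption: the TXT value is a comma-separated list of
    '<vm id> <region>' pairs, optionally wrapped in double quotes
    the way dig prints TXT data.
    """
    entries = []
    for item in txt_value.strip().strip('"').split(","):
        parts = item.split()
        if len(parts) == 2:  # skip empty or malformed items
            entries.append((parts[0], parts[1]))
    return entries


# Hypothetical record value, for illustration only.
sample = '"aaaa1111 ams,bbbb2222 ams,cccc3333 fra"'
for vm_id, region in parse_vms_txt(sample):
    print(vm_id, region)
```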
It seems fine. Let me see if I can reproduce it. fly ssh console -s is not showing dead VMs right now.
Yes, that seems correct. I expect the list from fly ssh console -s -a staxcloud-staging is also correct now?
When there’s a deploy, there’s a short period where, due to the nature of distributed systems, we’re reconciling state and it’s possible for old VMs to show up.
Yeah, alright, I understand the technical difficulty. It would be nice if it weren’t a problem for us, though.
I have a few ideas I can try.
there’s a short period where, due to the nature of distributed systems
It’s not really a short period though. I don’t know what you consider short, but I am comparing it to read replicas catching up to write replicas. The dead VMs sometimes show up for up to 5 minutes in fly ssh console -s
A few seconds is what I’d expect.
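To put a number on how long the stale entries linger, something like the following sketch could be used. It is not a Fly tool: fetch_listed_ids is a placeholder for however you obtain the current list (for example by parsing the TXT record), and the IDs in the demo are made up.

```python
import time


def seconds_until_converged(fetch_listed_ids, live_ids, timeout=300.0,
                            interval=1.0, clock=time.monotonic,
                            sleep=time.sleep):
    """Poll until every listed ID is a live ID; return the elapsed seconds.

    Returns None if stale entries are still listed once `timeout` seconds
    have passed. `fetch_listed_ids` is a caller-supplied callable
    (hypothetical here) that returns the currently listed VM IDs.
    """
    live = set(live_ids)
    start = clock()
    while True:
        if set(fetch_listed_ids()) <= live:
            return clock() - start
        if clock() - start >= timeout:
            return None
        sleep(interval)


# Demo with canned responses instead of a real lookup.
responses = iter([["aaaa1111", "dead0000"], ["aaaa1111", "dead0000"], ["aaaa1111"]])
elapsed = seconds_until_converged(lambda: next(responses), ["aaaa1111"], interval=0.01)
print("converged:", elapsed is not None)
```

The default timeout of 300 seconds simply mirrors the "up to 5 minutes" observed above.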
I dug further and, based on your other post, it looks like you’re connecting to one of our “backup” gateways, which is using an older, slower-to-replicate version of our DNS server.
Can you try fly wireguard reset for the same org that the app you’re playing with is in? This will get you a new WireGuard peer in a primary region.
I’m going to be investigating how you got a peer on that specific gateway so it doesn’t happen again.
I understood half of what you said, but I ran fly wireguard reset for the organisation. Thanks for looking into it.
Yep, it took about 2 seconds today for the (machine) VM list to get up to date for me.
And this has been the case for quite some time now: Does stopped VMs incur costs? - #3 by ignoramous