I’ve had a database connected to my app for months, but now the app is suddenly unable to connect. I also tried proxying to it with flyctl, but it can’t find the hostname.
My dashboard is green, and so is the Fly status page, so I am a bit lost. My app has been unchanged for a month and had been working with no problems until I discovered this today.
Looks like this was on a host that crashed and was restarted on the 9th. It started back up at 4:32 PM and your machine was restarted a few minutes after that.
Seems like there was a bug and the private IP addresses were not properly registered in our state.
I stopped and started your machine, and it is back; DNS responses seem correct again. This allowed the app that connects to the database to start back up too.
We’re going to look through what might’ve happened and implement a more permanent fix.
Great, thanks! Any way I could have detected this? Either automatically (from the app) or manually? And how could I have fixed it myself (automatically)?
We don’t yet have alerting features like that, though we are working on something.
You couldn’t have detected this, automatically or manually, by directly observing your database instances. The only indirect sign was that your app failed to connect to the database on boot. External monitoring services might’ve alerted you about this earlier (I’m just throwing out ideas, not saying you should’ve had that).
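For what it’s worth, here is a rough sketch of the kind of check an app or an external monitor could run. It is not an official Fly feature, just an illustration: it resolves the database hostname first (the step that failed here) and then tries a plain TCP connection. The function name check_database is made up for this example, and the default port 5432 is an assumption (the usual Postgres port).

```python
import socket


def check_database(host, port=5432, timeout=3.0):
    """Return (ok, reason) after a DNS lookup followed by a TCP connect.

    Hypothetical helper for illustration; port 5432 is an assumed default.
    """
    try:
        # Step 1: DNS resolution. A failure here matches the symptom in
        # this thread, where the hostname could not be found.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return False, f"dns: {exc}"

    # Step 2: TCP connect to the first address that DNS returned.
    family, socktype, proto, _canonname, sockaddr = infos[0]
    try:
        with socket.socket(family, socktype, proto) as sock:
            sock.settimeout(timeout)
            sock.connect(sockaddr)
    except OSError as exc:
        return False, f"connect: {exc}"

    return True, "ok"
```

You would call it with your database’s internal hostname, whatever that is for your setup, and alert when the reason starts with "dns:". That separates "the name does not resolve" from "the name resolves but nothing is listening", which are different problems with different fixes.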
We’re looking into the sequence of events that led to this situation. It seemed correlated with the host rebooting.
Restarting the database instance (the machine, via fly machine restart, not via fly pg restart) would’ve fixed this. That’s how I fixed it. We now have a better way to fix it without a restart in case it happens again. Ideally it won’t; that’s what we’re working towards.