Postgres machines down?

barronwebster · February 8, 2023, 4:28pm

The DB had been working fine for a few weeks and I didn’t change anything, but now I can’t connect to it from my app. Last I checked it wasn’t close to full. I can’t restart it or see the config with the fly CLI.

fly pg config -a mayorgame-db show
gives me:

command is not compatible with this image

fly pg restart --app <name>
gives me:

Error failed to obtain lease: failed to get lease on VM 73287903f11685: dial tcp [fc01:a7b:92::]:3593: i/o timeout

Metrics aren’t showing up in my logs either.

kurt · February 8, 2023, 4:35pm

This database is on a host that had a hardware failure. We’re working to restore it. We’re also working on ways to communicate hardware failures more aggressively, because there’s no way you would have known! It is statistically likely we’ll have hardware fail most days, so we don’t update our global status page per host anymore, but you should still be able to figure this out.

When it comes back, I would recommend adding a second Postgres node to your cluster. This is the best way to ensure you’re resilient to hardware issues.

barronwebster · February 8, 2023, 4:49pm

Thanks kurt! How would one do this?

kurt · February 8, 2023, 4:51pm

This should do it: High Availability & Global Replication · Fly Docs
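
For readers who land here without the linked page, here is a rough sketch of what adding a second node can look like on a Machines-based Postgres app. It only prints the commands rather than running them. The app name, machine ID, and region are placeholders that do not come from this thread, and the function name `add_replica_plan` is hypothetical. Older (non-Machines) Postgres apps use a different procedure, so treat this as an assumption and check the docs for your setup.

```shell
# Dry-run sketch: prints the fly commands instead of executing them.
# PG_APP, MACHINE_ID and REGION are placeholders, not values from this thread.
PG_APP="my-pg-app"
MACHINE_ID="<machine-id>"
REGION="ord"

add_replica_plan() {
  # $1 = Postgres app name, $2 = ID of an existing healthy machine, $3 = region
  # 1. List machines to find the ID of an existing cluster member.
  echo "fly machine list --app $1"
  # 2. Clone that machine to add a second member to the cluster.
  echo "fly machine clone $2 --region $3 --app $1"
  # 3. Check that the app now reports two members.
  echo "fly status --app $1"
}

add_replica_plan "$PG_APP" "$MACHINE_ID" "$REGION"
```

Review the printed commands and run them by hand once the database is reachable again. Putting the clone in a different region from the primary is a separate choice, and this sketch does not decide it for you.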

alexmacarthur · April 12, 2024, 4:48am

I’m experiencing this issue now. It’s been two days. Do we know when it’ll be addressed?
