Postgres database primary region node (failed to connect) pg check failing

How my problem was resolved.

The trick was to look at the volumes (which I did):

> fly volumes list -a my-postgres-cluster

```
ID          STATE     NAME   SIZE   REGION   ZONE   ENCRYPTED   ATTACHED VM   CREATED AT
vol_****    created   ***    xGB    mia      **     true        *****         2 months ago
vol_****    created   ***    xGB    dfw      **     true        *****         3 days ago
vol_****    created   ***    xGB    ewr      **     true        *****         1 week ago
vol_****    created   ***    xGB    mia      **     true                      2 months ago
vol_****    created   ***    xGB    lax      **     true        *****         2 months ago
```

and to realize that one of the volumes was not attached to a VM.

Even though I saw that a volume was unattached, I didn’t think to simply scale up so an instance could be created for it. It seems obvious now, right?
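If you’d rather not eyeball the table, you can script the check. This is only a sketch: the sample output below is hypothetical and simplified, and the real `fly volumes list` column layout may differ between flyctl versions, so adjust the field count to match your output.

```shell
#!/bin/sh
# Hypothetical, simplified sample of `fly volumes list` output.
# In practice you would pipe the real command instead:
#   fly volumes list -a my-postgres-cluster
sample='ID        STATE    NAME  SIZE  REGION  ATTACHED_VM
vol_aaaa  created  pg    10GB  mia     vm_1234
vol_bbbb  created  pg    10GB  dfw     vm_5678
vol_cccc  created  pg    10GB  mia'

# Rows with fewer whitespace-separated fields than the header (6 here)
# are missing the ATTACHED_VM value, i.e. the volume is unattached.
echo "$sample" | awk 'NR > 1 && NF < 6 {print $1 " (" $5 ") has no attached VM"}'
```

For the sample above this prints `vol_cccc (mia) has no attached VM`, which is exactly the situation I had: one mia volume with nothing running on it.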

So to fix it, I had to change my scale. Originally, I was using this:

```
fly scale count 4 --max-per-region=1 -a my-postgres-cluster
```

and I switched it to this:

```
fly scale count 5 --max-per-region=2 -a my-postgres-cluster
```

Raising `--max-per-region` to 2 allowed the duplicate-region volume to be accounted for: a new instance was created in mia alongside the existing mia instance, and the cluster could then heal itself.

I had to email fly support to get this understanding. Lessons learned the hard way.

PS: the reason I changed the RAM so rapidly was that I misread a graph in Grafana. I thought RAM was full, but used RAM was actually close to 0 and the graph was showing total RAM. Another lesson learned, the very hard way.
