I’ve been having issues connecting to my Postgres app since yesterday. I hadn’t made any changes or deploys recently; the only changes came afterwards, while I was trying to fix it.
The logs for the Postgres instance show the following, and these lines keep repeating:
2022-04-20T22:05:48Z app[f8a8b0aa] sjc [info]checking stolon status
2022-04-20T22:05:48Z app[f8a8b0aa] sjc [info]sentinel | 2022-04-20T22:05:48.903Z WARN cmd/sentinel.go:276 no keeper info available {"db": "c37d4d88", "keeper": "22950bdbf2"}
2022-04-20T22:05:48Z app[f8a8b0aa] sjc [info]sentinel | 2022-04-20T22:05:48.903Z WARN cmd/sentinel.go:276 no keeper info available {"db": "f939a1f5", "keeper": "ad10bdc02"}
2022-04-20T22:05:48Z app[f8a8b0aa] sjc [info]sentinel | 2022-04-20T22:05:48.906Z INFO cmd/sentinel.go:995 master db is failed {"db": "c37d4d88", "keeper": "22950bdbf2"}
2022-04-20T22:05:48Z app[f8a8b0aa] sjc [info]sentinel | 2022-04-20T22:05:48.906Z INFO cmd/sentinel.go:1001 db not converged {"db": "c37d4d88", "keeper": "22950bdbf2"}
2022-04-20T22:05:48Z app[f8a8b0aa] sjc [info]sentinel | 2022-04-20T22:05:48.906Z INFO cmd/sentinel.go:1006 trying to find a new master to replace failed master
2022-04-20T22:05:48Z app[f8a8b0aa] sjc [info]sentinel | 2022-04-20T22:05:48.906Z ERROR cmd/sentinel.go:1009 no eligible masters
Running fly checks list returns:
Health Checks for lolsite-db
NAME | STATUS | ALLOCATION | REGION | TYPE | LAST UPDATED | OUTPUT
-------*----------*------------*--------*------*--------------*-------------------------------------------------------------------------------------------------------------------------------------
vm | passing | f8a8b0aa | sjc | HTTP | 4m6s ago | HTTP GET http://172.19.8.186:5500/flycheck/vm: 200 OK Output: "[✓]
| | | | | |
| | | | | | checkDisk: 92.86 GB (94.8%) free space on /data/ (32.28µs)\n[✓]
| | | | | |
| | | | | | checkLoad: load averages: 0.01 0.04 0.02 (49.5µs)\n[✓]
| | | | | |
| | | | | | memory: system spent 0s of the last 60s waiting on memory (21.48µs)\n[✓]
| | | | | |
| | | | | | cpu: system spent 522ms of the last 60s waiting on cpu (16.73µs)\n[✓]
| | | | | |
| | | | | | io: system spent 0s of the last 60s waiting on io (13.21µs)"[✓]
| | | | | |
| | | | | |
role | critical | f8a8b0aa | sjc | HTTP | 17m38s ago | failed to connect to local node: context deadline exceeded[✓]
| | | | | |
| | | | | |
pg | critical | f8a8b0aa | sjc | HTTP | 17m46s ago | HTTP GET http://172.19.8.186:5500/flycheck/pg: 500 Internal Server Error Output: "failed to connect to proxy: context deadline exceeded"[✓]
The app connected to this database shows this connection error:
2022-04-20T08:08:10Z app[e1905f1f] dfw [info]django.db.utils.OperationalError: connection to server at "lolsite-db.internal" (fdaa:0:3161:a7b:2295:0:d0e5:2), port 5432 failed: server closed the connection unexpectedly
2022-04-20T08:08:10Z app[e1905f1f] dfw [info] This probably means the server terminated abnormally
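In case it helps anyone reproduce this outside of Django, here is a minimal sketch of a plain TCP reachability probe. It only checks whether something accepts a connection on the port; it does not speak the Postgres protocol, so it cannot tell a healthy server from one that accepts and then drops the connection. The function name can_connect is my own (hypothetical), and the host and port in the commented usage line are taken from the error message above.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper for illustration only: it tests reachability,
    not whether Postgres behind the port is actually healthy.
    """
    try:
        # create_connection resolves the host name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Example usage from a machine on the same private network (assumption:
# the host name and port are the ones shown in the Django error above):
# print(can_connect("lolsite-db.internal", 5432))
```

A True result with the Django error still occurring would suggest the proxy is listening but has no healthy backend to hand the connection to, which would be consistent with the "no eligible masters" log line, though I can't confirm that from this alone.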
Output of fly status --all:
App
Name = lolsite-db
Owner = personal
Version = 8
Status = running
Hostname = lolsite-db.fly.dev
Instances
ID PROCESS VERSION REGION DESIRED STATUS HEALTH CHECKS RESTARTS CREATED
f8a8b0aa app 8 sjc run running (failed to co) 3 total, 1 passing, 2 critical 1 7h22m ago
645b3548 app 8 sjc stop complete 3 total, 1 passing, 2 critical 0 14h23m ago
6a7a50b5 app 8 sjc stop failed 0 14h24m ago
I saw a similar post that seemed to have been resolved by increasing the volume size. I didn’t think that was my issue (the vm check above reports 94.8% free space on /data/), but I tried it anyway with no luck.