Postgres server down

I have a pg app that seems to have been down for perhaps an hour. When I look at the monitoring tab, I see the following errors:

2023-01-09T22:52:43.511 app[4d896d2f291287] ord [info] sentinel | 2023-01-09T22:52:43.511Z WARN cmd/sentinel.go:276 no keeper info available {"db": "c82203b5", "keeper": "9adae6573ca02"}

2023-01-09T22:52:54.932 app[4d896d2f291287] ord [info] keeper | 2023-01-09T22:52:54.930Z ERROR cmd/keeper.go:1041 error retrieving cluster data {"error": "Unexpected response code: 500"}

2023-01-09T23:00:12.672 app[4d896d2f291287] ord [info] sentinel | 2023-01-09T23:00:12.671Z WARN cmd/sentinel.go:276 no keeper info available {"db": "c82203b5", "keeper": "9adae6573ca02"}

2023-01-10T02:58:16.042 app[4d896d2f291287] ord [info] keeper | 2023-01-10 02:58:16.038 UTC [2118] LOG: PID 25459 in cancel request did not match any process

2023-01-10T03:04:29.086 app[4d896d2f291287] ord [info] keeper | 2023-01-10 03:04:29.082 UTC [4221] LOG: PID 2590 in cancel request did not match any process

2023-01-10T03:04:49.330 app[4d896d2f291287] ord [info] keeper | 2023-01-10 03:04:49.324 UTC [4338] LOG: PID 2953 in cancel request did not match any process

2023-01-10T03:09:29.701 app[4d896d2f291287] ord [info] keeper | 2023-01-10 03:09:29.697 UTC [5909] LOG: PID 2156 in cancel request did not match any process

2023-01-10T03:09:46.979 app[4d896d2f291287] ord [info] keeper | 2023-01-10 03:09:46.976 UTC [6006] LOG: PID 3216 in cancel request did not match any process

2023-01-10T03:15:30.718 app[4d896d2f291287] ord [info] keeper | 2023-01-10 03:15:30.714 UTC [7932] LOG: PID 3538 in cancel request did not match any process

2023-01-10T03:15:49.182 app[4d896d2f291287] ord [info] keeper | 2023-01-10 03:15:49.177 UTC [8043] LOG: PID 3405 in cancel request did not match any process

I am trying to restart the pg app, but that too seems to get stuck and eventually fails:

fly pg restart -a <app-name>
Identifying cluster role(s)
  Machine 4d896d2f291287: leader
Restarting machine 4d896d2f291287
Error could not stop machine 4d896d2f291287: failed to restart VM 4d896d2f291287: Post "http://[fdaa:1:459::3]:4280/v1/apps/<app-name>/machines/4d896d2f291287/restart?force_stop=false": EOF

An update: after restarting my server app (which uses this database), connectivity between the app and the database seems to be restored, but all queries are super slow.
