Can't connect to or restart database

My database doesn’t seem to be reachable. When I open a Ruby console on a machine, the connection hangs and eventually times out. I see the following warnings in the logs:

2023-11-12T16:32:56.228 app[3d8dd1ef952e89] ams [info] sentinel | 2023-11-12T16:32:56.227Z WARN cmd/sentinel.go:276 no keeper info available {"db": "b5248d64", "keeper": "23c50c3642"}
2023-11-12T16:34:37.143 app[3d8dd1ef952e89] ams [info] sentinel | 2023-11-12T16:34:37.142Z WARN cmd/sentinel.go:276 no keeper info available {"db": "b5248d64", "keeper": "23c50c3642"}
2023-11-12T16:36:08.441 app[3d8dd1ef952e89] ams [info] sentinel | 2023-11-12T16:36:08.440Z WARN cmd/sentinel.go:276 no keeper info available {"db": "b5248d64", "keeper": "23c50c3642"}
2023-11-13T09:00:33.710 app[3d8dd1ef952e89] ams [info] proxy | [WARNING] 316/090033 (562) : Server bk_db/pg1 is going DOWN for maintenance (DNS timeout status). 0 active and 1 backup servers left. Running on backup. 1 sessions active, 0 requeued, 0 remaining in queue.
2023-11-13T09:01:03.740 app[3d8dd1ef952e89] ams [info] proxy | [WARNING] 316/090103 (562) : Server bk_db/pg1 ('ams.sportbrook-db.internal') is UP/READY (resolves again).
2023-11-13T09:01:03.740 app[3d8dd1ef952e89] ams [info] proxy | [WARNING] 316/090103 (562) : Server bk_db/pg1 administratively READY thanks to valid DNS answer.

At the same time, when I try to restart I get the following errors:

flyctl machine restart 3d8dd1ef952e89 -a sportbrook-db
Error: could not get machine 3d8dd1ef952e89: failed to get VM 3d8dd1ef952e89: request returned non-2xx status, 504 (Request ID: 01HF435BFS0G26KRPQ66TRQTRN-fra)
fly pg restart -a sportbrook-db
Error: failed to obtain lease: failed to get lease on VM 3d8dd1ef952e89: request returned non-2xx status, 504 (Request ID: 01HF43NT5QKBDZVD8XNTSFPARF-fra)

I see nothing on the status page. What am I doing wrong?

Hi @tim-ror

We’re performing emergency maintenance on the host your database app is on. You should be able to see a status notification in your dashboard now.

Unfortunately, there’s always a risk of downtime when running a single machine/volume. Once the issue is resolved, you might need to restart your Postgres cluster. Let us know if you run into any further trouble after that.
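
If the restart still returns a 504 shortly after the maintenance ends, one option is to retry it with a pause between attempts. This is only a minimal sketch, assuming a POSIX shell; the `retry` helper is hypothetical and not part of flyctl, and the attempt count and delay are arbitrary:

```shell
#!/bin/sh
# Hypothetical helper (not part of flyctl):
#   retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it exits 0 or ATTEMPTS tries have been made.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Run the command exactly as passed; stop at the first success.
        if "$@"; then
            return 0
        fi
        echo "attempt $i/$attempts failed: $*" >&2
        # Only sleep if another attempt is still to come.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}
```

With that defined, something like `retry 5 60 fly pg restart -a sportbrook-db` would try the restart from your post up to five times, waiting a minute between attempts, and give up with a non-zero exit status if none of them succeed.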
