PG issues `backend 'bk_db' has no server available!`, FRA region

The app started having issues today; there were no recent deployments.

2022-10-23T09:45:29.533 app[8e654884] fra [info] proxy | [ALERT] 295/094529 (579) : backend 'bk_db' has no server available!
2022-10-23T09:45:35.203 app[8e654884] fra [info] sentinel | 2022-10-23T09:45:35.201Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:45:35.203 app[8e654884] fra [info] sentinel | 2022-10-23T09:45:35.203Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:45:43.125 app[8e654884] fra [info] sentinel | 2022-10-23T09:45:43.124Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:45:43.131 app[8e654884] fra [info] sentinel | 2022-10-23T09:45:43.130Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:45:47.578 app[8e654884] fra [info] proxy | [WARNING] 295/094547 (579) : Server bk_db/pg1 is UP, reason: Layer7 check passed, code: 200, check duration: 13ms. 1 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
2022-10-23T09:45:53.168 app[8e654884] fra [info] sentinel | 2022-10-23T09:45:53.167Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:45:53.172 app[8e654884] fra [info] sentinel | 2022-10-23T09:45:53.169Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:46:03.671 app[8e654884] fra [info] sentinel | 2022-10-23T09:46:03.668Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:46:03.672 app[8e654884] fra [info] sentinel | 2022-10-23T09:46:03.670Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:46:08.584 app[8e654884] fra [info] proxy | [WARNING] 295/094608 (579) : Server bk_db/pg1 is DOWN, reason: Layer7 timeout, check duration: 5000ms. 0 active and 0 backup servers left. 5 sessions active, 0 requeued, 0 remaining in queue.
2022-10-23T09:46:08.584 app[8e654884] fra [info] proxy | [ALERT] 295/094608 (579) : backend 'bk_db' has no server available!
2022-10-23T09:46:15.138 app[8e654884] fra [info] sentinel | 2022-10-23T09:46:15.137Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:46:15.139 app[8e654884] fra [info] sentinel | 2022-10-23T09:46:15.139Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:46:21.374 app[8e654884] fra [info] sentinel | 2022-10-23T09:46:21.373Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:46:21.378 app[8e654884] fra [info] sentinel | 2022-10-23T09:46:21.377Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:46:26.622 app[8e654884] fra [info] proxy | [WARNING] 295/094626 (579) : Server bk_db/pg1 is UP, reason: Layer7 check passed, code: 200, check duration: 12ms. 1 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
2022-10-23T09:46:31.789 app[8e654884] fra [info] sentinel | 2022-10-23T09:46:31.788Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:46:31.791 app[8e654884] fra [info] sentinel | 2022-10-23T09:46:31.790Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:46:43.330 app[8e654884] fra [info] sentinel | 2022-10-23T09:46:43.330Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:46:43.331 app[8e654884] fra [info] sentinel | 2022-10-23T09:46:43.331Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:46:47.633 app[8e654884] fra [info] proxy | [WARNING] 295/094647 (579) : Server bk_db/pg1 is DOWN, reason: Layer7 timeout, check duration: 5001ms. 0 active and 0 backup servers left. 5 sessions active, 0 requeued, 0 remaining in queue.
2022-10-23T09:46:47.633 app[8e654884] fra [info] proxy | [ALERT] 295/094647 (579) : backend 'bk_db' has no server available!
2022-10-23T09:46:53.017 app[8e654884] fra [info] sentinel | 2022-10-23T09:46:53.015Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:46:53.017 app[8e654884] fra [info] sentinel | 2022-10-23T09:46:53.017Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:47:05.075 app[8e654884] fra [info] sentinel | 2022-10-23T09:47:05.074Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:47:05.076 app[8e654884] fra [info] sentinel | 2022-10-23T09:47:05.075Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:47:16.532 app[8e654884] fra [info] sentinel | 2022-10-23T09:47:16.528Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:47:16.532 app[8e654884] fra [info] sentinel | 2022-10-23T09:47:16.529Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:47:29.534 app[8e654884] fra [info] sentinel | 2022-10-23T09:47:29.531Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:47:29.534 app[8e654884] fra [info] sentinel | 2022-10-23T09:47:29.533Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:47:39.665 app[8e654884] fra [info] sentinel | 2022-10-23T09:47:39.661Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:47:39.668 app[8e654884] fra [info] sentinel | 2022-10-23T09:47:39.664Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:47:49.541 app[8e654884] fra [info] sentinel | 2022-10-23T09:47:49.541Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:47:49.543 app[8e654884] fra [info] sentinel | 2022-10-23T09:47:49.542Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:47:58.823 app[8e654884] fra [info] sentinel | 2022-10-23T09:47:58.822Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:47:58.835 app[8e654884] fra [info] sentinel | 2022-10-23T09:47:58.833Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:48:09.769 app[8e654884] fra [info] sentinel | 2022-10-23T09:48:09.762Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:48:09.770 app[8e654884] fra [info] sentinel | 2022-10-23T09:48:09.769Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:48:17.794 app[8e654884] fra [info] sentinel | 2022-10-23T09:48:17.793Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:48:17.796 app[8e654884] fra [info] sentinel | 2022-10-23T09:48:17.795Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:48:27.122 app[8e654884] fra [info] sentinel | 2022-10-23T09:48:27.121Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:48:27.123 app[8e654884] fra [info] sentinel | 2022-10-23T09:48:27.123Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:48:37.948 app[8e654884] fra [info] sentinel | 2022-10-23T09:48:37.947Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:48:37.952 app[8e654884] fra [info] sentinel | 2022-10-23T09:48:37.951Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:48:50.746 app[8e654884] fra [info] sentinel | 2022-10-23T09:48:50.745Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:48:50.748 app[8e654884] fra [info] sentinel | 2022-10-23T09:48:50.747Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:49:05.722 app[8e654884] fra [info] sentinel | 2022-10-23T09:49:05.719Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:49:05.740 app[8e654884] fra [info] sentinel | 2022-10-23T09:49:05.737Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:49:16.668 app[8e654884] fra [info] sentinel | 2022-10-23T09:49:16.666Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:49:16.668 app[8e654884] fra [info] sentinel | 2022-10-23T09:49:16.668Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:49:22.248 app[8e654884] fra [info] proxy | [WARNING] 295/094922 (579) : Backup Server bk_db/pg is UP, reason: Layer7 check passed, code: 200, check duration: 14ms. 0 active and 1 backup servers online. Running on backup. 0 sessions requeued, 0 total in queue.
2022-10-23T09:49:26.217 app[8e654884] fra [info] sentinel | 2022-10-23T09:49:26.215Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:49:26.219 app[8e654884] fra [info] sentinel | 2022-10-23T09:49:26.218Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:49:38.802 app[8e654884] fra [info] sentinel | 2022-10-23T09:49:38.800Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:49:38.806 app[8e654884] fra [info] sentinel | 2022-10-23T09:49:38.803Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:49:43.262 app[8e654884] fra [info] proxy | [WARNING] 295/094943 (579) : Backup Server bk_db/pg is DOWN, reason: Layer7 timeout, check duration: 5003ms. 0 active and 0 backup servers left. 4 sessions active, 0 requeued, 0 remaining in queue.
2022-10-23T09:49:43.262 app[8e654884] fra [info] proxy | [ALERT] 295/094943 (579) : backend 'bk_db' has no server available!
2022-10-23T09:49:48.519 app[8e654884] fra [info] sentinel | 2022-10-23T09:49:48.518Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:49:48.522 app[8e654884] fra [info] sentinel | 2022-10-23T09:49:48.520Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:49:59.140 app[8e654884] fra [info] sentinel | 2022-10-23T09:49:59.140Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:49:59.142 app[8e654884] fra [info] sentinel | 2022-10-23T09:49:59.142Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:49:59.861 app[8e654884] fra [info] proxy | [WARNING] 295/094959 (579) : Server bk_db/pg1 is UP, reason: Layer7 check passed, code: 200, check duration: 21ms. 1 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
2022-10-23T09:50:12.419 app[8e654884] fra [info] sentinel | 2022-10-23T09:50:12.418Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:50:12.421 app[8e654884] fra [info] sentinel | 2022-10-23T09:50:12.421Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:50:20.871 app[8e654884] fra [info] proxy | [WARNING] 295/095020 (579) : Server bk_db/pg1 is DOWN, reason: Layer7 timeout, check duration: 5000ms. 0 active and 0 backup servers left. 3 sessions active, 0 requeued, 0 remaining in queue.
2022-10-23T09:50:20.871 app[8e654884] fra [info] proxy | [ALERT] 295/095020 (579) : backend 'bk_db' has no server available!
2022-10-23T09:50:21.146 app[8e654884] fra [info] sentinel | 2022-10-23T09:50:21.145Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:50:21.147 app[8e654884] fra [info] sentinel | 2022-10-23T09:50:21.147Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:50:36.975 app[8e654884] fra [info] sentinel | 2022-10-23T09:50:36.972Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:50:36.975 app[8e654884] fra [info] sentinel | 2022-10-23T09:50:36.974Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:50:43.512 app[8e654884] fra [info] sentinel | 2022-10-23T09:50:43.508Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:50:43.514 app[8e654884] fra [info] sentinel | 2022-10-23T09:50:43.514Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:50:52.418 app[8e654884] fra [info] sentinel | 2022-10-23T09:50:52.417Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:50:52.426 app[8e654884] fra [info] sentinel | 2022-10-23T09:50:52.423Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:50:54.933 app[8e654884] fra [info] proxy | [WARNING] 295/095054 (579) : Server bk_db/pg1 is UP, reason: Layer7 check passed, code: 200, check duration: 12ms. 1 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
2022-10-23T09:51:05.645 app[8e654884] fra [info] sentinel | 2022-10-23T09:51:05.644Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:51:05.647 app[8e654884] fra [info] sentinel | 2022-10-23T09:51:05.647Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:51:13.938 app[8e654884] fra [info] sentinel | 2022-10-23T09:51:13.937Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:51:13.946 app[8e654884] fra [info] sentinel | 2022-10-23T09:51:13.946Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:51:15.942 app[8e654884] fra [info] proxy | [WARNING] 295/095115 (579) : Server bk_db/pg1 is DOWN, reason: Layer7 timeout, check duration: 5002ms. 0 active and 0 backup servers left. 4 sessions active, 0 requeued, 0 remaining in queue.
2022-10-23T09:51:15.942 app[8e654884] fra [info] proxy | [ALERT] 295/095115 (579) : backend 'bk_db' has no server available!
2022-10-23T09:51:22.103 app[8e654884] fra [info] sentinel | 2022-10-23T09:51:22.102Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:51:22.104 app[8e654884] fra [info] sentinel | 2022-10-23T09:51:22.104Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:51:33.230 app[8e654884] fra [info] sentinel | 2022-10-23T09:51:33.230Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:51:33.231 app[8e654884] fra [info] sentinel | 2022-10-23T09:51:33.231Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:51:47.073 app[8e654884] fra [info] sentinel | 2022-10-23T09:51:47.071Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:51:47.076 app[8e654884] fra [info] sentinel | 2022-10-23T09:51:47.074Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:51:58.942 app[8e654884] fra [info] sentinel | 2022-10-23T09:51:58.939Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:51:58.942 app[8e654884] fra [info] sentinel | 2022-10-23T09:51:58.940Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:52:11.168 app[8e654884] fra [info] sentinel | 2022-10-23T09:52:11.166Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:52:11.171 app[8e654884] fra [info] sentinel | 2022-10-23T09:52:11.168Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:52:18.152 app[8e654884] fra [info] sentinel | 2022-10-23T09:52:18.151Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:52:18.153 app[8e654884] fra [info] sentinel | 2022-10-23T09:52:18.153Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:52:27.090 app[8e654884] fra [info] sentinel | 2022-10-23T09:52:27.086Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:52:27.098 app[8e654884] fra [info] sentinel | 2022-10-23T09:52:27.097Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:52:36.802 app[8e654884] fra [info] sentinel | 2022-10-23T09:52:36.798Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:52:36.809 app[8e654884] fra [info] sentinel | 2022-10-23T09:52:36.804Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:52:44.546 app[8e654884] fra [info] sentinel | 2022-10-23T09:52:44.545Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:52:44.548 app[8e654884] fra [info] sentinel | 2022-10-23T09:52:44.548Z ERROR cmd/sentinel.go:1009 no eligible masters
2022-10-23T09:52:52.081 app[8e654884] fra [info] sentinel | 2022-10-23T09:52:52.080Z WARN cmd/sentinel.go:276 no keeper info available {"db": "41ea5c98", "keeper": "23c312b842"}
2022-10-23T09:52:52.082 app[8e654884] fra [info] sentinel | 2022-10-23T09:52:52.081Z ERROR cmd/sentinel.go:1009 no eligible masters

I tried to rescale the DB after seeing those issues, but it got stuck at:

2022-10-23T09:53:13.718 runner[edbb0830] fra [info] Starting instance
2022-10-23T09:53:19.243 runner[edbb0830] fra [info] Configuring virtual machine
2022-10-23T09:53:19.248 runner[edbb0830] fra [info] Pulling container image
2022-10-23T09:53:21.638 runner[edbb0830] fra [info] Unpacking image
2022-10-23T09:53:21.698 runner[edbb0830] fra [info] Preparing kernel init
2022-10-23T09:53:22.024 runner[edbb0830] fra [info] Setting up volume 'pg_data'
2022-10-23T09:53:22.299 runner[edbb0830] fra [info] Configuring firecracker
2022-10-23T09:53:22.785 runner[edbb0830] fra [info] Starting virtual machine

After rescaling again, it works. I’d love to understand where these kinds of issues arise from.

You’ll need to check all the logs, but in my experience this (no eligible masters) happens when the Postgres process is killed by the OOM killer, with a log line something like:

Out of memory: Killed process 5353 (postgres) total-vm:974696kB, anon-rss:188968kB, file-rss:0kB, shmem-rss:550520kB, UID:1000 pgtables:1784kB oom_score_adj:0

Some checks will start to fail (you can check using flyctl checks list -a APP_NAME_HERE).
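If you save the logs to a file, a small script can confirm whether an OOM kill happened. This is only a minimal sketch: the regex is written against the single example line above, and the function name `find_oom_kills` is made up for illustration.

```python
import re

# Pattern written against the kernel OOM-killer line quoted above.
# Other kernels may format this line differently (assumption).
OOM_RE = re.compile(
    r"Out of memory: Killed process (?P<pid>\d+) \((?P<name>[^)]+)\)"
    r".*?anon-rss:(?P<anon_kb>\d+)kB.*?shmem-rss:(?P<shmem_kb>\d+)kB"
)

def find_oom_kills(lines):
    """Return one dict per OOM-kill line found in an iterable of log lines."""
    kills = []
    for line in lines:
        m = OOM_RE.search(line)
        if m:
            kills.append({
                "pid": int(m.group("pid")),
                "process": m.group("name"),
                "anon_rss_kb": int(m.group("anon_kb")),
                "shmem_rss_kb": int(m.group("shmem_kb")),
            })
    return kills

sample = [
    "Out of memory: Killed process 5353 (postgres) total-vm:974696kB, "
    "anon-rss:188968kB, file-rss:0kB, shmem-rss:550520kB, UID:1000 "
    "pgtables:1784kB oom_score_adj:0",
    "sentinel | ERROR cmd/sentinel.go:1009 no eligible masters",
]

for kill in find_oom_kills(sample):
    print(kill)
```

If this finds nothing in the full logs, the OOM theory probably doesn’t apply and something else took Postgres down.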

Maybe the app should be restarted to get back into a good state, but I don’t know how to configure it to restart when Postgres is killed.

One final note: you should check why your database needs so much memory. In my case, after a few weeks of monitoring, I found that it was related to long-running queries and “stuck open connections”: an app that was using the database was killed by the OOM killer, and for some reason Postgres did not detect it (I think it is related to this: TCP keepalive for a better PostgreSQL experience - CYBERTEC)
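For reference, the server-side settings involved are PostgreSQL’s `tcp_keepalives_idle`, `tcp_keepalives_interval` and `tcp_keepalives_count`. Here is a rough sketch of how they fit together. The values are made-up examples, not recommendations, and on a managed cluster the configuration may be controlled elsewhere, so `ALTER SYSTEM` might not be the right way to apply them.

```python
# Illustrative values only (assumptions, not settings taken from this thread).
KEEPALIVE_SETTINGS = {
    "tcp_keepalives_idle": 60,      # seconds of inactivity before the first probe
    "tcp_keepalives_interval": 10,  # seconds between unanswered probes
    "tcp_keepalives_count": 3,      # unanswered probes before the connection is dropped
}

def render_alter_system(settings):
    """Render SQL statements that would apply the settings and reload the config."""
    stmts = [f"ALTER SYSTEM SET {name} = {value};" for name, value in settings.items()]
    stmts.append("SELECT pg_reload_conf();")
    return stmts

def approx_detection_seconds(settings):
    """Rough time until a dead peer is noticed: idle + interval * count."""
    return (
        settings["tcp_keepalives_idle"]
        + settings["tcp_keepalives_interval"] * settings["tcp_keepalives_count"]
    )

for stmt in render_alter_system(KEEPALIVE_SETTINGS):
    print(stmt)
print("approx. seconds to detect a dead client:", approx_detection_seconds(KEEPALIVE_SETTINGS))
```

With these example values a vanished client would be noticed after roughly 90 seconds, instead of lingering until the OS defaults kick in.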

Hi, this is excellent advice, thanks. I’m pretty sure these are not the issues, though. I’ve already reduced the risk of long-running queries and idle connections by adjusting the configuration and setting low timeouts.

The first error was:

Server bk_db/pg1 is DOWN, reason: Layer7 timeout, check duration: 5001ms. 0 active and 1 backup servers left. Running on backup. 103 sessions active, 0 requeued, 0 remaining in queue.

That error means the built-in HAProxy can’t communicate with the Postgres leader. Unfortunately, the logs in the post don’t really say why it is unavailable. `fly checks list` will usually show you what’s going on.
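To see how the backend flaps between UP and DOWN, you can pull the HAProxy state changes out of the logs. This is a minimal sketch that assumes the log line format shown in the original post; the helper names `parse_transitions` and `up_durations` are hypothetical.

```python
import re
from datetime import datetime

# Matches the proxy [WARNING] lines from the original post, including the
# "Backup Server ..." variant (the regex is unanchored before "Server").
TRANSITION_RE = re.compile(
    r"^(?P<ts>\S+) .*proxy \| \[WARNING\].*?"
    r"Server (?P<server>\S+) is (?P<state>UP|DOWN), reason: (?P<reason>[^,.]+)"
)

def parse_transitions(lines):
    """Return (timestamp, server, state, reason) for each UP/DOWN line."""
    events = []
    for line in lines:
        m = TRANSITION_RE.search(line)
        if m:
            ts = datetime.strptime(m.group("ts"), "%Y-%m-%dT%H:%M:%S.%f")
            events.append((ts, m.group("server"), m.group("state"), m.group("reason")))
    return events

def up_durations(events):
    """Pair each UP with the next DOWN for the same server; return seconds spent UP."""
    went_up = {}
    durations = []
    for ts, server, state, _reason in events:
        if state == "UP":
            went_up[server] = ts
        elif server in went_up:
            durations.append((server, (ts - went_up.pop(server)).total_seconds()))
    return durations

sample_log = [
    "2022-10-23T09:45:47.578 app[8e654884] fra [info] proxy | [WARNING] 295/094547 (579) : "
    "Server bk_db/pg1 is UP, reason: Layer7 check passed, code: 200, check duration: 13ms. "
    "1 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.",
    "2022-10-23T09:46:08.584 app[8e654884] fra [info] proxy | [WARNING] 295/094608 (579) : "
    "Server bk_db/pg1 is DOWN, reason: Layer7 timeout, check duration: 5000ms. "
    "0 active and 0 backup servers left. 5 sessions active, 0 requeued, 0 remaining in queue.",
]

events = parse_transitions(sample_log)
for server, seconds in up_durations(events):
    print(f"{server} stayed UP for {seconds:.1f}s")
```

For the two sample lines, the server stays UP for about 21 seconds before the 5-second health check times out again. A short, repeating window like that suggests Postgres is answering only intermittently, not that it is fully down.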

I looked through the previous logs and there was a huge burst of messages like this before it went unavailable:

ERROR:  canceling statement due to statement timeout

It’s most likely that the Postgres process crashed due to some kind of load (or deadlocked) and it didn’t restart properly.

fly image update might have fixes that will help with this case, and should also have less chatty logs.

Alright, thanks, I’ll run the checks next time I encounter issues. Also, I updated the image. Thanks for checking and for your help, @crossworth and @kurt!