Our production Postgres instance suddenly went into a pending state and won’t recover.
fly status -a [redacted]
App
  Name     = [redacted]
  Owner    = [redacted]
  Version  = 0
  Status   = pending
  Hostname = [redacted]

Deployment Status
  ID          = d27f018e-3eb6-be78-6c73-ff495429133b
  Version     = v0
  Status      = successful
  Description = Deployment completed successfully
  Instances   = 2 desired, 2 placed, 2 healthy, 0 unhealthy

Instances
ID VERSION REGION DESIRED STATUS HEALTH CHECKS RESTARTS CREATED
And the log output:
2021-04-10T17:52:13.958Z f8fc2ce6 ams [info] sentinel | 2021-04-10T17:52:13.953Z WARN cmd/sentinel.go:276 no keeper info available {"db": "873c68a3", "keeper": "fdaa01850a7baa00150a2"}
2021-04-10T17:52:22.644Z f8fc2ce6 ams [info] sentinel | 2021-04-10T17:52:22.637Z WARN cmd/sentinel.go:276 no keeper info available {"db": "873c68a3", "keeper": "fdaa01850a7baa00150a2"}
2021-04-10T17:52:31.320Z f8fc2ce6 ams [info] sentinel | 2021-04-10T17:52:31.313Z WARN cmd/sentinel.go:276 no keeper info available {"db": "873c68a3", "keeper": "fdaa01850a7baa00150a2"}
2021-04-10T17:52:39.281Z f8fc2ce6 ams [info] Shutting down virtual machine
2021-04-10T17:52:39.999Z f8fc2ce6 ams [info] sentinel | 2021-04-10T17:52:39.992Z WARN cmd/sentinel.go:276 no keeper info available {"db": "873c68a3", "keeper": "fdaa01850a7baa00150a2"}
2021-04-10T17:52:45.163Z f8fc2ce6 ams [info] Sending signal SIGTERM to main child process w/ PID 509
2021-04-10T17:52:45.177Z f8fc2ce6 ams [info] postgres_exporter | Interrupting...
2021-04-10T17:52:45.178Z f8fc2ce6 ams [info] keeper | Interrupting...
2021-04-10T17:52:45.179Z f8fc2ce6 ams [info] sentinel | Interrupting...
2021-04-10T17:52:45.180Z f8fc2ce6 ams [info] proxy | Interrupting...
2021-04-10T17:52:45.182Z f8fc2ce6 ams [info] sentinel | 2021-04-10T17:52:45.168Z INFO cmd/sentinel.go:1816 stopping stolon sentinel
2021-04-10T17:52:45.210Z f8fc2ce6 ams [info] keeper | 2021-04-10 17:52:45.193 UTC [614] LOG: received SIGHUP, reloading configuration files
2021-04-10T17:52:45.222Z f8fc2ce6 ams [info] postgres_exporter | Exited
2021-04-10T17:52:45.225Z f8fc2ce6 ams [info] proxy | Exited
2021-04-10T17:52:45.226Z f8fc2ce6 ams [info] sentinel | Exited
2021-04-10T17:52:46.180Z f8fc2ce6 ams [info] Reaped child process with pid: 560 and signal: SIGHUP, core dumped? false
2021-04-10T17:52:46.181Z f8fc2ce6 ams [info] Reaped child process with pid: 614, exit code: 0
2021-04-10T17:52:46.225Z f8fc2ce6 ams [info] keeper | Exited
2021-04-10T17:52:48.188Z f8fc2ce6 ams [info] Main child exited normally with code: 0
2021-04-10T17:52:48.189Z f8fc2ce6 ams [info] Starting clean up.
2021-04-10T17:52:48.189Z f8fc2ce6 ams [info] Reaped child process with pid: 557, exit code: 0
2021-04-10T17:52:48.202Z f8fc2ce6 ams [info] Umounting /dev/vdc from /data
2021-04-10T17:52:50.295Z f8fc2ce6 ams [info] Starting instance
2021-04-10T17:52:50.344Z f8fc2ce6 ams [info] Configuring virtual machine
2021-04-10T17:52:50.353Z f8fc2ce6 ams [info] Pulling container image
2021-04-10T17:52:51.511Z f8fc2ce6 ams [info] Unpacking image
2021-04-10T17:52:51.534Z f8fc2ce6 ams [info] Preparing kernel init
2021-04-10T17:52:51.932Z f8fc2ce6 ams [info] Setting up volume 'pg_data'
2021-04-10T17:52:52.399Z f8fc2ce6 ams [info] Configuring firecracker
2021-04-10T17:52:54.539Z f8fc2ce6 ams [info] Starting virtual machine
2021-04-10T17:52:54.733Z f8fc2ce6 ams [info] Starting init (commit: 0512da4)...
2021-04-10T17:52:54.764Z f8fc2ce6 ams [info] Mounting /dev/vdc at /data
2021-04-10T17:52:54.769Z f8fc2ce6 ams [info] Running: `docker-entrypoint.sh /fly/start.sh` as root
2021-04-10T17:52:54.797Z f8fc2ce6 ams [info] 2021/04/10 17:52:54 listening on [fdaa:0:1850:a7b:aa3:0:1509:2]:22 (DNS: [fdaa::3]:53)
2021-04-10T17:52:54.989Z f8fc2ce6 ams [info] system | Tmux socket name: overmind-fly-mbrlW8IIpZWXzJ87RFl0eb
2021-04-10T17:52:54.991Z f8fc2ce6 ams [info] system | Tmux session ID: fly
2021-04-10T17:52:54.992Z f8fc2ce6 ams [info] system | Listening at ./.overmind.sock
2021-04-10T17:52:55.092Z f8fc2ce6 ams [info] keeper | Started with pid 554...
2021-04-10T17:52:55.097Z f8fc2ce6 ams [info] sentinel | Started with pid 557...
2021-04-10T17:52:55.098Z f8fc2ce6 ams [info] proxy | Started with pid 559...
2021-04-10T17:52:55.099Z f8fc2ce6 ams [info] postgres_exporter | Started with pid 563...
2021-04-10T17:52:55.226Z f8fc2ce6 ams [info] postgres_exporter | INFO[0000] Starting Server: :9187 source="postgres_exporter.go:1837"
2021-04-10T17:52:55.478Z f8fc2ce6 ams [info] sentinel | 2021-04-10T17:52:55.471Z INFO cmd/sentinel.go:2000 sentinel uid {"uid": "5852a8e4"}
2021-04-10T17:52:57.403Z f8fc2ce6 ams [info] sentinel | 2021-04-10T17:52:57.398Z INFO cmd/sentinel.go:82 Trying to acquire sentinels leadership
2021-04-10T17:52:57.441Z f8fc2ce6 ams [info] keeper | 2021-04-10T17:52:57.437Z ERROR cmd/keeper.go:688 cannot get configured pg parameters {"error": "dial unix /tmp/.s.PGSQL.5433: connect: no such file or directory"}
2021-04-10T17:52:57.672Z f8fc2ce6 ams [info] sentinel | 2021-04-10T17:52:57.668Z INFO cmd/sentinel.go:89 sentinel leadership acquired
2021-04-10T17:52:59.953Z f8fc2ce6 ams [info] keeper | 2021-04-10T17:52:59.945Z ERROR cmd/keeper.go:688 cannot get configured pg parameters {"error": "dial unix /tmp/.s.PGSQL.5433: connect: no such file or directory"}
2021-04-10T17:53:02.462Z f8fc2ce6 ams [info] keeper | 2021-04-10T17:53:02.454Z ERROR cmd/keeper.go:688 cannot get configured pg parameters {"error": "dial unix /tmp/.s.PGSQL.5433: connect: no such file or directory"}
2021-04-10T17:53:04.964Z f8fc2ce6 ams [info] keeper | 2021-04-10T17:53:04.956Z ERROR cmd/keeper.go:688 cannot get configured pg parameters {"error": "dial unix /tmp/.s.PGSQL.5433: connect: no such file or directory"}
2021-04-10T17:53:06.876Z f8fc2ce6 ams [info] sentinel | 2021-04-10T17:53:06.868Z WARN cmd/sentinel.go:276 no keeper info available {"db": "873c68a3", "keeper": "fdaa01850a7baa00150a2"}
2021-04-10T17:53:07.464Z f8fc2ce6 ams [info] keeper | 2021-04-10T17:53:07.456Z ERROR cmd/keeper.go:688 cannot get configured pg parameters {"error": "dial unix /tmp/.s.PGSQL.5433: connect: no such file or directory"}
2021-04-10T17:53:09.964Z f8fc2ce6 ams [info] keeper | 2021-04-10T17:53:09.957Z ERROR cmd/keeper.go:688 cannot get configured pg parameters {"error": "dial unix /tmp/.s.PGSQL.5433: connect: no such file or directory"}
2021-04-10T17:53:12.466Z f8fc2ce6 ams [info] keeper | 2021-04-10T17:53:12.458Z ERROR cmd/keeper.go:688 cannot get configured pg parameters {"error": "dial unix /tmp/.s.PGSQL.5433: connect: no such file or directory"}
2021-04-10T17:53:12.868Z f8fc2ce6 ams [info] keeper | 2021-04-10 17:53:12.864 UTC [611] LOG: starting PostgreSQL 12.5 (Debian 12.5-1.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
2021-04-10T17:53:12.871Z f8fc2ce6 ams [info] keeper | 2021-04-10 17:53:12.868 UTC [611] LOG: listening on IPv6 address "fdaa:0:1850:a7b:aa3:0:1509:2", port 5433
2021-04-10T17:53:12.874Z f8fc2ce6 ams [info] keeper | 2021-04-10 17:53:12.872 UTC [611] LOG: listening on Unix socket "/tmp/.s.PGSQL.5433"
2021-04-10T17:53:12.907Z f8fc2ce6 ams [info] keeper | 2021-04-10 17:53:12.904 UTC [613] LOG: database system was shut down at 2021-04-10 17:52:45 UTC
2021-04-10T17:53:12.917Z f8fc2ce6 ams [info] keeper | 2021-04-10 17:53:12.914 UTC [611] LOG: database system is ready to accept connections
2021-04-10T17:53:15.550Z f8fc2ce6 ams [info] sentinel | 2021-04-10T17:53:15.544Z WARN cmd/sentinel.go:276 no keeper info available {"db": "873c68a3", "keeper": "fdaa01850a7baa00150a2"}
2021-04-10T17:53:17.171Z f8fc2ce6 ams [info] postgres_exporter | INFO[0021] Established new database connection to "fdaa:0:1850:a7b:aa3:0:1509:2:5433". source="postgres_exporter.go:970"
2021-04-10T17:53:17.190Z f8fc2ce6 ams [info] postgres_exporter | INFO[0021] Semantic Version Changed on "fdaa:0:1850:a7b:aa3:0:1509:2:5433": 0.0.0 -> 12.5.0 source="postgres_exporter.go:1539"
2021-04-10T17:53:17.242Z f8fc2ce6 ams [info] postgres_exporter | INFO[0022] Established new database connection to "fdaa:0:1850:a7b:aa3:0:1509:2:5433". source="postgres_exporter.go:970"
2021-04-10T17:53:17.254Z f8fc2ce6 ams [info] postgres_exporter | INFO[0022] Semantic Version Changed on "fdaa:0:1850:a7b:aa3:0:1509:2:5433": 0.0.0 -> 12.5.0 source="postgres_exporter.go:1539"
2021-04-10T17:53:21.221Z f8fc2ce6 ams [info] sentinel | 2021-04-10T17:53:21.215Z WARN cmd/sentinel.go:276 no keeper info available {"db": "873c68a3", "keeper": "fdaa01850a7baa00150a2"}
2021-04-10T17:53:26.891Z f8fc2ce6 ams [info] sentinel | 2021-04-10T17:53:26.884Z WARN cmd/sentinel.go:276 no keeper info available {"db": "873c68a3", "keeper": "fdaa01850a7baa00150a2"}
2021-04-10T17:53:35.572Z f8fc2ce6 ams [info] sentinel | 2021-04-10T17:53:35.567Z WARN cmd/sentinel.go:276 no keeper info available {"db": "873c68a3", "keeper": "fdaa01850a7baa00150a2"}
2021-04-10T17:53:41.241Z f8fc2ce6 ams [info] sentinel | 2021-04-10T17:53:41.235Z WARN cmd/sentinel.go:276 no keeper info available {"db": "873c68a3", "keeper": "fdaa01850a7baa00150a2"}
2021-04-10T17:53:46.928Z f8fc2ce6 ams [info] sentinel | 2021-04-10T17:53:46.922Z WARN cmd/sentinel.go:276 no keeper info available {"db": "873c68a3", "keeper": "fdaa01850a7baa00150a2"}
2021-04-10T17:53:55.615Z f8fc2ce6 ams [info] sentinel | 2021-04-10T17:53:55.610Z WARN cmd/sentinel.go:276 no keeper info available {"db": "873c68a3", "keeper": "fdaa01850a7baa00150a2"}
2021-04-10T17:54:04.283Z f8fc2ce6 ams [info] sentinel | 2021-04-10T17:54:04.276Z WARN cmd/sentinel.go:276 no keeper info available {"db": "873c68a3", "keeper": "fdaa01850a7baa00150a2"}
2021-04-10T17:54:09.960Z f8fc2ce6 ams [info] sentinel | 2021-04-10T17:54:09.954Z WARN cmd/sentinel.go:276 no keeper info available {"db": "873c68a3", "keeper": "fdaa01850a7baa00150a2"}
2021-04-10T17:54:18.778Z f8fc2ce6 ams [info] sentinel | 2021-04-10T17:54:18.774Z WARN cmd/sentinel.go:276 no keeper info available {"db": "873c68a3", "keeper": "fdaa01850a7baa00150a2"}
2021-04-10T17:54:27.454Z f8fc2ce6 ams [info] sentinel | 2021-04-10T17:54:27.449Z WARN cmd/sentinel.go:276 no keeper info available {"db": "873c68a3", "keeper": "fdaa01850a7baa00150a2"}
2021-04-10T17:54:36.241Z f8fc2ce6 ams [info] sentinel | 2021-04-10T17:54:36.235Z WARN cmd/sentinel.go:276 no keeper info available {"db": "873c68a3", "keeper": "fdaa01850a7baa00150a2"}
2021-04-10T17:54:40.440Z f8fc2ce6 ams [info] Shutting down virtual machine
2021-04-10T17:54:44.914Z f8fc2ce6 ams [info] sentinel | 2021-04-10T17:54:44.906Z WARN cmd/sentinel.go:276 no keeper info available {"db": "873c68a3", "keeper": "fdaa01850a7baa00150a2"}
2021-04-10T17:54:48.760Z f8fc2ce6 ams [info] Sending signal SIGTERM to main child process w/ PID 509
2021-04-10T17:54:48.778Z f8fc2ce6 ams [info] sentinel | Interrupting...
2021-04-10T17:54:48.779Z f8fc2ce6 ams [info] proxy | Interrupting...
2021-04-10T17:54:48.780Z f8fc2ce6 ams [info] postgres_exporter | Interrupting...
2021-04-10T17:54:48.781Z f8fc2ce6 ams [info] keeper | Interrupting...
2021-04-10T17:54:48.783Z f8fc2ce6 ams [info] sentinel | 2021-04-10T17:54:48.761Z INFO cmd/sentinel.go:1816 stopping stolon sentinel
2021-04-10T17:54:48.801Z f8fc2ce6 ams [info] postgres_exporter | Exited
2021-04-10T17:54:48.892Z f8fc2ce6 ams [info] sentinel | Exited
2021-04-10T17:54:48.894Z f8fc2ce6 ams [info] proxy | Exited
2021-04-10T17:54:49.781Z f8fc2ce6 ams [info] Reaped child process with pid: 556 and signal: SIGHUP, core dumped? false
2021-04-10T17:54:49.782Z f8fc2ce6 ams [info] Reaped child process with pid: 611, exit code: 0
2021-04-10T17:54:49.785Z f8fc2ce6 ams [info] Reaped child process with pid: 1010 and signal: SIGPIPE, core dumped? false
2021-04-10T17:54:49.790Z f8fc2ce6 ams [info] keeper | Exited
2021-04-10T17:54:51.790Z f8fc2ce6 ams [info] Main child exited normally with code: 0
2021-04-10T17:54:51.792Z f8fc2ce6 ams [info] Reaped child process with pid: 553, exit code: 0
2021-04-10T17:54:51.793Z f8fc2ce6 ams [info] Starting clean up.
2021-04-10T17:54:51.809Z f8fc2ce6 ams [info] Umounting /dev/vdc from /data
2021-04-10T18:28:38.797Z b6d6bbf9 ams [info] Starting instance
2021-04-10T18:28:38.853Z b6d6bbf9 ams [info] Configuring virtual machine
2021-04-10T18:28:38.860Z b6d6bbf9 ams [info] Pulling container image
2021-04-10T18:28:40.032Z b6d6bbf9 ams [info] Unpacking image
2021-04-10T18:28:40.056Z b6d6bbf9 ams [info] Preparing kernel init
2021-04-10T18:28:40.274Z b6d6bbf9 ams [info] Setting up volume 'pg_data'
2021-04-10T18:28:40.701Z b6d6bbf9 ams [info] Configuring firecracker
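In case it helps whoever looks at this, here is a minimal sketch of how the log above could be summarized per instance: how often the sentinel reported "no keeper info available", and how many times the VM was shut down or started. It assumes each line has the layout shown above (timestamp, instance ID, region, level in brackets, message). The names `LINE_RE` and `summarize` are made up for illustration and are not part of flyctl or Stolon.

```python
import re
from collections import Counter

# Assumed layout: "<timestamp> <instance> <region> [<level>] <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<instance>\S+)\s+(?P<region>\S+)\s+"
    r"\[(?P<level>\w+)\]\s*(?P<msg>.*)$"
)


def summarize(lines):
    """Count selected events per instance ID; lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        inst, msg = m["instance"], m["msg"]
        if "no keeper info available" in msg:
            counts[(inst, "no_keeper_info")] += 1
        elif "Shutting down virtual machine" in msg:
            counts[(inst, "vm_shutdown")] += 1
        elif "Starting instance" in msg:
            counts[(inst, "instance_start")] += 1
    return counts


if __name__ == "__main__":
    # Three lines copied from the log above, used only as a demo input.
    demo = [
        '2021-04-10T17:52:13.958Z f8fc2ce6 ams [info] sentinel | '
        '2021-04-10T17:52:13.953Z WARN cmd/sentinel.go:276 no keeper info '
        'available {"db": "873c68a3", "keeper": "fdaa01850a7baa00150a2"}',
        "2021-04-10T17:52:39.281Z f8fc2ce6 ams [info] Shutting down virtual machine",
        "2021-04-10T18:28:38.797Z b6d6bbf9 ams [info] Starting instance",
    ]
    for (instance, event), n in sorted(summarize(demo).items()):
        print(instance, event, n)
```

Feeding it the full log instead of the demo list (for example, the lines of a saved log file) gives a quick per-instance picture of the restart loop without reading every line.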
Please advise…