For anyone interested: I wasn't able to track down what was wrong, but I did resolve the problem.
I initially tried restarting Postgres, but the command errored:
❯ flyctl pg restart --config fly/db.toml
Update available 0.0.463 -> 0.0.473.
Run "flyctl version update" to upgrade.
Error no active leader found
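Before going further it's worth checking what state the machines are actually in. A quick sketch (assuming the global --config flag is honoured here the same way as in the other commands):

❯ flyctl status --config fly/db.toml
❯ flyctl machine list --config fly/db.toml

status summarises the app and its health checks, and machine list shows each machine's ID and current state, which helps when the cluster reports it has no leader.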
I tried upgrading the image, hoping it would force a restart, but unfortunately that didn't help:
❯ flyctl image update --config fly/db.toml
Update available 0.0.463 -> 0.0.474.
Run "flyctl version update" to upgrade.
The following changes will be applied to all Postgres machines.
Machines not running the official Postgres image will be skipped.
... // 3 identical lines
},
"init": {},
- "image": "flyio/postgres:14.6",
+ "image": "registry-1.docker.io/flyio/postgres:14.6@sha256:9cfb3fafcc1b9bc2df7c901d2ae4a81e83ba224bfe79b11e4dc11bb1838db46e",
"metadata": {
"fly-managed-postgres": "true",
... // 46 identical lines
? Apply changes? Yes
Identifying cluster role(s)
Machine 73d8d3d6a72389: error
Postgres cluster has been successfully updated!
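The machine reporting an error while the update claims success isn't very reassuring. One way to dig into what the machine actually did (a sketch; I'm assuming flyctl logs accepts the same --config flag as the other commands) is to tail the app's logs while the update runs:

❯ flyctl logs --config fly/db.toml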
I then thought I'd try scaling down and back up, but the scale command errored:
❯ flyctl scale count 2 --config fly/db.toml
Update available 0.0.463 -> 0.0.474.
Run "flyctl version update" to upgrade.
Error it looks like your app is running on v2 of our platform, and does not support this legacy command: try running fly machine clone instead
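Scaling a v2 app apparently goes through fly machine clone, as the error suggests. If I had actually wanted a second machine, something like this would presumably do it (a sketch; assuming clone takes the existing machine's ID and the same --config flag):

❯ flyctl machine clone 73d8d3d6a72389 --config fly/db.toml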
The v2 platform doesn't support that legacy scale command, but it does have a per-machine restart command:
❯ flyctl machine restart 73d8d3d6a72389 --config fly/db.toml
Update available 0.0.463 -> 0.0.474.
Run "flyctl version update" to upgrade.
Restarting machine 73d8d3d6a72389
Waiting for 73d8d3d6a72389 to become healthy (started, 3/3)
Machine 73d8d3d6a72389 restarted successfully!
And we’re back to being healthy:
❯ flyctl checks list --config fly/db.toml
Update available 0.0.463 -> 0.0.474.
Run "flyctl version update" to upgrade.
Health Checks for solitary-sun-2613
NAME | STATUS | MACHINE | LAST UPDATED | OUTPUT
-------*---------*----------------*----------------------*--------------------------------------------------------------------------
pg | passing | 73d8d3d6a72389 | 54s ago | [✓] transactions: read/write (245.12µs)
| | | | [✓] connections: 13 used, 3 reserved, 300 max (5.43ms)
-------*---------*----------------*----------------------*--------------------------------------------------------------------------
role | passing | 73d8d3d6a72389 | 57s ago | leader
-------*---------*----------------*----------------------*--------------------------------------------------------------------------
vm | passing | 73d8d3d6a72389 | 2023-02-23T11:08:33Z | [✓] checkDisk: 827.39 MB (84.8%) free space on /data/ (60.61µs)
| | | | [✓] checkLoad: load averages: 0.05 0.16 0.31 (109.21µs)
| | | | [✓] memory: system spent 0s of the last 60s waiting on memory (37.74µs)
| | | | [✓] cpu: system spent 5.75s of the last 60s waiting on cpu (23.74µs)
| | | | [✓] io: system spent 60ms of the last 60s waiting on io (22.24µs)
-------*---------*----------------*----------------------*-------------------------------------------------------------------------
I still don't know why this fixed the problem, or what the problem was in the first place, but it is now resolved.