Hi —
My fly dashboard says:
Service Interruption, 7 hours ago
We are performing emergency maintenance on a host some of your apps instances are running on. Apps may be unavailable until the maintenance is completed.
The machine and its volumes are shown as unavailable in the dash; I can’t run any commands against them or attach new machines. So, presumably, I should recreate the database from a snapshot, then detach and attach:
fly postgres create --snapshot-id vs_XX
[…]
Restoring 1 of 1 machines with image flyio/postgres-flex:15.6@sha256:XX
Waiting for machine to start…
Machine XX is created
==> Monitoring health checks
Waiting for XX to become healthy (started, 1/3)
Error: context deadline exceeded
I’ve tried a bunch of times: different locations, different machine sizes, different snapshots. None become healthy; they all time out. So let’s look at the pg logs to see why it’s not healthy:
FATAL: database “repmgr” does not exist
WARNING: database “postgres” has a collation version mismatch
DETAIL: The database was created using collation version 2.31, but the operating system provides version 2.36.
HINT: Rebuild all objects in this database that use the default collation and run ALTER DATABASE postgres REFRESH COLLATION VERSION, or build PostgreSQL with the right library version.
Registering primary
ERROR: template database “template1” has a collation version mismatch
DETAIL: The template database was created using collation version 2.31, but the operating system provides version 2.36.
That sounds like a mismatch between the OS and the pg build, neither of which I believe I can control. But could I work around it by connecting to the unhealthy database and running the ALTER DATABASE command from the hint?
fly pg connect -a XX
Error: no active leader found
Any bright ideas appreciated. Looks like the same as this unresolved post.
Edit: solved it, leaving this here in case it’s helpful.
- Look at the various versions of postgres-flex, shown here.
- Try restoring against different versions, i.e.
fly pg create --snapshot-id vs_XX --image-ref flyio/postgres-flex:15.1@sha256:4af8e07ae57ff7d31228b32ceebd34bf7508c131bc86f67c2025c669b56eff70
- Eventually get it right.
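For anyone repeating the steps above, here is a rough dry-run sketch that prints one restore command per candidate image so you can work through them in order. It only prints; it does not call fly. The function name `print_restore_attempts` is made up for illustration, `vs_XX` is the same placeholder as above, and the tags other than 15.1 are assumptions to check against the published image list. I pinned the image by digest in my own command, so using bare tags here is also an assumption.

```shell
#!/bin/sh
# Dry-run helper (hypothetical name): prints one restore command per image tag.
# It does not execute anything against Fly; copy the lines you want to run.
print_restore_attempts() {
  snapshot_id="$1"   # e.g. vs_XX (placeholder, substitute your snapshot ID)
  shift
  for tag in "$@"; do
    # --snapshot-id and --image-ref are the flags used in the post above.
    printf 'fly pg create --snapshot-id %s --image-ref flyio/postgres-flex:%s\n' \
      "$snapshot_id" "$tag"
  done
}

# Example tags only: 15.1 is the one that worked for me; the others are guesses.
print_restore_attempts vs_XX 15.3 15.2 15.1
```

Run each printed command by hand and stop at the first one whose health checks pass. Presumably the one that works is the image whose OS collation version matches the one the snapshot was created with (2.31 in my logs).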