Unable to restore from snapshot - just an FYI

Hi,

I just wanted to leave a note about this in general: restoring from snapshots doesn’t work in some cases.

Background: I attempted to scale my DB from a single node to two nodes on your smallest VM size, in two regions (first CDG, then FRA). This seemed to work fine, but then I tried to scale the machines up to the second-smallest shared VM size. The leader machine got stuck in some sort of boot loop, flipping between started and failed. I eventually deleted the leader machine, hoping the second one would take over. It didn’t, so I tried to restore from various snapshots, taken both before and after my scaling attempts, and the restore failed every time.

In the end, I just deleted the entire app and manually restored from a local pg_dump I had created before trying to scale up the DB instances.

Snapshot Context: The snapshots are from a Postgres VM created on Apr 6, 2023 14:08 UTC. I tried various snapshots (from both before and after I attempted to scale), dated Aug 29 to Sep 2, 2023.

The primary logs of interest are linked, and the error output from the machine that was newly spun up by the snapshot restore attempt is below.

2023-09-04T13:53:19Z app[xxx] fra [info]postgres | 2023-09-04 13:53:19.943 UTC [344] LOG:  database system was shut down at 2023-09-04 13:53:19 UTC
2023-09-04T13:53:19Z app[xxx] fra [info]postgres | 2023-09-04 13:53:19.947 UTC [325] LOG:  database system is ready to accept connections
2023-09-04T13:53:19Z app[xxx] fra [info]proxy    | [NOTICE]   (306) : New worker (348) forked
2023-09-04T13:53:19Z app[xxx] fra [info]proxy    | [NOTICE]   (306) : Loading success.
2023-09-04T13:53:19Z app[xxx] fra [info]repmgrd  | [2023-09-04 13:53:19] [NOTICE] repmgrd (repmgrd 5.3.3) starting up
2023-09-04T13:53:19Z app[xxx] fra [info]repmgrd  | [2023-09-04 13:53:19] [INFO] connecting to database "host=fdaa:1:da0b:a7b:15b:7f3e:bdce:2 port=5433 user=repmgr dbname=repmgr connect_timeout=5"
2023-09-04T13:53:19Z app[xxx] fra [info]postgres | 2023-09-04 13:53:19.982 UTC [350] FATAL:  database "repmgr" does not exist
2023-09-04T13:53:19Z app[xxx] fra [info]proxy    | [WARNING]  (348) : bk_db/pg1 changed its IP from (none) to fdaa:1:da0b:a7b:15b:7f3e:bdce:2 by flydns/dns1.
2023-09-04T13:53:19Z app[xxx] fra [info]proxy    | [WARNING]  (348) : Server bk_db/pg1 ('fra.falling-surf-3933.internal') is UP/READY (resolves again).
2023-09-04T13:53:19Z app[xxx] fra [info]proxy    | [WARNING]  (348) : Server bk_db/pg1 administratively READY thanks to valid DNS answer.
2023-09-04T13:53:19Z app[xxx] fra [info]repmgrd  | [2023-09-04 13:53:19] [ERROR] connection to database failed
2023-09-04T13:53:19Z app[xxx] fra [info]repmgrd  | [2023-09-04 13:53:19] [DETAIL]
2023-09-04T13:53:19Z app[xxx] fra [info]repmgrd  | connection to server at "fdaa:1:da0b:a7b:15b:7f3e:bdce:2", port 5433 failed: FATAL:  database "repmgr" does not exist
2023-09-04T13:53:19Z app[xxx] fra [info]repmgrd  |
2023-09-04T13:53:19Z app[xxx] fra [info]repmgrd  | [2023-09-04 13:53:19] [DETAIL] attempted to connect using:
2023-09-04T13:53:19Z app[xxx] fra [info]repmgrd  |   user=repmgr connect_timeout=5 dbname=repmgr host=fdaa:1:da0b:a7b:15b:7f3e:bdce:2 port=5433 fallback_application_name=repmgr options=-csearch_path=
2023-09-04T13:53:19Z app[xxx] fra [info]repmgrd  | exit status 6
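
In case it helps anyone triaging a similar failure: the lines that matter in the output above are the FATAL and ERROR ones (repmgrd cannot connect because the "repmgr" database does not exist). Below is a rough Python sketch for pulling those lines out of a longer log dump. The regular expression simply assumes the line prefix looks like the lines pasted above, and `find_failures` is a hypothetical helper name of my own, not part of any Fly.io or repmgr tooling.

```python
import re

# Assumed line shape, taken from the log excerpt above:
#   <timestamp> app[<id>] <region> [<level>]<process> | <message>
LINE_RE = re.compile(
    r"^(?P<ts>\S+) app\[(?P<app>[^\]]*)\] (?P<region>\S+) "
    r"\[(?P<level>\w+)\](?P<proc>\w+)\s*\|\s?(?P<msg>.*)$"
)


def find_failures(log_text):
    """Return (process, message) pairs for lines mentioning FATAL or ERROR."""
    failures = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # not in the assumed format; skip it
        message = match.group("msg").strip()
        if re.search(r"\b(FATAL|ERROR)\b", message):
            failures.append((match.group("proc"), message))
    return failures


if __name__ == "__main__":
    import sys

    # Usage: python find_failures.py < saved_logs.txt
    for proc, message in find_failures(sys.stdin.read()):
        print(f"{proc}: {message}")
```

Run against the excerpt above, this should surface the postgres FATAL line plus the two repmgrd lines reporting the failed connection, which is a quicker way to spot the missing "repmgr" database than scrolling through the proxy notices.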

P.S. It would be great if this forum allowed attaching text files.
