Point in Time Recovery for Postgres Flex using Barman

Strange… I’d suggest starting a new top-level topic for this, so others can find it more easily—and maybe chime in themselves.

(I’ll try to help if I can, as well.)

@mayailurus after a couple of days of debugging and working with the very helpful folks at fly, we discovered that our db cluster had somehow become corrupted. We forked our database and were then able to SSH between machines and successfully perform a barman recovery! Case closed.
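For anyone who finds this later, the recovery itself is the standard barman point-in-time restore. A rough sketch, where the server name, backup ID, target time, and destination path are placeholders rather than our exact values:

```
# On the barman machine: verify the server and pick a backup to restore from
barman check pg
barman list-backup pg

# Point-in-time recovery to a timestamp, restored onto the postgres machine over SSH
barman recover \
  --target-time "2024-03-01 12:00:00" \
  --remote-ssh-command "ssh postgres@<postgres-machine>" \
  pg <backup_id> /data/postgresql
```

Once the recover finishes, postgres replays WAL up to the target time on startup.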

I’m running into this now after upgrading from the pg-flex 15.3 image to 15.6. Unfortunately, forking isn’t an answer, since that would mean a lot of downtime, and the cluster is otherwise running fine. Yikes

This means that for a 30-day retention window your barman volume needs to be over 30× the size of your postgres volumes.

I think this would only be true if you were at capacity on your postgres volume. Although the barman volume needs to be at least as large as your postgres volume, each backup only seems to use the storage needed for the data actually being backed up (which could be quite small). That’s what it seems to do when I run a barman backup, but maybe I’m missing something!
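(If it helps, the per-backup footprint is visible directly from barman; the server name here is just a placeholder:)

```
# list backups with their reported data and WAL sizes
barman list-backup pg

# detailed breakdown for the most recent backup
barman show-backup pg latest
```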

That’s correct, but it still means that you need a relatively large volume to ensure you have healthy free space to keep a full retention window of snapshots and WALs.

For example, a database with a size (as reported by repmgr) of 778MB requires 21GB of storage to keep a 30-day retention window of daily snapshots + WALs. As the database grows, that storage requirement for backups grows even faster.
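For context, that retention window is the standard barman retention policy, configured along these lines in barman.conf (the server name and connection details are illustrative, and the daily snapshot is just a cron'd `barman backup`):

```
[pg]
description = "postgres-flex primary"
conninfo = host=<primary> user=barman dbname=postgres
streaming_conninfo = host=<primary> user=streaming_barman
backup_method = postgres
streaming_archiver = on
slot_name = barman

# keep enough base backups + WALs to recover to any point in the last 30 days
retention_policy = RECOVERY WINDOW OF 30 DAYS
minimum_redundancy = 1
```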

Having delta snapshots would dramatically reduce barman's storage requirements for most uses, but it is more complex to configure and perhaps harder to ship as part of postgres-flex.
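For what it's worth, barman can get part of the way there today with rsync-based backups and file-level deduplication: `reuse_backup = link` hard-links files that haven't changed since the previous backup instead of copying them. A sketch, assuming you're willing to switch the backup method (host and user are placeholders):

```
[pg]
backup_method = rsync
ssh_command = ssh postgres@<primary>

# hard-link unchanged files against the previous backup instead of re-copying them
reuse_backup = link
```

It's not a block-level delta, but for mostly static data it cuts the per-backup footprint considerably.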

The previous wal-g based solution worked great, since everything, both snapshots and WALs, was compressed and streamed to offsite storage (such as S3).
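For comparison, the wal-g setup was roughly this shape: WALs pushed from archive_command and base backups pushed on a schedule, everything compressed before it reaches S3. The bucket name, credentials, and paths below are placeholders:

```
# postgresql.conf
#   archive_mode = on
#   archive_command = 'wal-g wal-push %p'

# environment for wal-g
export WALG_S3_PREFIX="s3://my-backup-bucket/pg"
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export WALG_COMPRESSION_METHOD=lz4

# take a compressed base backup and ship it to S3 (typically from cron)
wal-g backup-push "$PGDATA"

# restore: fetch the latest base backup, then let postgres replay WALs
# via restore_command = 'wal-g wal-fetch %f %p'
wal-g backup-fetch /data/postgresql LATEST
```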
