Is Postgres on Fly ready to host production databases?

Any updates on when Postgres on Fly will be out of beta / production-ready?

We’re prepping to migrate our core systems away from AWS (Fargate, Aurora Postgres, and a couple of smaller services) in a few weeks. Fly is my top pick, but I’m a bit concerned about the database side of things.

Our last remaining to-do is self-service backup restores. We’re very close!

We are happy with reliability; I think our Postgres clusters are a good place to host production data right now. We just think people need to be able to restore clusters without talking to us before we can remove the “beta” label.

Sorry if I’m off topic, but looking at the Postgres HA example code, wouldn’t it be more efficient to skip the write attempt entirely, rather than attempting it and catching the PG read-only error? For example:

if FLY_REGION != PRIMARY_REGION {
  [replay the request in PRIMARY_REGION]
  return
}

[proceed with local write]

As written, this code would replay every request in the primary region, reads included.

However, detecting a potential write instead of relying on exceptions is smart. In the official Fly Ruby gem, POST requests - and requests that happen immediately after a write - are replayed before even hitting the application layer.

See https://github.com/superfly/fly-ruby/blob/main/lib/fly-ruby/regional_database.rb#L69-L75
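To make the idea concrete, here is a minimal Python sketch of that kind of check. It is not the gem's actual code. The `FLY_REGION` and `PRIMARY_REGION` environment variables and the `fly-replay` response header come from Fly; the function names, the set of write methods, the 5-second window, and the 409 status are assumptions made for illustration.

```python
import os
import time

# Assumption: treat these HTTP methods as potential writes.
WRITE_METHODS = {"POST", "PUT", "PATCH", "DELETE"}

# Hypothetical window during which reads that follow a write are also
# replayed, so a client does not read stale data from a replica.
REPLAY_WINDOW_SECONDS = 5.0


def replay_target(method, last_write_at=None, now=None, env=None):
    """Return the region to replay the request in, or None to handle it locally."""
    env = os.environ if env is None else env
    now = time.time() if now is None else now
    current = env.get("FLY_REGION")
    primary = env.get("PRIMARY_REGION")

    # Already in the primary region, or not running on Fly: handle locally.
    if not current or not primary or current == primary:
        return None

    # Potential write: replay before touching the database at all.
    if method.upper() in WRITE_METHODS:
        return primary

    # Read shortly after a write by the same client: replay as well.
    if last_write_at is not None and now - last_write_at < REPLAY_WINDOW_SECONDS:
        return primary

    return None


def replay_response(region):
    """Build a (status, headers, body) tuple asking Fly to replay the request."""
    # Assumption: the status code is an arbitrary choice in this sketch.
    return 409, {"fly-replay": f"region={region}"}, b""
```

A middleware would call `replay_target(request_method, last_write_at)` first and return `replay_response(region)` whenever it gets a region back. How `last_write_at` is tracked per client (a cookie, for instance) is left out of the sketch.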

Thanks Joshua,

I should have been more specific about the code. It is in a function that is explicitly about to do a write.

Right - same principle there!

Is there a status update on Postgres database backup?

@user16 We take a snapshot of each volume every 24 hours. Here’s some documentation that covers how to view your snapshots and how to perform a restore: Multi-region PostgreSQL.

Adding on to the theme of the production readiness of Fly Postgres: is it possible to back up the snapshots to another provider, such as an AWS S3 bucket, in addition to the snapshots that Fly provides? Maybe I am paranoid, but it just feels sketchy having my only database backups on one provider.

I’d strongly recommend wal-g (wal-g/wal-g on GitHub, “Archival and Restoration for Postgres”). You could run it in a Docker container on Fly, point it at your DB, and it’ll back up your WAL to S3 (and, I think, any other cloud provider of your choice). This is a streaming backup as well, so it basically runs 24/7.

There are also simple commands to restore DBs from the WAL backups in your bucket, as well as manual snapshot and restore (you could schedule them with cron too).

And because this is the WAL, it’s the best right-up-to-the-edge-of-the-crash backup system I know of.

This is very interesting! We are an Elixir custom-app development agency and we are seriously considering adopting the Fly.io platform for our services and for our customers.

Currently, a continuous backup and point-in-time recovery strategy for the Postgres offering is the only feature we really lack.

We are also considering using wal-g or pgbackrest but we don’t understand how we should do it: could you provide more details and maybe an example of use?

Shaun,

What might be the issue if I follow the guide that you linked to and no volumes are listed?

We have an operational database but fly volumes list <app-name> returns no volumes.

UPDATE: I can list volumes if I use the -a flag, so fly volumes list -a <app-name>. So perhaps the docs are just out of date?

However, it shows just one volume, created 1 month ago. So that doesn’t appear to be benefitting from a ‘daily snapshot’. Or is there something else that I, or the documentation, am missing?

Thanks

@adamwiggall Not sure how this was missed, but there’s a mount specification within the fly.toml file. I went ahead and removed it within the repo, you should try removing that locally as well.

A volume should not be required to run the migration.

@adamwiggall Thanks for catching that missing -a flag. I added it to the doc.

Shaun,

Thanks for the swift response. I’m totally lost on what you are asking me to do though?

My fly.toml doesn’t appear to have a ‘mount specification’, and when you say ‘I went ahead and removed it from the repo’ I’m not sure what you mean?

My intention here was to be able to see snapshot(s), perhaps have the ability to archive them to a 3rd party service, and to feel comforted that if something goes awry I can get back up and running. Your mention of running a migration has me puzzled too.

Sorry for not grasping what you mean.

@adamwiggall I’m super sorry about that! I was balancing a couple of conversations and it looks like I got my wires crossed. Feel free to ignore my previous response…

So as you discovered, you can list the volumes with:

fly volumes list --app <app-name>

Then you can take the volume id and run the following to get a list of snapshots:

fly volumes snapshots list <volume-id>

Let me know what you see.

Shaun,

Thank you very much, my friend. I do now see the backups, and that’s great.

I appreciate you coming back to me!

Adam

Maybe relevant for this topic, as @moissela already asked this a while ago.

A short comment on whether we should roll our own strategy or wait for something Fly is working on would be most helpful. Thanks in advance!

Bump! Is there an update on database backups?

I am also curious about this. I am currently evaluating Fly and am wondering about strategies for PITR with Fly’s Postgres clusters.