Early look: PostgreSQL on Fly. We want your opinions.

Very cool! And will it add these replicas based on the regions where volumes have been created?

Also, was I correct in my findings regarding pricing on the PG apps? Since there is at least one replica, are we billed for both the leader and the replica?

Is there a concept of connection limits on these postgres instances? Do you have a best practice for when to scale and how (vertically vs. horizontally)? Also wondering if there is a best practice for handling paired app servers scaling up and using more connections/memory/CPU.

Yes, volumes “restrict” where instances can run, so it’ll always launch them where you have volumes created.

There’s no special pricing for PG apps, they’re just normal Fly apps. So you do pay for both VMs (or all three if you add another replica).

Postgres has its own connection limits; we're using the default of 100 per instance.
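If you want to double-check that from a VM on your private network, a quick sketch (the connection string is a placeholder; use whatever credentials your app was given):

```sh
# The default mentioned above should come back as 100.
psql "postgres://postgres:<password>@my-pg-app.internal:5432/postgres" \
  -c "SHOW max_connections;"
```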

As a really rough heuristic, I’d shoot for VM sizes with 1GB of memory per 10-20GB of Postgres data. This varies wildly depending on workload but it seems to be a reasonable guideline for most full stack apps. I wouldn’t add replicas to scale most apps, but I would add replicas to run in other regions.
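For what it's worth, here's roughly what each direction looks like with flyctl; the app name, VM size, and region are placeholders, and flyctl platform vm-sizes lists what's actually available:

```sh
# Vertical: move the Postgres VMs to a bigger size / more memory.
flyctl scale vm dedicated-cpu-1x -a my-pg-app

# Horizontal: add a replica in another region by giving it a volume first.
flyctl volumes create pg_data --size 10 --region lhr -a my-pg-app
flyctl regions add lhr -a my-pg-app
flyctl scale count 3 -a my-pg-app
```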

Awesome, this is very helpful.

We plan on scaling PG to more regions as we spin up app VMs in regions furthest from our initial PG region (ORD) :+1:t2:

Not sure if it was mentioned before but timescale 2.x would be amazing.

I saw that persistent volumes are not durable. Does this mean that the failure of a single disk will require a failover (including downtime, and data loss if you’re using async replication)? That is concerning.

Also “Read More” on https://fly.io/ links to https://fly.io/docs/reference/postgres/, which is a 404.

Yes and no. Disk durability is complex. For now, we want to set conservative expectations while we improve the volume infrastructure.

The volumes should be at least as reliable as an array of magnetic spinny disks.

You can also configure your postgres to use synchronous replication, which I think is worthwhile even with maximally redundant disk arrays. Back in our Compose days we’d see all kinds of unpredictable single node failures, drive corruption, etc. I do not trust single node databases for anything. :slight_smile:
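For anyone unfamiliar with what that means at the Postgres level, here's a conceptual sketch; on Fly the cluster is managed by stolon, so treat this purely as illustration rather than something you'd run against your Fly database ($SUPERUSER_DATABASE_URL is just a placeholder for a superuser connection string):

```sh
# On a vanilla Postgres primary: require a standby to confirm each commit
# before it is acknowledged to the client.
psql "$SUPERUSER_DATABASE_URL" -c "ALTER SYSTEM SET synchronous_standby_names = '*';"
psql "$SUPERUSER_DATABASE_URL" -c "ALTER SYSTEM SET synchronous_commit = 'on';"
psql "$SUPERUSER_DATABASE_URL" -c "SELECT pg_reload_conf();"
```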

(Link is fixed, btw).

I see that for each app that wants to connect to the database a pair of credentials is generated. And I’m assuming that you are using the vault for that. Are they being rotated behind the scenes? And have you considered making the time-to-live for the secrets user-configurable? Perhaps as an option with a sane default the first time we create the database.

We store the per-app connection URL in vault as a secret (just like flyctl secrets set). We don’t rotate those credentials automatically, though. This is partially because we have been burned in the past when connection strings changed without us knowing.

We could create a command to let you rotate credentials, though. :thinking:
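Until then, a manual rotation is doable with the pieces that already exist; a rough sketch, where the role, database, app name, and $ADMIN_DATABASE_URL are all placeholders:

```sh
# 1. Change the app role's password on the cluster.
psql "$ADMIN_DATABASE_URL" \
  -c "ALTER ROLE my_app_user WITH PASSWORD 'a-new-strong-password';"

# 2. Update the consuming app's secret; setting a secret redeploys that app.
flyctl secrets set -a my-app \
  DATABASE_URL="postgres://my_app_user:a-new-strong-password@my-pg-app.internal:5432/my_app_db"
```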

+1 for timescale. I recall that being mentioned before. That’s a postgresql extension/plugin.

Also, noticed a tiny typo on https://fly.io/docs/reference/postgres/:

“To configure any App to be anle to use the Postgres App”

What I had in mind was something like a new fly.toml entry to configure that. But I guess a command would be a great starting point, to be honest. And one could automate it using something like GitHub Actions' scheduled events + the Fly GitHub action. I think I would love that.


Could someone from the team explain the use case / inner workings a bit more for Direct Connections on port 5433?

https://fly.io/docs/reference/postgres/#direct-connections

Thanks!

You can use 5433 to get a read-only connection to a replica. This is particularly neat when you have replicas + app instances running in regions outside the primary. @kurt made an app with global read replicas recently and might have more thoughts.
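Concretely, from a VM on the private network that looks something like this (role, password, and database names are placeholders):

```sh
# Port 5433 connects straight to the nearest Postgres; pg_is_in_recovery()
# returns true when you've landed on a replica.
psql "postgres://my_app_user:<password>@my-pg-app.internal:5433/my_app_db" \
  -c "SELECT pg_is_in_recovery();"
```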

We’re also using 5433 to export Postgres metrics to Prometheus.

I forgot to post an update on this last week, but we’re now exporting metrics from postgres. You can see them on the Metrics tab in the UI as well as query them from Grafana.

Here’s how it looks in the UI

And here are the available metrics (note these might change a bit; a couple of sample queries follow the list):

pg_stat_activity_count
pg_stat_activity_max_tx_duration
pg_stat_archiver_archived_count
pg_stat_archiver_failed_count
pg_stat_bgwriter_buffers_alloc
pg_stat_bgwriter_buffers_backend_fsync
pg_stat_bgwriter_buffers_backend
pg_stat_bgwriter_buffers_checkpoint
pg_stat_bgwriter_buffers_clean
pg_stat_bgwriter_checkpoint_sync_time
pg_stat_bgwriter_checkpoint_write_time
pg_stat_bgwriter_checkpoints_req
pg_stat_bgwriter_checkpoints_timed
pg_stat_bgwriter_maxwritten_clean
pg_stat_bgwriter_stats_reset
pg_stat_database_blk_read_time
pg_stat_database_blk_write_time
pg_stat_database_blks_hit
pg_stat_database_blks_read
pg_stat_database_conflicts_confl_bufferpin
pg_stat_database_conflicts_confl_deadlock
pg_stat_database_conflicts_confl_lock
pg_stat_database_conflicts_confl_snapshot
pg_stat_database_conflicts_confl_tablespace
pg_stat_database_conflicts
pg_stat_database_deadlocks
pg_stat_database_numbackends
pg_stat_database_stats_reset
pg_stat_database_tup_deleted
pg_stat_database_tup_fetched
pg_stat_database_tup_inserted
pg_stat_database_tup_returned
pg_stat_database_tup_updated
pg_stat_database_xact_commit
pg_stat_database_xact_rollback
pg_stat_replication_pg_current_wal_lsn_bytes
pg_stat_replication_pg_wal_lsn_diff
pg_stat_replication_reply_time
pg_replication_lag
pg_database_size_bytes
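As a starting point once they're in Grafana, simple queries against the names above work; how you'll want to filter or group depends on which labels the exporter attaches:

```
sum(pg_stat_activity_count)   # total backends (connections) across the cluster
max(pg_replication_lag)       # worst replication lag reported by any instance
```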

Is the read-only connection accessible from apps in regions where there is no PG replica or leader?

What exactly is 5433 connecting to within app VMs?

Yes. If you use the pg-app.internal hostname over the private network, you’ll be connected to the closest available VM. If there’s an instance in the same region as the connecting app, it’ll use that one.

We’re using stolon to manage postgres across a cluster. It provides a number of things, including a “keeper” that controls the postgres process and a “proxy” that always routes connections to the leader. 5433 is the port the keeper tells postgres to listen on; connecting there goes straight to postgres, though it might be the leader or a replica. Since most connections need writes, we made the proxy use the default 5432 port so clients behave as expected.

This is all necessary so clients can connect to the leader without knowing which node it’s on, which is critical for HA. If the leader fails, the proxy drops all connections until a new leader is elected. Once it’s ready, new connections go to the new leader without rebooting clients or changing their config.
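In practice that boils down to two connection strings (placeholders again for names and credentials):

```sh
# 5432: stolon's proxy; always routed to the current leader, safe for writes.
psql "postgres://my_app_user:<password>@my-pg-app.internal:5432/my_app_db"

# 5433: straight to the local Postgres, which may be a replica; treat as read-only.
psql "postgres://my_app_user:<password>@my-pg-app.internal:5433/my_app_db"
```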

If you’ve ever received a late night email from Heroku saying your DB was replaced, you know why this is awesome.


Thanks so much for the detailed explanation. Sounds like we should just use 5432 in all our app servers.

I'm still intrigued to know when it's the "correct" time to use 5433 vs. 5432 :slight_smile:

If I want to move the region of my database cluster could I use fly regions set on that app or do I need to do something more complicated?

It’s a little more complicated than that, but doable! You need to add replicas in the new region:

  1. Create two new volumes in the region you want to move to:
    1. flyctl volumes create pg_data --size <gb> --region <new_region>
    2. flyctl volumes create pg_data --size <gb> --region <new_region>
  2. Scale to 4 instances: flyctl scale count 4
  3. flyctl regions add <new_region>

If you run flyctl status now, you should see 4 instances running (or two new ones booting up).

The next step is to change the primary_region setting. For this, you’ll actually need to pull the configuration down and edit it.

  1. Create a new directory and cd into it
  2. Run flyctl config save -a <postgres-cluster-name>

This gives you a fly.toml; you will need to make a couple of edits:

  1. Edit fly.toml and change the PRIMARY_REGION environment variable (see the sketch after this list)
  2. Make sure the experimental block looks like this:
```toml
[experimental]
private_network = true
enable_consul   = true
auto_rollback   = false
metrics_port    = 9187
metrics_path    = "/metrics"
```
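For reference, the PRIMARY_REGION change from step 1 lives in the env block of the same fly.toml; a sketch with a placeholder region:

```toml
[env]
  PRIMARY_REGION = "lhr"
```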
    
    

That last step is a pain, sorry about that.

Once you have the fly.toml set up, run flyctl deploy -i flyio/postgres-ha.

This is something that we hope to make simpler!


This is great! If I want more space after the database is created can I just scale the volume - and if so, how?

Adding space is a similar process to adding regions. Just add new volumes of the size you want, flyctl scale count 4, delete the old volumes (you can do this while the VMs are running), then flyctl scale count 2.
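Spelled out with placeholder sizes, regions, and app name, that's roughly:

```sh
# 1. Add two larger volumes in your existing region.
flyctl volumes create pg_data --size 50 --region ord -a my-pg-app
flyctl volumes create pg_data --size 50 --region ord -a my-pg-app

# 2. Run four instances temporarily so the new volumes get a copy of the data.
flyctl scale count 4 -a my-pg-app

# 3. Once the new instances are caught up, find and delete the old, smaller volumes.
flyctl volumes list -a my-pg-app
flyctl volumes delete <old_volume_id> -a my-pg-app
flyctl volumes delete <old_volume_id> -a my-pg-app

# 4. Drop back down to two instances.
flyctl scale count 2 -a my-pg-app
```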
