Awesome, this is very helpful.
We plan on scaling PG to more regions as we spin up app VMs in the regions farthest from our initial PG region (ORD).
Not sure if it was mentioned before but timescale 2.x would be amazing.
I saw that persistent volumes are not durable. Does this mean that the failure of a single disk will require a failover (including downtime, and data loss if you’re using async replication)? That is concerning.
Also “Read More” on https://fly.io/ links to https://fly.io/docs/reference/postgres/, which is a 404.
Yes and no. Disk durability is complex. For now, we want to set conservative expectations while we improve the volume infrastructure.
The volumes should be at least as reliable as an array of magnetic spinny disks.
You can also configure your postgres to use synchronous replication, which I think is worthwhile even with maximally redundant disk arrays. Back in our Compose days we’d see all kinds of unpredictable single node failures, drive corruption, etc. I do not trust single node databases for anything.
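If you do turn on synchronous replication, one quick way to sanity-check it from any Postgres client is the pg_stat_replication view. A minimal sketch; the app name and password are placeholders:

```
# Connect to the leader via the proxy port and inspect replica state.
# sync_state = "sync" means the standby is a synchronous replica;
# "async" means commits are acknowledged before the replica has the WAL.
psql postgres://postgres:<password>@<pg-app-name>.internal:5432 \
  -c "SELECT application_name, state, sync_state FROM pg_stat_replication;"
```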
(Link is fixed, btw).
I see that for each app that wants to connect to the database, a pair of credentials is generated. I’m assuming you are using Vault for that. Are they being rotated behind the scenes? And have you considered making the time-to-live for the secrets user-configurable? Perhaps as an option with a sane default the first time we create the database.
We store the per app connection URL in vault as a secret (just like flyctl secrets set). We don’t rotate those credentials automatically, though. This is partially because we have been burned in the past when connection strings change without us knowing.
We could create a command to let you rotate credentials, though.
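Until a built-in rotate command exists, a manual rotation could look roughly like this. It's only a sketch; the role name, database name, and the DATABASE_URL secret name are assumptions, so adjust to however your app actually reads its connection string:

```
# 1. Change the password for the app's database role (run against the leader).
psql postgres://postgres:<admin-password>@<pg-app-name>.internal:5432 \
  -c "ALTER ROLE my_app_user WITH PASSWORD 'new-strong-password';"

# 2. Point the consuming app at the new credentials; setting a secret
#    triggers a redeploy so new VMs pick up the new connection string.
flyctl secrets set -a <app-name> \
  DATABASE_URL="postgres://my_app_user:new-strong-password@<pg-app-name>.internal:5432/my_app_db"
```

There is a short window where running VMs still hold the old password, so rotate during a quiet period or keep both credentials valid briefly.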
+1 for timescale. I recall that being mentioned before. That’s a postgresql extension/plugin.
Also, noticed a tiny typo on https://fly.io/docs/reference/postgres/:
“To configure any App to be anle to use the Postgres App”
What I had in mind was a new fly.toml entry to configure that. But I guess a command would be a great starting point, to be honest. And one could automate it using something like GitHub Actions’ scheduled events + the fly GitHub action. I think I would love that.
Could someone from the team explain the use case / inner workings a bit more for Direct Connections on port 5433?
https://fly.io/docs/reference/postgres/#direct-connections
Thanks!
You can use 5433 to get a read-only connection to a replica. This is particularly neat when you have replicas + app instances running in regions outside the primary. @kurt made an app with global read replicas recently and might have more thoughts.
We’re also using 5433 to export pg metrics to prometheus.
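For instance, an app that wants cheap reads from a nearby replica can keep two connection strings around. A sketch; the app name, password, and database name are placeholders:

```
# 5432 goes through the proxy and always lands on the leader (read/write).
psql postgres://postgres:<password>@<pg-app-name>.internal:5432/my_db

# 5433 goes straight to the nearest postgres; outside the primary region
# that is typically a replica, so treat this connection as read-only.
psql postgres://postgres:<password>@<pg-app-name>.internal:5433/my_db
```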
I forgot to post an update on this last week, but we’re now exporting metrics from postgres. You can see them on the Metrics tab in the UI as well as query from grafana.
Here’s how it looks in the UI
And here are the available metrics (note these might change a bit):
pg_stat_activity_count
pg_stat_activity_max_tx_duration
pg_stat_archiver_archived_count
pg_stat_archiver_failed_count
pg_stat_bgwriter_buffers_alloc
pg_stat_bgwriter_buffers_backend_fsync
pg_stat_bgwriter_buffers_backend
pg_stat_bgwriter_buffers_checkpoint
pg_stat_bgwriter_buffers_clean
pg_stat_bgwriter_checkpoint_sync_time
pg_stat_bgwriter_checkpoint_write_time
pg_stat_bgwriter_checkpoints_req
pg_stat_bgwriter_checkpoints_timed
pg_stat_bgwriter_maxwritten_clean
pg_stat_bgwriter_stats_reset
pg_stat_database_blk_read_time
pg_stat_database_blk_write_time
pg_stat_database_blks_hit
pg_stat_database_blks_read
pg_stat_database_conflicts_confl_bufferpin
pg_stat_database_conflicts_confl_deadlock
pg_stat_database_conflicts_confl_lock
pg_stat_database_conflicts_confl_snapshot
pg_stat_database_conflicts_confl_tablespace
pg_stat_database_conflicts
pg_stat_database_deadlocks
pg_stat_database_numbackends
pg_stat_database_stats_reset
pg_stat_database_tup_deleted
pg_stat_database_tup_fetched
pg_stat_database_tup_inserted
pg_stat_database_tup_returned
pg_stat_database_tup_updated
pg_stat_database_xact_commit
pg_stat_database_xact_rollback
pg_stat_replication_pg_current_wal_lsn_bytes
pg_stat_replication_pg_wal_lsn_diff
pg_stat_replication_reply_time
pg_replication_lag
pg_database_size_bytes
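If you just want to eyeball the raw exporter output rather than wait on dashboards, something like this from a VM (or WireGuard peer) on the private network may work. It’s only a sketch: it assumes the exporter is reachable on the metrics port and path configured in the cluster’s fly.toml (9187 and /metrics, shown further down in this thread):

```
# Fetch the raw Prometheus-format metrics straight from the postgres exporter
# and pick out one family to spot-check.
curl -s "http://<pg-app-name>.internal:9187/metrics" | grep '^pg_stat_database_xact_commit'
```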
Is the read-only connection accessible from apps in regions where there is no PG replica or leader?
What exactly is 5433 connecting to within app VMs?
Yes. If you use the pg-app.internal hostname over the private network you’ll be connected to the closest available VM. If there’s an instance in the same region as the connecting app, it’ll use that one.
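If you’re curious what that hostname resolves to, you can poke at the internal DNS from any peer on the private network. A sketch; it assumes the region-scoped <region>.<app>.internal names and that your resolver is pointed at Fly’s internal DNS (e.g. via WireGuard):

```
# Addresses registered for the postgres app:
dig +short aaaa <pg-app-name>.internal

# Addresses for instances in a specific region (e.g. ord):
dig +short aaaa ord.<pg-app-name>.internal
```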
We’re using stolon to manage postgres across a cluster. It provides a number of things, including a “keeper” that controls the postgres process and a “proxy” that always routes connections to the leader. 5433 is the port the keeper tells postgres to listen on; connecting there goes straight to postgres, which might be the leader or a replica. Since most connections need writes, we made the proxy use the default 5432 port so clients behave as expected.
This is all necessary so clients can connect to the leader without knowing which node it’s on, which is critical for HA. If the leader fails, the proxy drops all connections until a new leader is elected. Once it’s ready, new connections go to the new leader without rebooting clients or changing their config.
If you’ve ever received a late night email from Heroku saying your DB was replaced, you know why this is awesome.
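A quick way to see which role you’ve actually landed on from either port is pg_is_in_recovery(). A sketch; the app name and password are placeholders:

```
# Through the proxy (5432) you should always be on the leader: expect "f".
psql postgres://postgres:<password>@<pg-app-name>.internal:5432 \
  -c "SELECT pg_is_in_recovery();"

# Direct (5433) you get the nearest postgres: "t" means you hit a replica.
psql postgres://postgres:<password>@<pg-app-name>.internal:5433 \
  -c "SELECT pg_is_in_recovery();"
```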
Thanks so much for the detailed explanation. Sounds like we should just use 5432 in all our app servers.
I’m still intrigued to know when it’s the “correct” time to use 5433 vs 5432.
If I want to move the region of my database cluster, could I use fly regions set on that app, or do I need to do something more complicated?
It’s a little more complicated than that, but doable! You need to add replicas in the new region (one new volume per new instance):
flyctl volumes create pg_data --size <gb> --region <new_region>
flyctl volumes create pg_data --size <gb> --region <new_region>
flyctl scale count 4
flyctl regions add <new_region>
If you run flyctl status now, you should see 4 instances running (or two new ones booting up).
The next step is to change the primary_region setting. For this, you’ll actually need to pull the configuration down and edit it.
Make a directory for the config, cd into it, and run flyctl config save -a <postgres-cluster-name>.
This gives you a fly.toml, and you will need to make a couple of edits:

- Open fly.toml and change the PRIMARY_REGION environment variable.
- Make sure the experimental block looks like this:

```
[experimental]
private_network = true
enable_consul = true
auto_rollback = false
metrics_port = 9187
metrics_path = "/metrics"
```
That last step is a pain, sorry about that.
Once you have the fly.toml set up, run flyctl deploy -i flyio/postgres-ha.
This is something that we hope to make simpler!
This is great! If I want more space after the database is created can I just scale the volume - and if so, how?
Adding space is a similar process to adding regions. Just add new volumes of the size you want, flyctl scale count 4, delete the old volumes (you can do this while VMs are running), then flyctl scale count 2.
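Spelled out, that resize-by-replacement dance could look like this for a two-node cluster. A sketch only; the sizes, region, and volume IDs are placeholders:

```
# Add two larger volumes alongside the existing ones.
flyctl volumes create pg_data --size <new_gb> --region <region>
flyctl volumes create pg_data --size <new_gb> --region <region>

# Scale up so new instances attach the new volumes and replicate the data.
flyctl scale count 4

# Find and delete the old, smaller volumes once the new replicas are caught up.
flyctl volumes list
flyctl volumes delete <old_volume_id>
flyctl volumes delete <old_volume_id>

# Scale back down to the original instance count.
flyctl scale count 2
```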
I’m new to Fly and having some issues with connections (I’m using Windows).
I’ve set up the default postgres instances but I’m having difficulty with the host address (while connected via WireGuard).
I finally was able to connect with the psql shell, but I had to use one of the app instance addresses to connect. APP_NAME.internal doesn’t resolve; it gives an ‘unknown host’ error. The example in the docs, “psql postgres://postgres:secret123@appname.internal:5432”, doesn’t work for me.
I’ve tested with the privatenet example on GitHub and app instances are correctly displayed.
Am I missing something?
Thanks
Steve
As an update, for some reason I can now connect through the .internal address. I’m not sure if it is due to time (just under an hour since inception of the app) or because I used SSH to connect to one of the instances. I’m quite confused.