Very cool, and it will add these replicas based on regions there are volumes created?
Also, was I correct in my findings regarding pricing on the PG apps? Being that there is at least one replica, are these billed for both the leader and the replica?
Is there a concept of connection limits on these postgres instances? Do you have a best practice for when to scale and how (vertically vs. horizontally). Also wondering if there is a best practice for handling paired app servers scaling up and using more connections/memory/cpu.
Yes, volumes “restrict” where instances can run, so it’ll always launch them where you have volumes created.
There’s no special pricing for PG apps, they’re just normal Fly apps. So you do pay for both VMs (or all three if you add another replica).
Postgres has its own connection limits, we’re using the default of 100 per instance.
As a really rough heuristic, I’d shoot for VM sizes with 1GB of memory per 10-20GB of Postgres data. This varies wildly depending on workload but it seems to be a reasonable guideline for most full stack apps. I wouldn’t add replicas to scale most apps, but I would add replicas to run in other regions.
Awesome, this is very helpful.
We will plan on scaling PG to regions as we spin up app vms in regions that are furthest from our initial PG region (ORD)
Not sure if it was mentioned before but timescale 2.x would be amazing.
I saw that persistent volumes are not durable. Does this mean that the failure of a single disk will require a failover (including downtine, and data loss if you’re using async replication)? That is concerning.
Also “Read More” on https://fly.io/ links to https://fly.io/docs/reference/postgres/, which is a 404.
Yes and no. Disk durability is complex. For now, we want to set conservative expectations while we improve the volume infrastructure.
The volumes should be at least as reliable as an array of magnetic spinny disks.
You can also configure your postgres to use synchronous replication, which I think is worthwhile even with maximally redundant disk arrays. Back in our Compose days we’d see all kinds of unpredictable single node failures, drive corruption, etc. I do not trust single node databases for anything.
(Link is fixed, btw).
I see that for each app that wants to connect to the database a pair of credentials is generated. And I’m assuming that you are using the vault for that. Are they being rotated behind the scenes? And have considered making the time-to-live for the secrets user configurable? Perhaps as an option with a sane default the first time we create the database.
We store the per app connection URL in vault as a secret (just like
flyctl secrets set). We don’t rotate those credentials automatically, though. This is partially because we have been burned in the past when connection strings change without us knowing.
We could create a command to let you rotate credentials, though.
+1 for timescale. I recall that being mentioned before. That’s a postgresql extension/plugin.
Also, noticed a tiny typo on https://fly.io/docs/reference/postgres/:
“To configure any App to be anle to use the Postgres App”
What I had in mind was like a new fly.toml entry to configure that. But I guess a command would be a great starting point to be honest. And one could automate that using something like Github Actions’s scheduled events + the fly github action. I think I would love that.
Could someone from the team explain the use case / inner workings a bit more for Direct Connections on port 5433?
You can use 5433 to get a read-only connection to a replica. This is particularly neat when you have replicas + app instances running in regions outside the primary. @kurt made an app with global read replicas recently and might have more thoughts.
We’re also using 5433 to export pg for metrics to prometheus.
I forgot to post an update on this last week, but we’re now exporting metrics from postgres. You can see them on the Metrics tab in the UI as well as query from grafana.
Here’s how it looks in the UI
And here’s the available metrics (note these might change a bit):
Is the read only accessible from apps in regions where there is no PG replica or leader?
What exactly is 5433 connecting to within app VMs?
Yes. If you use the
pg-app.internal hostname over the private network you’ll be connected to the closest available vm. If there’s an instance in the same region as the connecting app it’ll use that one.
We’re using stolon to manage postgres across a cluster. It provides a number of things, including a “keeper” that controls the postgres process and a “proxy” that always routes connections to the leader. 5433 is the port the keeper tells postgres to listen on, connecting there goes straight to postgres, though it might be the leader or the replica. Since most connections need writes, we made the proxy use the default 5432 port so clients behave as expected.
This is all necessary so clients can connect to the leader without knowing which node it’s on, which is critical for HA. If the leader fails, the proxy drops all connections until a new leader is elected. Once it’s ready, new connections go to the new leader without rebooting clients or changing their config.
If you’ve ever received a late night email from Heroku saying your DB was replaced, you know why this is awesome.
Thanks so much for the detailed explanation. Sounds like we should just use 5432 in all our app servers.
Im still intrigued to know when it’s the “correct” time to use 5433 vs 5432
If I want to move the region of my database cluster could I use
fly regions set on that app or do I need to do something more complicated?
It’s a little more complicated than that, but doable! You need to add replicas in the new region:
- Create two new volumes in the region you want to move to:
flyctl volumes create pg_data --size <gb> --region <new_region>
flyctl volumes create pg_data --size <gb> --region <new_region>
- Scale to 4 instances:
flyctl scale count 4
flyctl regions add <new_region>
If you run
flyctl status now, you should see 4 instances running (or two new ones booting up).
The next step is to change the
primary_region setting. For this, you’ll actually need to pull the configuration down and edit it.
- Create a new directly,
cd into it
flyctl config save -a <postgres-cluster-name>
This gives you a
fly.toml, you will need to make a couple of edits:
fly.toml and change the
PRIMARY_REGION environment variable
- Make sure the
experimental block looks like this:
private_network = true
enable_consul = true
auto_rollback = false
metrics_port = 9187
metrics_path = "/metrics"```
That last step is a pain, sorry about that.
Once you have the
fly.toml setup, run
flyctl deploy -i flyio/postgres-ha.
This is something that we hope to make simpler!
This is great! If I want more space after the database is created can I just scale the volume - and if so, how?
Adding space is a similar process as adding regions. Just add new volumes of the size you want,
flyctl scale count 4, delete the old volumes (you can do this while VMs are running), then
flyctl scale count 2.