Very cool, and it will add these replicas based on regions there are volumes created?
Also, was I correct in my findings regarding pricing on the PG apps? Being that there is at least one replica, are these billed for both the leader and the replica?
Is there a concept of connection limits on these postgres instances? Do you have a best practice for when to scale and how (vertically vs. horizontally). Also wondering if there is a best practice for handling paired app servers scaling up and using more connections/memory/cpu.
Yes, volumes “restrict” where instances can run, so it’ll always launch them where you have volumes created.
There’s no special pricing for PG apps, they’re just normal Fly apps. So you do pay for both VMs (or all three if you add another replica).
Postgres has its own connection limits, we’re using the default of 100 per instance.
As a really rough heuristic, I’d shoot for VM sizes with 1GB of memory per 10-20GB of Postgres data. This varies wildly depending on workload but it seems to be a reasonable guideline for most full stack apps. I wouldn’t add replicas to scale most apps, but I would add replicas to run in other regions.
I saw that persistent volumes are not durable. Does this mean that the failure of a single disk will require a failover (including downtine, and data loss if you’re using async replication)? That is concerning.
Yes and no. Disk durability is complex. For now, we want to set conservative expectations while we improve the volume infrastructure.
The volumes should be at least as reliable as an array of magnetic spinny disks.
You can also configure your postgres to use synchronous replication, which I think is worthwhile even with maximally redundant disk arrays. Back in our Compose days we’d see all kinds of unpredictable single node failures, drive corruption, etc. I do not trust single node databases for anything.
I see that for each app that wants to connect to the database a pair of credentials is generated. And I’m assuming that you are using the vault for that. Are they being rotated behind the scenes? And have considered making the time-to-live for the secrets user configurable? Perhaps as an option with a sane default the first time we create the database.
We store the per app connection URL in vault as a secret (just like flyctl secrets set). We don’t rotate those credentials automatically, though. This is partially because we have been burned in the past when connection strings change without us knowing.
We could create a command to let you rotate credentials, though.
What I had in mind was like a new fly.toml entry to configure that. But I guess a command would be a great starting point to be honest. And one could automate that using something like Github Actions’s scheduled events + the fly github action. I think I would love that.
You can use 5433 to get a read-only connection to a replica. This is particularly neat when you have replicas + app instances running in regions outside the primary. @kurt made an app with global read replicas recently and might have more thoughts.
We’re also using 5433 to export pg for metrics to prometheus.
I forgot to post an update on this last week, but we’re now exporting metrics from postgres. You can see them on the Metrics tab in the UI as well as query from grafana.
Yes. If you use the pg-app.internal hostname over the private network you’ll be connected to the closest available vm. If there’s an instance in the same region as the connecting app it’ll use that one.
We’re using stolon to manage postgres across a cluster. It provides a number of things, including a “keeper” that controls the postgres process and a “proxy” that always routes connections to the leader. 5433 is the port the keeper tells postgres to listen on, connecting there goes straight to postgres, though it might be the leader or the replica. Since most connections need writes, we made the proxy use the default 5432 port so clients behave as expected.
This is all necessary so clients can connect to the leader without knowing which node it’s on, which is critical for HA. If the leader fails, the proxy drops all connections until a new leader is elected. Once it’s ready, new connections go to the new leader without rebooting clients or changing their config.
If you’ve ever received a late night email from Heroku saying your DB was replaced, you know why this is awesome.
Adding space is a similar process as adding regions. Just add new volumes of the size you want, flyctl scale count 4, delete the old volumes (you can do this while VMs are running), then flyctl scale count 2.