Awesome, this is very helpful.
We plan on scaling PG to more regions as we spin up app VMs in the regions farthest from our initial PG region (ORD).
Not sure if it was mentioned before but timescale 2.x would be amazing.
I saw that persistent volumes are not durable. Does this mean that the failure of a single disk will require a failover (including downtime, and data loss if you’re using async replication)? That is concerning.
Also “Read More” on https://fly.io/ links to https://fly.io/docs/reference/postgres/, which is a 404.
Yes and no. Disk durability is complex. For now, we want to set conservative expectations while we improve the volume infrastructure.
The volumes should be at least as reliable as an array of magnetic spinny disks.
You can also configure your postgres to use synchronous replication, which I think is worthwhile even with maximally redundant disk arrays. Back in our Compose days we’d see all kinds of unpredictable single node failures, drive corruption, etc. I do not trust single node databases for anything.
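If you do turn on synchronous replication, one quick way to sanity-check it from any Postgres client is the pg_stat_replication view. A minimal sketch; the app name and password are placeholders:

```
# Connect to the leader via the proxy port and inspect replica state.
# sync_state = "sync" means the standby is a synchronous replica;
# "async" means commits are acknowledged before the replica has the WAL.
psql postgres://postgres:<password>@<pg-app-name>.internal:5432 \
  -c "SELECT application_name, state, sync_state FROM pg_stat_replication;"
```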
(Link is fixed, btw).
I see that for each app that wants to connect to the database, a pair of credentials is generated. I’m assuming you are using Vault for that. Are they being rotated behind the scenes? And have you considered making the time-to-live for the secrets user-configurable? Perhaps as an option with a sane default the first time we create the database.
We store the per app connection URL in vault as a secret (just like flyctl secrets set). We don’t rotate those credentials automatically, though. This is partially because we have been burned in the past when connection strings change without us knowing.
We could create a command to let you rotate credentials, though.
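Until a built-in rotate command exists, a manual rotation could look roughly like this. It's only a sketch; the role name, database name, and the DATABASE_URL secret name are assumptions, so adjust to however your app actually reads its connection string:

```
# 1. Change the password for the app's database role (run against the leader).
psql postgres://postgres:<admin-password>@<pg-app-name>.internal:5432 \
  -c "ALTER ROLE my_app_user WITH PASSWORD 'new-strong-password';"

# 2. Point the consuming app at the new credentials; setting a secret
#    triggers a redeploy so new VMs pick up the new connection string.
flyctl secrets set -a <app-name> \
  DATABASE_URL="postgres://my_app_user:new-strong-password@<pg-app-name>.internal:5432/my_app_db"
```

There is a short window where running VMs still hold the old password, so rotate during a quiet period or keep both credentials valid briefly.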
+1 for timescale. I recall that being mentioned before. That’s a postgresql extension/plugin.
Also, noticed a tiny typo on https://fly.io/docs/reference/postgres/:
“To configure any App to be anle to use the Postgres App”
What I had in mind was a new fly.toml entry to configure that. But I guess a command would be a great starting point, to be honest. And one could automate it using something like GitHub Actions’ scheduled events + the fly GitHub action. I think I would love that.
Could someone from the team explain the use case / inner workings a bit more for Direct Connections on port 5433?
https://fly.io/docs/reference/postgres/#direct-connections
Thanks!
You can use 5433 to get a read-only connection to a replica. This is particularly neat when you have replicas + app instances running in regions outside the primary. @kurt made an app with global read replicas recently and might have more thoughts.
We’re also using 5433 to export pg metrics to prometheus.
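For instance, an app that wants cheap reads from a nearby replica can keep two connection strings around. A sketch; the app name, password, and database name are placeholders:

```
# 5432 goes through the proxy and always lands on the leader (read/write).
psql postgres://postgres:<password>@<pg-app-name>.internal:5432/my_db

# 5433 goes straight to the nearest postgres; outside the primary region
# that is typically a replica, so treat this connection as read-only.
psql postgres://postgres:<password>@<pg-app-name>.internal:5433/my_db
```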
I forgot to post an update on this last week, but we’re now exporting metrics from postgres. You can see them on the Metrics tab in the UI as well as query from grafana.
Here’s how it looks in the UI
And here are the available metrics (note these might change a bit):
pg_stat_activity_count
pg_stat_activity_max_tx_duration
pg_stat_archiver_archived_count
pg_stat_archiver_failed_count
pg_stat_bgwriter_buffers_alloc
pg_stat_bgwriter_buffers_backend_fsync
pg_stat_bgwriter_buffers_backend
pg_stat_bgwriter_buffers_checkpoint
pg_stat_bgwriter_buffers_clean
pg_stat_bgwriter_checkpoint_sync_time
pg_stat_bgwriter_checkpoint_write_time
pg_stat_bgwriter_checkpoints_req
pg_stat_bgwriter_checkpoints_timed
pg_stat_bgwriter_maxwritten_clean
pg_stat_bgwriter_stats_reset
pg_stat_database_blk_read_time
pg_stat_database_blk_write_time
pg_stat_database_blks_hit
pg_stat_database_blks_read
pg_stat_database_conflicts_confl_bufferpin
pg_stat_database_conflicts_confl_deadlock
pg_stat_database_conflicts_confl_lock
pg_stat_database_conflicts_confl_snapshot
pg_stat_database_conflicts_confl_tablespace
pg_stat_database_conflicts
pg_stat_database_deadlocks
pg_stat_database_numbackends
pg_stat_database_stats_reset
pg_stat_database_tup_deleted
pg_stat_database_tup_fetched
pg_stat_database_tup_inserted
pg_stat_database_tup_returned
pg_stat_database_tup_updated
pg_stat_database_xact_commit
pg_stat_database_xact_rollback
pg_stat_replication_pg_current_wal_lsn_bytes
pg_stat_replication_pg_wal_lsn_diff
pg_stat_replication_reply_time
pg_replication_lag
pg_database_size_bytes
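If you just want to eyeball the raw exporter output rather than wait on dashboards, something like this from a VM (or WireGuard peer) on the private network may work. It’s only a sketch: it assumes the exporter is reachable on the metrics port and path configured in the cluster’s fly.toml (9187 and /metrics, shown further down in this thread):

```
# Fetch the raw Prometheus-format metrics straight from the postgres exporter
# and pick out one family to spot-check.
curl -s "http://<pg-app-name>.internal:9187/metrics" | grep '^pg_stat_database_xact_commit'
```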
Is the read-only connection accessible from apps in regions where there is no PG replica or leader?
What exactly is 5433 connecting to within app VMs?
Yes. If you use the pg-app.internal hostname over the private network you’ll be connected to the closest available VM. If there’s an instance in the same region as the connecting app, it’ll use that one.
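If you’re curious what that hostname resolves to, you can poke at the internal DNS from any peer on the private network. A sketch; it assumes the region-scoped <region>.<app>.internal names and that your resolver is pointed at Fly’s internal DNS (e.g. via WireGuard):

```
# Addresses registered for the postgres app:
dig +short aaaa <pg-app-name>.internal

# Addresses for instances in a specific region (e.g. ord):
dig +short aaaa ord.<pg-app-name>.internal
```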
We’re using stolon to manage postgres across a cluster. It provides a number of things, including a “keeper” that controls the postgres process and a “proxy” that always routes connections to the leader. 5433 is the port the keeper tells postgres to listen on; connecting there goes straight to postgres, which might be the leader or a replica. Since most connections need writes, we made the proxy use the default 5432 port so clients behave as expected.
This is all necessary so clients can connect to the leader without knowing which node it’s on, which is critical for HA. If the leader fails, the proxy drops all connections until a new leader is elected. Once it’s ready, new connections go to the new leader without rebooting clients or changing their config.
If you’ve ever received a late night email from Heroku saying your DB was replaced, you know why this is awesome.
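A quick way to see which role you’ve actually landed on from either port is pg_is_in_recovery(). A sketch; the app name and password are placeholders:

```
# Through the proxy (5432) you should always be on the leader: expect "f".
psql postgres://postgres:<password>@<pg-app-name>.internal:5432 \
  -c "SELECT pg_is_in_recovery();"

# Direct (5433) you get the nearest postgres: "t" means you hit a replica.
psql postgres://postgres:<password>@<pg-app-name>.internal:5433 \
  -c "SELECT pg_is_in_recovery();"
```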
Thanks so much for the detailed explanation. Sounds like we should just use 5432 in all our app servers.
I’m still intrigued to know when it’s the “correct” time to use 5433 vs 5432.
If I want to move the region of my database cluster, could I use fly regions set on that app, or do I need to do something more complicated?
It’s a little more complicated than that, but doable! You need to add replicas in the new region (one new volume per new instance):
flyctl volumes create pg_data --size <gb> --region <new_region>
flyctl volumes create pg_data --size <gb> --region <new_region>
flyctl scale count 4
flyctl regions add <new_region>
If you run flyctl status now, you should see 4 instances running (or two new ones booting up).
The next step is to change the primary_region setting. For this, you’ll actually need to pull the configuration down and edit it.
Make a directory for the config, cd into it, and run flyctl config save -a <postgres-cluster-name>.
This gives you a fly.toml, and you will need to make a couple of edits:

- Open fly.toml and change the PRIMARY_REGION environment variable.
- Make sure the experimental block looks like this:

```
[experimental]
private_network = true
enable_consul = true
auto_rollback = false
metrics_port = 9187
metrics_path = "/metrics"
```
That last step is a pain, sorry about that.
Once you have the fly.toml set up, run flyctl deploy -i flyio/postgres-ha.
This is something that we hope to make simpler!
This is great! If I want more space after the database is created can I just scale the volume - and if so, how?
Adding space is a similar process to adding regions. Just add new volumes of the size you want, flyctl scale count 4, delete the old volumes (you can do this while VMs are running), then flyctl scale count 2.
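Spelled out, that resize-by-replacement dance could look like this for a two-node cluster. A sketch only; the sizes, region, and volume IDs are placeholders:

```
# Add two larger volumes alongside the existing ones.
flyctl volumes create pg_data --size <new_gb> --region <region>
flyctl volumes create pg_data --size <new_gb> --region <region>

# Scale up so new instances attach the new volumes and replicate the data.
flyctl scale count 4

# Find and delete the old, smaller volumes once the new replicas are caught up.
flyctl volumes list
flyctl volumes delete <old_volume_id>
flyctl volumes delete <old_volume_id>

# Scale back down to the original instance count.
flyctl scale count 2
```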
I’m new to Fly and having some issues with connections (I’m using Windows).
I’ve set up the default postgres instances but I’m having difficulty with the host address (while connected via WireGuard).
I finally was able to connect with the psql shell, but I had to use one of the app instance addresses to connect. APP_NAME.internal doesn’t resolve; it gives an ‘unknown host’ error. The example in the docs, “psql postgres://postgres:secret123@appname.internal:5432”, doesn’t work for me.
I’ve tested with the privatenet example on GitHub and app instances are correctly displayed.
Am I missing something?
Thanks
Steve
As an update, for some reason I can now connect through the .internal address. I’m not sure if it is due to time (just under an hour since inception of the app) or because I used SSH to connect to one of the instances. I’m quite confused.