Fly Postgres auto stop/start issues

Hey all,

I’m having some issues with my database deployed on Fly.io. I’m trying to set up auto start/stop on my application to save on costs during downtime. It’s a live streaming application, so it has quite heavy usage between long periods of inactivity, and auto stop/start would be perfect. I created the app with fly launch, and the database app was created using fly postgres create and attached to the main app. The app and database connection work fine when the database is already up.

Both my app and app-db stop correctly, and my app starts correctly too. However, when the application connects using the database connection string, only the replica machine in the primary region starts. This replica machine’s volume does not appear to have any of my tables, so the main application fails.

If I reduce the app-db to only one machine in the primary region, then auto start/stop works as expected.

I have two questions really:

  1. How do I ensure that my application connects to the correct machine, if that’s even a consideration I need to make?
  2. How do I ensure that my replica is up to date with my primary database volume, and that writes in my application update all the volumes?

Many thanks!
Tom.

Hi Tom. You’ve asked two good questions, and designing a solution that answers them is challenging! For instance, whenever the primary is started, any replicas should also be started. Otherwise, as you’ve pointed out, the replicas may get out of date.
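On the staleness point: one rough way to see how far behind a replica is, assuming you can open a connection to that specific node, is to ask Postgres when it last replayed a transaction. The sketch below is only an illustration, not something Fly Postgres provides. replica_lag_seconds is a hypothetical helper name, and conn is assumed to be any Python DB-API connection (for example one from psycopg2). pg_last_xact_replay_timestamp() is a standard Postgres function.

```python
from typing import Optional

# Seconds between "now" and the last transaction replayed on this node.
LAG_QUERY = (
    "SELECT EXTRACT(EPOCH FROM now() - pg_last_xact_replay_timestamp())"
)


def replica_lag_seconds(conn) -> Optional[float]:
    """Return seconds since the last replayed transaction, or None.

    `conn` is assumed to be a DB-API 2.0 connection to one Postgres node.
    None means the node has no replay timestamp, which is what a primary
    reports.
    """
    cur = conn.cursor()
    try:
        cur.execute(LAG_QUERY)
        row = cur.fetchone()
    finally:
        cur.close()
    value = row[0] if row else None
    return None if value is None else float(value)
```

Treat the number as a hint rather than a guarantee: when the primary is idle, nothing new is replayed, so the value keeps growing even if the replica is fully caught up.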

The short answer is that Fly Postgres’s replication/HA and the tools it’s built on (most notably repmgr) aren’t trying to solve these autostart/autostop-related problems. As far as I know, they assume a traditional environment where database nodes are running at all times. So while I certainly understand the appeal of autostarting/autostopping the database, I don’t expect a replicated setup like this to work well with it.
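If you want the app to at least notice when it has landed on the wrong node, a check like the following might help. It’s a minimal sketch under assumptions: is_primary and require_primary are hypothetical helper names, and conn is any Python DB-API connection (for example one from psycopg2). pg_is_in_recovery() is a standard Postgres function that returns true on a standby.

```python
def is_primary(conn) -> bool:
    """Return True if this node accepts writes (is not a standby).

    `conn` is assumed to be a DB-API 2.0 connection to one Postgres node.
    """
    cur = conn.cursor()
    try:
        # pg_is_in_recovery() is true on a standby and false on a primary.
        cur.execute("SELECT pg_is_in_recovery()")
        (in_recovery,) = cur.fetchone()
    finally:
        cur.close()
    return not in_recovery


def require_primary(conn) -> None:
    """Raise if the connection landed on a standby instead of the primary."""
    if not is_primary(conn):
        raise RuntimeError("connected to a read-only replica, not the primary")
```

This won’t start the primary for you. It only lets the app fail fast, or retry, instead of running queries against a replica.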

You may be interested in looking into some fully managed database providers. Some of them may offer autoscaling/scale-to-zero features that may (or may not) make them a better choice given your usage patterns.

(This is likely not what you want, but it’s worth noting that with Fly PG you can enable scale-to-zero for single-node development deployments without replication.)
