How to scale volumes of postgres leader?

I’m trying to scale the volumes associated with a Fly-managed postgres. I followed the instructions in this community post to scale up the replicas: PostgreSQL HA - #2 by kurt

This seems to work with replicas. But when I attempt the same steps with the leader (create volume, scale up, remove leader volume, scale down), the leader VM isn’t removed when scaling down. Instead, the newly created replica is removed, which leaves the leader with no volume associated with it. I was able to get it working by restarting postgres, but that resulted in downtime.

What is the intended way to scale the volumes associated with the postgres leader?


There are some race conditions in volume removal and scaling that we need to work through. But it will probably work better to do this:

  1. create new volume
  2. delete old replica volume
  3. wait a few minutes
  4. fly vm stop <replica id>

Once the new replica shows healthy, you can repeat the same process for the leader. fly vm stop <leader id> will gracefully step the leader down with minimal downtime.
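As a rough sketch of the sequence above, the helper below only builds the command strings for one VM so they can be reviewed before anything is run. The function name, its parameters, the `--size`, `--region`, and `--app` flags, and the 300-second wait are all assumptions for illustration; check them against `fly volumes create --help` and `fly vm stop --help` for your flyctl version.

```python
def volume_swap_plan(app, region, volume_name, size_gb,
                     old_volume_id, vm_id, wait_seconds=300):
    """Return the shell commands for swapping one VM onto a larger volume.

    Hypothetical helper: it does not run anything, it only lists the steps
    in order. Flags and the default wait time are assumptions.
    """
    return [
        # 1. create the new, larger volume
        f"fly volumes create {volume_name} --size {size_gb} "
        f"--region {region} --app {app}",
        # 2. delete the old volume attached to the VM being replaced
        f"fly volumes delete {old_volume_id}",
        # 3. wait a few minutes (300 seconds is an arbitrary choice)
        f"sleep {wait_seconds}",
        # 4. stop the VM so it is rescheduled onto the new volume
        f"fly vm stop {vm_id} --app {app}",
    ]


if __name__ == "__main__":
    # All identifiers below are placeholders, not real resources.
    for command in volume_swap_plan(
        app="my-db", region="ord", volume_name="pg_data", size_gb=40,
        old_volume_id="vol_old", vm_id="abc123",
    ):
        print(command)
```

Run it once for the replica, confirm the new replica shows healthy, then run it again with the leader's volume and VM ids.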

Thanks @kurt!
But I’m still struggling with the warnings raised in PostgreSQL HA

I’d appreciate some help here :bowing_man: :hugs:

@kurt That workflow appears to work better, although the downtime doesn’t seem great when it happens. My app sees ~1 minute of downtime during the transition. Not sure if that’s expected or not. It could also be something to do with the configuration on my end?

Yeah that’s poor. I just realized it’s probably the result of an IP address in <db-name>.internal going away. Some drivers accept a connection string with multiple addresses and will happily reconnect, but we haven’t been able to find a consistent way to expose that yet.

What app runtime / postgres driver are you using? We can probably get you a config that helps with this.

I’ve been testing with Prisma, which takes connection URLs: PostgreSQL database connector (Reference) | Prisma Docs

Hmm, that might work with the multi-host URL format, assuming they really did implement libpq. Give us some time to try and improve this. If you find yourself needing to resize again soon, we can do it for you.
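For reference, here is a minimal sketch of what a libpq-style multi-host connection URL looks like. libpq accepts a comma-separated list of `host:port` pairs, with IPv6 addresses in square brackets, and a `target_session_attrs=read-write` parameter that makes the client skip hosts that are not accepting writes. The function name, credentials, and addresses below are made up for illustration, and whether Prisma accepts this format is still unverified.

```python
from urllib.parse import quote


def multi_host_url(user, password, hosts, dbname, port=5432,
                   target_session_attrs="read-write"):
    """Build a libpq-style connection URL listing several hosts.

    Hypothetical helper for illustration; `hosts` is a list of hostnames
    or IP addresses, all assumed to listen on the same port.
    """
    if not hosts:
        raise ValueError("at least one host is required")

    def host_part(host):
        # IPv6 literals must be wrapped in brackets inside a URL.
        return f"[{host}]:{port}" if ":" in host else f"{host}:{port}"

    host_list = ",".join(host_part(h) for h in hosts)
    # Percent-encode credentials and database name so characters such as
    # "@" or "/" cannot break the URL structure.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host_list}/{quote(dbname, safe='')}"
        f"?target_session_attrs={target_session_attrs}"
    )


if __name__ == "__main__":
    # Placeholder addresses standing in for the individual database VMs.
    print(multi_host_url("app", "s3cret", ["fdaa:0:1::2", "fdaa:0:1::3"], "appdb"))
```

With a URL like this, a driver that follows libpq's behavior tries each host in order and connects to the first one that accepts read-write sessions, so a single address going away does not have to break reconnects.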