I have a db cluster, 2 VMs with 40GB volumes each. Needless to say, this is way too much. I want to downsize to 1GB each. I tried the instructions I found here, but the command fly scale count 2 -a <pgname> returns the error Error: this app has no complete releases. Run fly deploy to create one and rerun this command. fly deploy doesn’t fix the issue at all.
fly status -a <pgname> returns:
ID              STATE    ROLE     REGION  CHECKS              IMAGE                               CREATED               UPDATED
3d8d9265a0e058  started  replica  sea     3 total, 3 passing  flyio/postgres-flex:15.3 (v0.0.45)  2023-10-14T08:38:12Z  2023-11-01T02:20:12Z
17811ed3b29218  started  primary  sea     3 total, 3 passing  flyio/postgres-flex:15.3 (v0.0.45)  2023-10-14T08:37:49Z  2023-11-01T02:21:09Z
Are there any updated instructions on how to do this? I’m incurring costs that are uncalled for.
The way you scale up and down to create new units in the cluster has changed a bit with the Machines (V2) platform. Here are updated instructions. Please be aware I tested this on a freshly-created database with very little data. So please please do NOT skip step 1.
1. BACKUP YOUR DATA. Do not skip this step. There are snapshots in case things go wrong. Your backup will help if things go very wrong. You can use fly proxy 5432 -a <pgname> and then pg_dump to grab an SQL dump of your data.
2. Make sure you did not skip step 1.
3. Add a new 1GB volume to your Postgres app: fly volumes create pg_data --size 1 --region <your-pg-region> -a <pgname>. Make a note of the VOLUME_ID.
4. Run fly status -a <pgname> and note which instance is the primary and how many instances are running. Grab the ID (first column) of an existing machine.
5. Add one more instance as a clone of that machine, attaching it to the volume you just created: fly machine clone EXISTING_MACHINE_ID --attach-volume VOLUME_ID -a <pgname>. Make a note of the new machine's ID; this is SMALL_MACHINE_ID.
6. Clone that small machine two more times by running this twice: fly machines clone SMALL_MACHINE_ID -a <pgname>.
7. Wait a few minutes, then run fly status -a <pgname> to make sure the new instances are in replica state and happy/healthy. Confirm using fly volumes list and fly machines list that the new machines have the small volumes.
8. Stop the old replicas with large volumes. The easiest way is to run fly volumes list -a <pgname>, check the ATTACHED VM column for machines with the large volumes, ensure they are not the primary, and fly machine stop those. It’s OK to stop them almost at the same time. Once done, wait for the cluster status to be stable and for the fly logs to be free of errors.
9. Stop the old primary: fly machine stop PRIMARY_ID.
10. Wait a few minutes for the three remaining replicas to realize the primary is gone, chat among themselves and elect a new primary node.
11. Once your three small nodes are the only ones still running, and one of them has been designated the primary, it’s safe to fly machine destroy the old replicas and the old primary.
12. In fly volumes list, identify the old, now-unattached, large volumes and fly volumes destroy each of them.
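For reference, here is a rough sketch of the commands from the steps above wrapped in small shell helpers. It is not an official script, just one way to keep the order straight. The app name my-pg-app, the region default, the DRY_RUN switch and the helper names are all my own placeholders. By default it only prints the fly commands so you can read them before running anything. The IDs still have to be copied by hand from fly status and fly volumes list, and you still need to wait and check health between the steps.

```shell
#!/usr/bin/env bash
set -euo pipefail

APP="${APP:-my-pg-app}"    # placeholder for <pgname>
REGION="${REGION:-sea}"    # placeholder for <your-pg-region>
DRY_RUN="${DRY_RUN:-1}"    # 1 = only print the commands, 0 = actually run them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Create the new 1GB volume; note the VOLUME_ID it reports.
create_small_volume() {
  run fly volumes create pg_data --size 1 --region "$REGION" -a "$APP"
}

# Clone an existing machine onto the new small volume.
# Arguments: EXISTING_MACHINE_ID VOLUME_ID
clone_onto_volume() {
  run fly machine clone "$1" --attach-volume "$2" -a "$APP"
}

# Clone the small machine; call this twice for a three-node cluster.
# Argument: SMALL_MACHINE_ID
clone_small_machine() {
  run fly machine clone "$1" -a "$APP"
}

# Stop an old machine (old replicas first, old primary last).
stop_machine() {
  run fly machine stop "$1" -a "$APP"
}

# Final cleanup, only once the small cluster has a healthy primary.
destroy_machine() {
  run fly machine destroy "$1" -a "$APP"
}

destroy_volume() {
  run fly volumes destroy "$1" -a "$APP"
}
```

A typical session would be create_small_volume, then clone_onto_volume with the IDs you noted, then clone_small_machine twice, and so on, checking fly status -a <pgname> between each call. Set DRY_RUN=0 only when the printed commands look right.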
If anything goes wrong, your cluster is down, or in read-only mode and it’s not obvious how to recover it: Luckily, you have a backup! Nuke the cluster, create a new one with the desired size, and load your data in it. Or you can look into how to restore from a snapshot - but snapshots are taken daily and your backup is only 20 minutes old, isn’t it?
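Since the backup matters so much here, this is a minimal sketch of the dump and the reload, assuming fly proxy 5432 -a <pgname> is already running in another terminal (pointed at the old cluster for the dump, at the new one for the restore). The database name mydb, the postgres user, the backup.sql file name and the DRY_RUN switch are assumptions on my part, so substitute your own values. pg_dump and psql will prompt for the password unless PGPASSWORD is set.

```shell
#!/usr/bin/env bash
set -euo pipefail

DB_NAME="${DB_NAME:-mydb}"            # hypothetical database name
DB_USER="${DB_USER:-postgres}"        # assumed user; use your own credentials
DUMP_FILE="${DUMP_FILE:-backup.sql}"  # hypothetical output file
DRY_RUN="${DRY_RUN:-1}"               # 1 = only print the commands, 0 = run them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Write a plain SQL dump through the local proxy on port 5432.
dump_db() {
  run pg_dump -h localhost -p 5432 -U "$DB_USER" -d "$DB_NAME" -f "$DUMP_FILE"
}

# Load the plain SQL dump into a database that already exists on the target cluster.
restore_db() {
  run psql -h localhost -p 5432 -U "$DB_USER" -d "$DB_NAME" -f "$DUMP_FILE"
}
```

Call dump_db before touching the cluster, check that the file is not empty, and keep restore_db for the case where you end up rebuilding from scratch.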