I have a db cluster, 2 VMs with 40GB volumes each. Needless to say, this is way too much. I want to downsize to 1GB each. I tried the instructions I found here, but the command
fly scale count 2 -a <pgname> returns the error
Error: this app has no complete releases. Run fly deploy
to create one and rerun this command.
fly deploy doesn’t fix the issue at all.
fly status -a <pgname> returns:
ID              STATE    ROLE     REGION  CHECKS              IMAGE                               CREATED               UPDATED
3d8d9265a0e058  started  replica  sea     3 total, 3 passing  flyio/postgres-flex:15.3 (v0.0.45)  2023-10-14T08:38:12Z  2023-11-01T02:20:12Z
17811ed3b29218  started  primary  sea     3 total, 3 passing  flyio/postgres-flex:15.3 (v0.0.45)  2023-10-14T08:37:49Z  2023-11-01T02:21:09Z
Are there any updated instructions on how to do this? I’m incurring costs that are uncalled for.
I appreciate any support you may provide.
The way you scale up and down to create new units in the cluster has changed a bit with the Machines (V2) platform. Here are updated instructions. Please be aware I tested this on a freshly-created database with very little data. So please please do NOT skip step 1.
- BACKUP YOUR DATA. Do not skip this step. There are snapshots in case things go wrong. Your backup will help if things go very wrong. You can use
fly proxy 5432 -a <pgname> and then
pg_dump to grab an SQL dump of your data.
- Make sure you did not skip step 1.
- Add a new 1GB volume to your Postgres app:
fly volumes create pg_data --size 1 --region <your-pg-region> -a <pgname>. Make a note of the VOLUME_ID.
- Run fly status -a <pgname> and note which instance is the primary and how many instances are running. Grab the ID (first column) of an existing machine.
- Add one more instance as a clone of that machine, attaching it to the previously-created volume:
fly machine clone EXISTING_MACHINE_ID --attach-volume VOLUME_ID -a <pgname>. Note the ID of the new machine; the steps below call it SMALL_MACHINE_ID.
- Clone that machine two more times:
fly machines clone SMALL_MACHINE_ID -a <pgname>.
- Wait a few minutes, then run
fly status to make sure the new instances are in replica state and happy/healthy. Confirm using
fly volumes list and
fly machines list that the new machines have the small volumes.
- Stop the old replicas with large volumes. The easiest way is to run
fly volumes list -a <pgname>, check the ATTACHED VM column for machines with the large volumes, ensure they are not the primary, and
fly machine stop those. It’s OK to stop them almost at the same time. Once done, wait for the cluster status to be stable and for the
fly logs to be free of errors.
- Stop the primary:
fly machine stop PRIMARY_ID.
- Wait a few minutes for the three remaining replicas to realize the primary is gone, chat among themselves and elect a new primary node.
- Once your three small nodes are the only remaining ones, and one of them has been designated the primary, it’s safe to
fly machine destroy for the old replicas and the old primary.
- Run fly volumes list -a <pgname>, then identify and
fly volumes destroy all the old, now-unattached, large volumes.
- If anything goes wrong, your cluster is down, or in read-only mode and it’s not obvious how to recover it: Luckily, you have a backup! Nuke the cluster, create a new one with the desired size, and load your data in it. Or you can look into how to restore from a snapshot - but snapshots are taken daily and your backup is only 20 minutes old, isn’t it?
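For reference, the steps above can be sketched as a dry-run script that only prints the commands in order and runs nothing against your cluster. This is a rough sketch, not a tested migration tool: the app name, the database name, the postgres user and every *_ID value are placeholders I made up for illustration, and the pg_dump connection options assume the proxy from step 1 is listening on localhost:5432.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands from the steps above, in order.
# Nothing here talks to Fly. All values below are placeholders (assumptions).
set -eu

APP="${APP:-my-pg-app}"                           # hypothetical Postgres app name
REGION="${REGION:-sea}"                           # region shown in the fly status output
OLD_MACHINE="${OLD_MACHINE:-EXISTING_MACHINE_ID}" # any existing machine
NEW_VOLUME="${NEW_VOLUME:-VOLUME_ID}"             # ID printed by fly volumes create
SMALL_MACHINE="${SMALL_MACHINE:-SMALL_MACHINE_ID}" # first clone on the 1GB volume
OLD_REPLICA="${OLD_REPLICA:-OLD_REPLICA_ID}"      # old replica on a large volume
OLD_PRIMARY="${OLD_PRIMARY:-PRIMARY_ID}"          # old primary on a large volume
OLD_VOLUME="${OLD_VOLUME:-OLD_VOLUME_ID}"         # a large, now-unattached volume

print_plan() {
  cat <<EOF
# 1. Back up: proxy in one terminal, pg_dump in another (user and DB name are assumptions)
fly proxy 5432 -a $APP
pg_dump -h localhost -p 5432 -U postgres -d DB_NAME > backup.sql
# 2. Create the small volume and look at the current cluster
fly volumes create pg_data --size 1 --region $REGION -a $APP
fly status -a $APP
# 3. Clone onto the small volume, then clone that machine twice
fly machine clone $OLD_MACHINE --attach-volume $NEW_VOLUME -a $APP
fly machines clone $SMALL_MACHINE -a $APP
fly machines clone $SMALL_MACHINE -a $APP
# 4. Check health and volume sizes before touching the old machines
fly status -a $APP
fly volumes list -a $APP
fly machines list -a $APP
# 5. Stop old replicas first, then the old primary, and wait for a new primary
fly machine stop $OLD_REPLICA -a $APP
fly machine stop $OLD_PRIMARY -a $APP
# 6. Only once a small node is primary: remove old machines and volumes
fly machine destroy $OLD_REPLICA -a $APP
fly machine destroy $OLD_PRIMARY -a $APP
fly volumes destroy $OLD_VOLUME -a $APP
EOF
}

print_plan
```

If you save it as, say, plan.sh (a made-up file name), running APP=<pgname> sh plan.sh prints the list with your app name filled in. I'd still run each command by hand and check fly status and fly logs between steps rather than piping this into a shell.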
Let me know if this works!
I’m very very grateful for you taking the time to answer my question. I owe you a debt of gratitude and appreciate your support immensely.
I was able to carry out everything and I now have three 1GB volumes attached to three small machines and the site is up and running perfectly.
PS: for anyone attempting this, you should know that your website will be down for no more than 3 minutes and will then come right back up.