Fly Postgres Backup Tweaks

Hi community,

I’m running an unmanaged Fly Postgres cluster (2 machines: primary + replica) with backups to Tigris enabled.

I have two questions about backup configuration:

1. Can I configure what time of day backups run?

Currently, backups run around midday, and I’d prefer to schedule them during off-peak hours (e.g., 2-3 AM UTC).

I’ve explored fly postgres backup config update, which allows setting:

  • --full-backup-frequency (24h)
  • --archive-timeout
  • --recovery-window
  • --minimum-redundancy

However, there’s no option to specify the actual time of day when backups execute.
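For reference, the kind of change that is possible today looks roughly like this (the values are just illustrative, and <your-pg-app> is a placeholder for the Postgres app name):

fly postgres backup config update \
  --full-backup-frequency 24h \
  --archive-timeout 1h \
  --recovery-window 7d \
  --minimum-redundancy 3 \
  -a <your-pg-app>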

Questions:

  • Is there a way to configure the backup schedule time through the CLI that I’m missing?
  • Can I manually modify the Barman cron configuration to set a specific time?
    • Such changes wouldn’t persist across machine restarts/updates, right?
  • If it isn’t through Barman cron, how is the backup schedule managed?
  • Is this something support can help configure on a per-instance basis?

2. Can backups run from the replica instead of the primary?

I notice some CPU spikes on the primary during backups.

A good practice would be to run backups from a standby/replica to minimize load on the primary. Is this possible with our Barman setup?

Any guidance/help would be greatly appreciated! Thanks!

Hi… Do you actually see a mention of cron in the logs around the time the full backup runs?

(I thought the new fly pg backup style instead used a handcrafted Go loop, with an ad hoc timer, etc., :stopwatch:, but I could be wrong about that…)

1 Like

Ah, that’s a good point! I don’t see mention of cron in the logs. So yeah, definitely something custom.

Then the question is, would an on-demand backup reset the time the next backup runs? Let’s see…
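(For anyone following along, the on-demand run I’m testing with is just the CLI one, something like

fly postgres backup create -a <your-pg-app>

though double-check fly postgres backup --help in case I have the subcommand name slightly wrong.)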

If it’d help, I’m using GitHub Actions to back up Postgres to Tigris. Instead of backup.py you can use your own custom script or just pg_dump. Here is the code:

name: Fly backup database
run-name: Task
on:
  workflow_dispatch:
  schedule:
    - cron: '25 4 * * *'
jobs:
  backup:
    runs-on: ubuntu-latest
    env:
      FLY_API_TOKEN: ${{ secrets.FLY_API_TOKEN }}
      FLY_DB_APP: postgres
      PGUSER: user
      PGPASSWORD: ${{ secrets.PGPASSWORD }}
      PGDATABASE: user
      PGHOST: localhost
      PGPORT: 5555
      AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
      AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
      AWS_ENDPOINT_URL_S3: https://fly.storage.tigris.dev
      AWS_ENDPOINT_URL_IAM: https://fly.iam.storage.tigris.dev
      AWS_REGION: auto
      AWS_BUCKET: postgres

    steps:
      - name: Checkout repository
        uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install boto3 python-dotenv

      - uses: superfly/flyctl-actions/setup-flyctl@master

      - name: Set filename
        run: echo "filename=$PGDATABASE-$(date -u +"%Y-%m-%d-%H%M%S").dump" >> $GITHUB_ENV

      - name: Dump database and upload to S3
        run: |
          flyctl proxy 5555:5432 -a ${{ env.FLY_DB_APP }} &
          sleep 5
          echo "Dumping database..."
          pg_dump -Fc -f ${{ env.filename }}
          echo "Uploading to S3..."
          python scripts/backup.py upload --bucket ${{ env.AWS_BUCKET }} --source ${{ env.filename }} --destination $PGDATABASE-dumps
          echo "Cleaning up old backups..."
          python scripts/backup.py cleanup --bucket ${{ env.AWS_BUCKET }} --folder $PGDATABASE-dumps --keep 7
          echo "Backup completed successfully!"
2 Likes

Thanks @bira! A GitHub Action to do it manually would indeed work.

I was mainly wondering whether we could tweak the ‘automatic’ one from Fly itself.

Good news! The timer resets when you do an on-demand backup, so backups now land in off-peak hours.

Now to see if we can target a specific machine, or find some other strategy so the load doesn’t fall only on the primary.

1 Like

Glancing at the source code, there are several explicit checks that prevent a replica from being the machine that takes the backup, :crying_cat:.

(Also, there are no override knobs evident at those points. No isPrimary || flagReallyAllowReplica kinds of things.)

My guess is that this was done to avoid having to maintain a second distinguished Machine, with its own, separate distributed-consensus votes, etc. (Most PG Flex clusters have 3 Machines, incidentally; otherwise, you don’t actually have HA.)

There might also be concerns about unknowingly backing up a lagging replica, :snowflake:, and the like.

1 Like

Thanks @mayailurus for the research and info. Very much appreciated :)!

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.