pg_dump always gets interrupted while backing up postgres db

We have been getting interruptions/errors starting from around Feb 2025. For the year before that, we have successfully backed up this production postgres DB many times.
The error does not appear immediately. It happens after a few minutes of pg_dump progress. For example, the normal db backup file should be 3.5 GB, but now we cannot go beyond 800 MB due to this error:

....[normal pg_dump logs for other tables]...
pg_dump: processing data for table "public.news"
pg_dump: dumping contents of table "public.news"
pg_dump: processing data for table "public.news_for_users"
pg_dump: dumping contents of table "public.news_for_users"

pg_dump: dumping contents of table "public.news_for_users"
pg_dump: error: Dumping the contents of table "news_for_users" failed: PQgetCopyData() failed.
pg_dump: detail: Error message from server: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
pg_dump: detail: Command was: COPY public.news_for_users (uuid, user_uuid, news_uuid, seen_at, inserted_at, updated_at) TO stdout;

The error is the same for 2 types of attempts:

  1. via flyctl proxy 5430:5432 $REMOTE_DB_HOST -a $FLY_APP_NAME
  2. via a WireGuard tunnel (Private Networking · Fly Docs)

Any advice please?

I'm not sure if this is related, but when I run fly ssh console, it connects for a fraction of a second and then I get disconnected. At the moment there appears to be no way to keep an SSH connection open to any of my machines for more than a second.

We have an Elixir Phoenix web app, and fly ssh console works for us:

fly ssh console --pty -C "/app/bin/<my_app_name> remote" -a <my_app_name>

So the ssh console issue you observe does not seem related to the DB issue here.

@fly.io please help

@mayailurus Would you be able to help? I saw your comments on a related topic: Backup Postgres database for Rails application

@Yaeger , would you be able to help? I read your post How to: off-site backups for Fly.io hosted databases

@rodolfosilva would you be able to help? I read your post [Tutorial] - Automating Fly.io database Backups with GitHub Actions

@fox2688 Are you using the Action from the article? Sometimes this happens in my pipeline, but I'm running the script four times per day as a workaround. My backup is 1.5 GB.

I think this is a problem with the Fly Proxy.

I'm not automating the backup currently; I manually run the commands on my computer and save the dump file. It might be a Fly Proxy issue. Since Feb 2025 I cannot download more than 800 MB, often only 50 MB. Before that it worked fine.

Try using my action. It’s working…

Nice that it’s working for you. Based on your article, aside from the CI setup, the main job was these 2 lines:

flyctl proxy 5432:5432 -a ${{ secrets.FLY_APP_NAME }}

pg_dump "postgres://${{ secrets.PG_USERNAME }}:${{ secrets.PG_PASSWORD }}@localhost:5432/${{ secrets.PG_DATABASE }}" | gzip > ${{ steps.filename.outputs.filename }}

Put simply: step 1, connect the proxy; step 2, run pg_dump.
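
Roughly, the local version of those two steps looks like this (the app name, credentials, and output file below are placeholders, not my real values):

flyctl proxy 5432:5432 -a my-db-app &        # step 1: run the proxy in the background
sleep 5                                      # give the proxy a moment to start listening
pg_dump "postgres://$PG_USERNAME:$PG_PASSWORD@localhost:5432/$PG_DATABASE" | gzip > backup.sql.gz   # step 2
kill $!                                      # stop the background proxy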

From my computer step 1 works fine, but step 2 fails halfway through. I see the pg_dump --verbose output logging line by line, but it always gets interrupted after some megabytes with this message:

pg_dump: error: Dumping the contents of table "<my_table_name>" failed: PQgetCopyData() failed.
pg_dump: detail: Error message from server: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
pg_dump: detail: Command was: ...

Hm… I think the main thing would be to determine whether it’s the network or the server that’s failing, at this point. (It could plausibly be either one!)

Here’s something I would try:

$ fly ssh console -a db-app-name
# su postgres
% df -h /tmp
% pg_dump -p 5433 -d db_name | gzip > /tmp/d.gz  # 5433, not the usual 5432.

The 3.5 GB mentioned at the top is large for this context, so if df doesn't show that much space available (Avail), repeat this from a temporary Machine within the same organization instead.

(Or extend the PG Machine’s existing volume and dump there.)
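
If you go the volume route, something along these lines should work (the volume ID and size are placeholders, and flags may differ slightly between flyctl versions):

fly volumes list -a db-app-name                            # find the ID of the volume attached to the PG Machine
fly volumes extend vol_xxxxxxxxxxxx -s 20 -a db-app-name   # grow it to 20 GB, for example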

If this works, then you can use SFTP or rsync to pull it down locally…
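
For the pull-down step, one option (using the /tmp path from the sketch above; adjust if you dumped to a different location):

fly ssh sftp get /tmp/d.gz -a db-app-name    # downloads d.gz into the current local directory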


Thank you. This is exactly what I experimented with! The app itself has little free storage, but df -h reveals that /data is where the larger volume is mounted. So I just ran pg_dump into /data, and it succeeded!
Then I downloaded the dump file via SFTP.
Why port 5433? I used the normal port; I just had to specify the host (the internal Fly hostname).

This is a good workaround, though we'd still expect the normal fly proxy + pg_dump approach to work in the future.


This would have used Unix domain sockets instead of TCP, thus avoiding potential complications from HAProxy, etc.; in other words, a minor optimization.

(It’s counterintuitive, but the -p isn’t really a port in that context… more of a disambiguating identifier.)
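
To illustrate (the socket directory below is an assumption; it varies by image, so check where the .s.PGSQL.* files actually live):

ls /var/run/postgresql/                      # e.g. .s.PGSQL.5433 -- the socket file that "-p 5433" selects
pg_dump -h /var/run/postgresql -p 5433 -d db_name | gzip > /data/d.gz   # same idea, with the socket dir spelled out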

Glad to hear that your preferred way worked!

I found a solution:
The command that encountered the error was:

pg_dump --verbose postgres://$DB_USERNAME:$DB_PASSWORD@localhost:$PROXYED_PORT/$DB_NAME -f $OUTPUT_FILE

The working command now (simply adding the flag -F c, i.e. the custom output format, which is compressed rather than plain, human-readable SQL):

pg_dump --verbose postgres://$DB_USERNAME:$DB_PASSWORD@localhost:$PROXYED_PORT/$DB_NAME -F c -f $OUTPUT_FILE

With compression, the dump is only 800 MB and completes successfully; uncompressed it was 3.5 GB. This points to a potential problem on the Fly.io side: when the dump file is larger, the proxy tunnel may fail before the transfer finishes.
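
One note on restoring: a custom-format (-F c) dump is read back with pg_restore rather than psql. Roughly, with the same placeholder variables as above:

pg_restore --verbose --clean --no-owner -d "postgres://$DB_USERNAME:$DB_PASSWORD@localhost:$PROXYED_PORT/$DB_NAME" $OUTPUT_FILE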

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.