Unmanaged Postgres backup takes over 1 hour for 500MB database

So, I am trying to back up my pg database using fly proxy and it now takes over an hour to back up and download a 500MB database. 48 hours ago this took 2, maybe 3, minutes.

Am I suddenly doing something wrong, or has something gone horribly wrong from fly.io’s side?

Just a guess here: have a look at your CPU metrics in Grafana. If you’re maxing out the CPU, it may be being throttled aggressively.

What region are your PG machines in?

I don’t think that’s the issue. Thanks anyway. Region is SJC (San Jose).

Were you in sea (Seattle) before?

If this was a migration, then the volumes might still be in the slower “hydrating” state:

On a different tack, a couple other users mentioned similarly poor download speeds in the past few days; one had good luck with switching to kernel-mode WireGuard instead.

(I wonder whether a surge in volume migrations as regions are being shut down is reducing effective network capacity, :thinking:…)

Oddly enough, I was in sea! Don’t know what happened there to be honest.

Having said that, my volumes are not in the hydrating state:

fly volume list -a synergia-db

ID                    STATE    NAME     SIZE  REGION  ZONE  ENCRYPTED  ATTACHED VM     CREATED AT
vol_4m8y5kq3ne36zk6r  created  pg_data  2GB   sjc     2617  true       080e394b504358  1 day ago
vol_vx257d6gd91y29zr  created  pg_data  2GB   sjc     7917  true       286594db7dd608  1 day ago
vol_493gqozk0696l064  created  pg_data  2GB   sjc     4f1d  true       7815123c591108  1 day ago

This “regions are being shut down” part happened yesterday?

I’ll try the “switching to kernel-mode WireGuard” and get back here.

It does seem like they’ve started auto-migrating apps that have volumes from sea to sjc recently, but I’m just inferring that from forum traffic, pretty much. (I.e., no inside scoop there.)

The blog post above didn’t really give a specific schedule…

Thank you!! I appreciate you taking the time to reply.

I installed WireGuard and created the tunnel, which is now active. Now what? Sorry to ask, but just pointing me in the right direction would be enough.

Thank you again.

No worries! You should be able to reach synergia-db.internal now, from your local development laptop/desktop.

I would try with psql -d $DATABASE_URL first. That should give you an interactive CLI prompt on the remote Machine. (You can then exit it with \q.)

If that succeeds, then pg_dump should be able to use that same connection string…
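As a rough sketch of that sequence (check the connection first, then dump over the same connection string): the function name `backup_over_wireguard` and the output file name are made up for illustration, and it assumes `DATABASE_URL` points at synergia-db.internal:5432 rather than at a proxied localhost port.

```shell
# Sketch only: confirm the WireGuard route works, then dump with the same URL.
# Assumption: the URL passed in targets synergia-db.internal:5432.
backup_over_wireguard() {
  local url="$1" out="$2"
  # Cheap connectivity check; stops early if the tunnel or credentials are wrong.
  psql -d "$url" -c 'SELECT 1' >/dev/null || return 1
  # -Fc writes pg_dump's custom archive format, which is compressed by default.
  pg_dump -d "$url" -Fc -f "$out"
}

# Usage (commented out so sourcing this has no side effects):
# time backup_over_wireguard "$DATABASE_URL" "synergia-$(date +%F).dump"
```

Wrapping the call in `time` makes it easy to compare against the earlier fly proxy runs.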

Thank you!!

Still slow though. Instead of an hour or more, it now takes closer to 20 (edit: 30!) minutes (it was 2 minutes before).

Is this the new normal?

Alas! This means I’ll have to switch providers.

Thanks a million @mayailurus !!

I doubt this is permanent. Now that three different people have reported similar slowdowns, I suspect there’s an investigation afoot…

Let’s hope so!! I can’t continue like this for long honestly. Thank you for everything!!

Hm… If the 30 minutes is onerous, rather than just disappointing, there’s a slightly different method you could try in the interim. This compresses on the remote Machine before attempting transfer:

https://community.fly.io/t/pg-dump-always-gets-interrupted-while-backing-up-postgres-db/23842/12

Moreover, since that download is a distinct step, it is natural to use SFTP or rsync, both of which are better at WAN connections than the Postgres protocol is. (Possibly, part of what is hurting performance suddenly is the latency between sjc and sea added into each operation.)
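Roughly, the two-step idea could look like the sketch below. This is not the exact recipe from the linked post. The function name, the `/tmp` path, and the connection string are placeholders, and it assumes `pg_dump` is on the Machine's PATH and can authenticate with the URL you pass. Flag spelling for `fly ssh` may also vary between flyctl versions.

```shell
# Sketch only: dump and compress on the Machine, then fetch the file separately.
# Assumptions: remote_url is valid *from inside* the Machine; remote_path has room.
remote_dump_and_fetch() {
  local app="$1" remote_url="$2" remote_path="$3" local_path="$4"
  # Run pg_dump remotely; -Fc is the compressed custom format, -Z 9 asks for
  # the highest compression level so fewer bytes cross the WAN.
  fly ssh console -a "$app" -C "pg_dump -d $remote_url -Fc -Z 9 -f $remote_path" || return 1
  # Download the already-compressed file over SFTP as its own step.
  fly ssh sftp get "$remote_path" "$local_path" -a "$app"
}

# Usage (commented out; values are illustrative):
# remote_dump_and_fetch synergia-db "$REMOTE_URL" /tmp/backup.dump ./backup.dump
```

Remember to delete the remote file afterwards so it does not eat into the Machine's disk.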

A different user had luck with just uploading the dump file to Tigris from within the remote Machine, :sweat_smile:

Yeah, no. That’s waaaay too much work and thought for a process that should take neither. Believe it or not, it would be both easier and faster, in the long run, to switch providers. I already have my Hetzner machines on standby. I’ll just wait a few before pulling the trigger.

And it’s cheaper!!

Hi @Lulu,

Your database machines were migrated from sea to sjc as part of region consolidation. We sent an e-mail about this on September 25th, and there’s more information here.

We’ve seen a couple of reports of slow Postgres operations via fly proxy on migrated machines, and it’s something we’re tracking down. Thanks for reporting this; we’ll check whether something looks odd in your configuration that might explain the slowness. In the meantime, using WireGuard is something a few other affected users have had success with. It’s explained here, and then you run pg_dump against your-postgres-app.internal:5432 instead of localhost:(proxied-port). But rest assured we’ll look at the fly proxy slowness.
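To make the host and port swap concrete, here is a minimal sketch. The helper `dump_from`, the local port 15432, and the default user and database name `postgres` are assumptions for illustration, not values confirmed in this thread.

```shell
# Sketch only: the same pg_dump call, parameterised by where it connects.
# Assumptions: user and database default to "postgres" unless PGUSER/PGDATABASE are set.
dump_from() {
  local host="$1" port="$2" out="$3"
  pg_dump -h "$host" -p "$port" \
    -U "${PGUSER:-postgres}" -d "${PGDATABASE:-postgres}" \
    -Fc -f "$out"
}

# Through fly proxy (assumes a proxy to the database is already listening on 15432):
# time dump_from localhost 15432 proxy.dump
# Through WireGuard, straight to the private address:
# time dump_from your-postgres-app.internal 5432 wg.dump
```

Timing both variants against the same database gives a like-for-like number to include in a report.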

- Daniel

Thank you. I look forward to this being resolved sooner rather than later.
