Fly Postgres Connections Extremely Slow from Django

Hi, I’m a bit confused right now. I’m using Fly Postgres with a Django app, and it’s been working perfectly for about a month. A few hours ago it started being really slow. The last deploy was several hours before the slowdown started and changed nothing database-related, and rolling back that deploy didn’t fix it either. Looking at the performance traces in Sentry, I see things like this.


Sometimes the connect takes only a few seconds, and sometimes it’s well over a minute. If I’m reading this right, the entire request took over 41 seconds(!), with 40 of those seconds spent on the connect step.

I’m really lost here. I can’t think of anything in my own app that would’ve suddenly caused this. I’ve tried restarting both the postgres cluster and my app. All I can think of is something with Flycast, but no one else has made similar threads recently which kinda leads me to believe this is isolated to just me.

Repo in case it’s useful at all: splashcat-ink/splashcat on GitHub (Splatoon battle tracking)
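Since the time in the trace is going to the connect step rather than to queries, one thing that can limit how often that cost is paid is Django’s persistent connections. The following is only a sketch, not taken from the repo: `build_databases` is a hypothetical helper, the example URL is made up, and the 60-second and 10-second values are arbitrary. `CONN_MAX_AGE`, `CONN_HEALTH_CHECKS` (Django 4.1+) and the libpq `connect_timeout` option are real settings. This would not fix a platform-side incident; it only means fewer requests have to open a fresh connection.

```python
from urllib.parse import unquote, urlparse


def build_databases(database_url, conn_max_age=60, connect_timeout=10):
    """Build a Django DATABASES dict from a postgres:// URL (hypothetical helper)."""
    parts = urlparse(database_url)
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "NAME": parts.path.lstrip("/"),
            "USER": unquote(parts.username or ""),
            "PASSWORD": unquote(parts.password or ""),
            "HOST": parts.hostname or "",
            "PORT": str(parts.port or 5432),
            # Keep each connection open for up to conn_max_age seconds
            # instead of reconnecting on every request.
            "CONN_MAX_AGE": conn_max_age,
            # Django 4.1+: check a reused connection once per request before using it.
            "CONN_HEALTH_CHECKS": True,
            # Passed through to libpq: give up on a connect attempt after this many seconds.
            "OPTIONS": {"connect_timeout": connect_timeout},
        }
    }


# In settings.py this might be used as (assuming DATABASE_URL is set in the environment):
# DATABASES = build_databases(os.environ["DATABASE_URL"])
```

A connect timeout also turns a 40-second stall into a quick, visible error, which may or may not be what you want for user-facing requests.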

Hey @catgirl! Sorry to hear that. Is this still happening?

I see two things for your app, one is that the DB ran out of memory on July 18th and a notification was sent. The shared-cpu-256 postgres nodes could be struggling to keep up with the queries as data is added over time.

The second is a notable increase on app response times starting at 21:00 UTC (the graph below is in UTC-3) that correlates with two Vault incidents we had today.

I think the Vault incidents must have been the cause, as it seems to be fine now. I’m confused about where Vault would be involved, though. I thought it only came into play during deployments?

I’ve been meaning to figure out how to upgrade, but I’m scared of messing it up. Really wish Fly Postgres was more of a managed service :stuck_out_tongue: I’m constantly afraid of screwing up somewhere. LiteFS Cloud seems nice with its web GUI for rollbacks and such, and I wish Fly Postgres had that. LiteFS just won’t work for me at the moment because I run a Celery worker and haven’t yet figured out how to get it running well in the same container.

Welp finally got around to upgrading this and seems like I might’ve broken some stuff. Am getting this spammed in my console multiple times a second.

2023-07-25T05:59:41.753 app[683d315a704568] iad [info] postgres | 2023-07-25 05:59:41.753 UTC [4159] WARNING: database "repmgr" has a collation version mismatch

2023-07-25T05:59:41.753 app[683d315a704568] iad [info] postgres | 2023-07-25 05:59:41.753 UTC [4159] DETAIL: The database was created using collation version 2.31, but the operating system provides version 2.36.

2023-07-25T05:59:41.753 app[683d315a704568] iad [info] postgres | 2023-07-25 05:59:41.753 UTC [4159] HINT: Rebuild all objects in this database that use the default collation and run ALTER DATABASE repmgr REFRESH COLLATION VERSION, or build PostgreSQL with the right library version.

2023-07-25T05:59:41.753 app[683d315a704568] iad [info] repmgrd | WARNING: database "repmgr" has a collation version mismatch

2023-07-25T05:59:41.753 app[683d315a704568] iad [info] repmgrd | DETAIL: The database was created using collation version 2.31, but the operating system provides version 2.36.

2023-07-25T05:59:41.753 app[683d315a704568] iad [info] repmgrd | HINT: Rebuild all objects in this database that use the default collation and run ALTER DATABASE repmgr REFRESH COLLATION VERSION, or build PostgreSQL with the right library version.

No idea how to fix this >.<

I was able to increase the memory of each machine in my cluster though. I think I need to change configuration things but not sure how?

I assume you ran the command from the HINT and it didn’t work, is that right?

This may help you: https://community.fly.io/t/postgres-flex-database-postgres-has-a-collation-version-mismatch

I wasn’t sure what exactly I was supposed to do to rebuild objects and stuff, not very familiar with Postgres stuff. :frowning: I was able to fix it with @smorimoto’s thread though.
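For anyone else unsure what the HINT is asking for, it roughly comes down to two SQL statements per affected database: rebuild the indexes that depend on the collation, then record the new collation version. This is a sketch of the generic PostgreSQL 15+ approach and not necessarily the exact steps from the linked thread. `collation_refresh_statements` is a hypothetical helper that only prints the SQL; the statements would need to be run while connected to the database in question, and `REINDEX` takes locks while it runs.

```python
def collation_refresh_statements(database):
    """Return the SQL suggested by the collation-mismatch HINT for one database."""
    # Quote the identifier so unusual database names are handled safely.
    quoted = '"' + database.replace('"', '""') + '"'
    return [
        # Rebuild indexes, which depend on the collation's sort order.
        f"REINDEX DATABASE {quoted};",
        # Record the collation version the operating system now provides.
        f"ALTER DATABASE {quoted} REFRESH COLLATION VERSION;",
    ]


for statement in collation_refresh_statements("repmgr"):
    print(statement)
```

The warning can appear for more than one database (the thread linked above mentions `postgres` as well as `repmgr` here), so the same pair may need repeating for each one.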

As for the configuration question, I ended up just doubling the shared-buffers config to 13056 * 8kB. I tried looking around in flyctl to see how it’s configured for new clusters but couldn’t find it.
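To put that number in more familiar units: PostgreSQL counts shared buffers in 8 kB pages, so the value can be converted as below. This is just arithmetic on the figure from the post; the "before" value is inferred from the word "doubling" and is not stated anywhere in the thread, and `pages_to_mib` is a made-up helper name.

```python
PAGE_KB = 8  # PostgreSQL's default block size in kB


def pages_to_mib(pages, page_kb=PAGE_KB):
    """Convert a shared buffers value given in pages to MiB."""
    return pages * page_kb / 1024


new_pages = 13056            # value from the post
old_pages = new_pages // 2   # assumed previous value, since the post says it was doubled

print(f"before: {pages_to_mib(old_pages):.0f} MiB")
print(f"after:  {pages_to_mib(new_pages):.0f} MiB")
```

So the change goes from roughly 51 MiB to 102 MiB of shared buffers. Whether that is a sensible share depends on how much memory the machines were resized to, which the thread doesn’t say.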
