Postgres database needed to be manually restarted

julia · July 31, 2022, 11:26pm

My Postgres instance started mysteriously failing all connections earlier today, and continued to be unresponsive until I manually restarted it 12 hours later.

I couldn’t find much in the logs, just this:

2022-07-31T23:06:43Z app[387d83f3] iad [info]exporter | INFO[1059715] Established new database connection to "fdaa:0:bff:a7b:ab8:0:65c0:2:5432".  source="postgres_exporter.go:970"
2022-07-31T23:06:44Z app[387d83f3] iad [info]exporter | ERRO[1059716] Error opening connection to database (postgresql://flypgadmin:PASSWORD_REMOVED@[fdaa:0:bff:a7b:ab8:0:65c0:2]:5432/postgres?sslmode=disable): dial tcp [fdaa:0:bff:a7b:ab8:0:65c0:2]:5432: connect: connection refused  source="postgres_exporter.go:1658"

Restarting the Postgres instance fixed the issue right away, but I’m a bit puzzled about how to avoid this happening again. Exactly the same thing happened on June 3, 2022. The app’s name is mess-with-dns-pg.

Is there a way to add a healthcheck to my Postgres database so that it can automatically restart itself if it gets into a bad state?
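
For illustration, here is a minimal sketch of the kind of external check this question is about. It only tests whether the Postgres port accepts a TCP connection, which is the failure shown in the "connection refused" log line above. The function name, default port argument, and timeout are assumptions made for this example. It is not a Fly.io feature and it does not restart anything by itself.

```python
import socket


def postgres_port_open(host: str, port: int = 5432, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`.

    Hypothetical helper for illustration only: it checks reachability of the
    port, not whether Postgres can actually answer queries.
    """
    try:
        # create_connection accepts hostnames as well as IPv4/IPv6 literals.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and unreachable hosts.
        return False
```

Calling postgres_port_open("fdaa:0:bff:a7b:ab8:0:65c0:2") from a machine on the same private network would have returned False during the outage, assuming the address from the log is still the right one. Whatever runs the check would still need its own way to alert or trigger a restart.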

kurt · August 1, 2022, 12:08am

It looks like the DB process OOMed several times and then we gave up trying to restart it. We should have cycled the VM when this happened, but I think you may be on an old Fly Postgres build that doesn’t handle this as well. Let me find out if that’s upgradeable.

I think upgrading to 1GB of RAM will prevent this.

julia · August 1, 2022, 4:04pm

thanks so much for looking into it! It looks like it might have automatically upgraded when I restarted it.

julia · August 17, 2022, 12:19pm

This happened again today – it upgraded from v8 to v9 when I restarted it. Just to make sure – what’s the Postgres build version that fixes this issue? (is it v9?)

kurt · August 17, 2022, 12:54pm

Oh, that’s actually the job version. If you run fly image show, it’ll tell you the Postgres image version you’re running. This is the latest:

Image Details
  Registry   = registry-1.docker.io
  Repository = flyio/postgres
  Tag        = 12.10
  Version    = v0.0.25

Did you get an email alert about this? We enabled out-of-memory crash notifications last week.

julia · August 17, 2022, 1:07pm

This is what I’m on. I’m using postgres-standalone instead of postgres:

Image Details
  Registry   = registry-1.docker.io
  Repository = flyio/postgres-standalone
  Tag        = 14.1
  Version    = v0.0.7
  Digest     = sha256:ca27c53b81cae713e67d7ced87a4289961db4a81e382b09aaf42ea53032791eb

I did get an email alert, but I’ve been ignoring them because it seems to run out of memory only about once a day, and it seems to only take about 15 seconds to restart. So that feels like an acceptable amount of downtime.

kurt · August 17, 2022, 1:10pm

Oooh right. The standalone PG image doesn’t have the most recent fixes. Let me see if we can get that swapped to the newer image.

kurt · August 17, 2022, 1:24pm

Ok, that didn’t work at all. Give us a few minutes to see what went wrong. I’m sorry about this!

kurt · August 17, 2022, 1:50pm

We are still working on this. The upgrade attempt got your DB into a bad state that we haven’t seen before.

kurt · August 17, 2022, 2:05pm

We restored a backup of your database to mess-with-dns-pg-bak. You can update your app to use it, if you’d like, or just hold tight until we get the original running again.

kurt · August 17, 2022, 2:20pm

Ok everything is up and running. Can you verify that your main PG is acting the way you want? If it’s good, you can delete the backup with fly apps destroy mess-with-dns-pg-bak.

julia · August 17, 2022, 4:22pm

everything looks good to me, thanks so much!
