Communication with Postgres Cluster dead?

bighitbiker3 · July 24, 2024, 6:40pm

I am seeing a bunch of this stuff:

2024-07-24T18:29:18.082 app[48ed16dc565468] ord [info] admin | [WARN] Failed to connect to fdaa:1:1dfc:a7b:96:6bf8:a15a:2

2024-07-24T18:29:28.082 app[48ed16dc565468] ord [info] admin | [WARN] Failed to connect to fdaa:1:1dfc:a7b:69:efe4:3049:2

2024-07-24T18:29:28.082 app[48ed16dc565468] ord [info] admin | Voting member(s): 3, Active: 1, Inactive: 2, Conflicts: 0

2024-07-24T18:29:38.151 app[81137ea99d16d8] ord [info] proxy | [WARNING] (434) : Server bk_db/pg1 was DOWN and now enters maintenance (unspecified DNS error).

2024-07-24T18:29:38.151 app[81137ea99d16d8] ord [info] proxy | [WARNING] (434) : Server bk_db/pg2 was DOWN and now enters maintenance (unspecified DNS error).

[PP02] could not proxy TCP data to/from instance: failed to copy (direction=client->server, error=Transport endpoint is not connected

My app server cannot connect and I cannot clone any of my DB hosts to try and move them to new machines.

kurt · July 24, 2024, 6:54pm

The logs make this look like Postgres is in a bad state on multiple Machines. Cloning them probably won’t help, and may actually make things worse.

It appears 1 of the 3 original voting members is still working, but it’s in read-only mode.
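To make the read-only reasoning concrete, here is a minimal sketch that parses the `Voting member(s)` line from the logs above and checks for a strict majority. It assumes the cluster stays writable only while more than half of the registered voting members are active; the function names `parse_status` and `has_quorum` are made up for illustration and are not part of any Fly tooling.

```python
import re

# Matches the summary line printed by the "admin" process in the logs above.
STATUS_RE = re.compile(
    r"Voting member\(s\): (\d+), Active: (\d+), Inactive: (\d+), Conflicts: (\d+)"
)


def parse_status(line):
    """Return the member counts from a status line, or None if it does not match."""
    m = STATUS_RE.search(line)
    if m is None:
        return None
    total, active, inactive, conflicts = map(int, m.groups())
    return {
        "total": total,
        "active": active,
        "inactive": inactive,
        "conflicts": conflicts,
    }


def has_quorum(status):
    """Assumption: quorum means a strict majority of voting members is active."""
    return status["active"] > status["total"] // 2


line = (
    "2024-07-24T18:29:28.082 app[48ed16dc565468] ord [info] admin | "
    "Voting member(s): 3, Active: 1, Inactive: 2, Conflicts: 0"
)
status = parse_status(line)
print(status, "quorum:", has_quorum(status))
```

With 1 active member out of 3 the check fails, which is consistent with the surviving node being read-only. With a single registered member, 1 of 1 is a majority.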

What I would do is remove the unhealthy / inactive Machines and see if you can get the currently healthy one into a writable state (if it’s the only one running, it will be).

Then dig through the logs and see if there’s any indication of what caused this. My guess would be OOMs or something similar, in which case you may need more RAM.
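If you have the log output saved as text, a rough filter like the one below can help with the digging. This is only a sketch: the patterns are guesses at what an out-of-memory kill typically looks like in Linux kernel output, `find_oom_lines` is a made-up helper name, and the second sample line is invented for illustration rather than taken from this cluster.

```python
import re

# Assumed patterns; adjust them to whatever your logs actually contain.
OOM_PATTERNS = [
    re.compile(p, re.IGNORECASE)
    for p in (r"out of memory", r"oom[-_ ]?kill", r"killed process")
]


def find_oom_lines(lines):
    """Return the lines that match any of the assumed OOM patterns."""
    return [ln for ln in lines if any(p.search(ln) for p in OOM_PATTERNS)]


sample = [
    # Real line from the logs in this thread.
    "admin | Voting member(s): 3, Active: 1, Inactive: 2, Conflicts: 0",
    # Hypothetical line, shown only to illustrate a match.
    "Out of memory: Killed process 123 (postgres)",
]

for hit in find_oom_lines(sample):
    print(hit)
```

An empty result does not rule out memory pressure. It only means none of these particular strings appeared in the lines you fed in.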

kurt · July 24, 2024, 6:59pm

Oh, I just saw JP helping in your support ticket, so I'm gonna close this in favor of that.
