Postgresql DB down and restarts don't help

victorbjorklund · June 1, 2022, 5:25pm

No changes in my app for weeks. Suddenly I noticed “Internal server error” on all pages of my app. Phoenix seems to say the DB is down, so I checked the logs for the PostgreSQL DB (just one instance, no replicas), and it keeps repeating this:

2022-06-01T17:21:00.981 app[34d538e2] ams [info] keeper | 2022-06-01T17:21:00.981Z INFO cmd/keeper.go:1505 our db requested role is master

2022-06-01T17:21:00.982 app[34d538e2] ams [info] keeper | 2022-06-01T17:21:00.982Z INFO cmd/keeper.go:1543 already master

2022-06-01T17:21:01.006 app[34d538e2] ams [info] keeper | 2022-06-01T17:21:01.006Z INFO cmd/keeper.go:1676 postgres parameters not changed

This repeats in a loop forever, every 4 seconds. I tried to restart the VM with “flyctl restart”, but when it started up it just went back to those messages again. I assume there has been some update of the DBs etc. on your side recently, since I haven’t touched anything for weeks? I found this thread (Postgres instance count increase fails), which might be a similar issue (but which went unanswered).

kurt · June 1, 2022, 5:27pm

What do your Phoenix logs say? Those Postgres logs are normal (it’s actually saying Postgres is the master and in a good state).

victorbjorklund · June 1, 2022, 5:32pm

Aha, ok! Good to know that at least it isn’t the db that is malfunctioning.

These are the logs from Phoenix (I restarted that app as well):

2022-06-01T17:13:49.244 app[51437a33] ams [info] 17:13:49.243 [error] GenServer {Oban.Registry, {Oban, Oban.Peer}} terminating

2022-06-01T17:13:49.244 app[51437a33] ams [info] ** (DBConnection.ConnectionError) connection not available and request was dropped from queue after 1658ms. This means requests are coming in and your connection pool cannot serve them fast enough. You can address this by:

2022-06-01T17:13:49.244 app[51437a33] ams [info] 1. Ensuring your database is available and that you can connect to it

2022-06-01T17:13:49.244 app[51437a33] ams [info] 2. Tracking down slow queries and making sure they are running fast enough

2022-06-01T17:13:49.244 app[51437a33] ams [info] 3. Increasing the pool_size (although this increases resource consumption)

2022-06-01T17:13:49.244 app[51437a33] ams [info] 4. Allowing requests to wait longer by increasing :queue_target and :queue_interval

2022-06-01T17:13:49.244 app[51437a33] ams [info] See DBConnection.start_link/2 for more information

2022-06-01T17:13:49.244 app[51437a33] ams [info] (db_connection 2.4.2) lib/db_connection.ex:904: DBConnection.transaction/3

2022-06-01T17:13:49.244 app[51437a33] ams [info] (oban 2.11.3) lib/oban/peer.ex:147: Oban.Peer.handle_info/2

2022-06-01T17:13:49.244 app[51437a33] ams [info] (stdlib 3.16.1) gen_server.erl:695: :gen_server.try_dispatch/4

2022-06-01T17:13:49.244 app[51437a33] ams [info] (stdlib 3.16.1) gen_server.erl:437: :gen_server.loop/7

2022-06-01T17:13:49.244 app[51437a33] ams [info] (stdlib 3.16.1) proc_lib.erl:226: :proc_lib.init_p_do_apply/3

2022-06-01T17:13:49.244 app[51437a33] ams [info] Last message: {:continue, :start}

2022-06-01T17:13:50.372 app[51437a33] ams [info] 17:13:50.372 [error] Postgrex.Protocol (#PID<0.2883.0>) failed to connect: ** (DBConnection.ConnectionError) tcp recv (idle): closed
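
For anyone reading that error text later: the options it names (pool_size, :queue_target and :queue_interval) are normally set on the Ecto repo. A minimal sketch follows, assuming a hypothetical app called :my_app with a repo module MyApp.Repo and a DATABASE_URL environment variable; none of those names come from this thread, and the numbers are only illustrative. These settings only help when the pool is saturated. They would not fix a database that cannot be reached at all, which is what the final "tcp recv (idle): closed" line points to.

```elixir
# config/runtime.exs
# Hypothetical names: :my_app and MyApp.Repo stand in for the real app and repo.
import Config

config :my_app, MyApp.Repo,
  # Assumes the connection string is provided via DATABASE_URL.
  url: System.get_env("DATABASE_URL"),
  # Item 3 in the error: number of connections kept open to Postgres.
  pool_size: 10,
  # Item 4 in the error: target queue time and the interval it is
  # measured over, both in milliseconds (DBConnection defaults: 50 and 1000).
  queue_target: 500,
  queue_interval: 5_000
```

Raising the queue values only lets requests wait longer for a free connection, so item 1 in the error (checking that the database is reachable) is worth ruling out first.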

victorbjorklund · June 1, 2022, 7:53pm

In case someone lands here in the future. This seems to have helped:

Now it works again.

Elder · June 1, 2022, 9:35pm

I had similar issues with Postgres today, also in the ams region.
