No changes in my app for weeks. Suddenly I noticed “Internal server error” in my app for all pages. Seems phoenix says DB is down so I go and check the logs for the postgresql db (just one instance and no replicas) and it keeps repeating this:
2022-06-01T17:21:00.981 app[34d538e2] ams [info] keeper | 2022-06-01T17:21:00.981Z INFO cmd/keeper.go:1505 our db requested role is master
2022-06-01T17:21:00.982 app[34d538e2] ams [info] keeper | 2022-06-01T17:21:00.982Z INFO cmd/keeper.go:1543 already master
2022-06-01T17:21:01.006 app[34d538e2] ams [info] keeper | 2022-06-01T17:21:01.006Z INFO cmd/keeper.go:1676 postgres parameters not changed
In a loop forever every 4 seconds. I tried to restart the VM with “flyctl restart” but when it started up it just went back to those messages forever again. I assume there has been some update of the db:s etc from your side recently? Since I haven’t touched anything for weeks. I found this thread (Postgres instance count increase fails) which might be similar issue (but which went unanswered).
Aha, ok! Good to know that at least it isn’t the db that is malfunctioning.
This is the logs from phoenix (i restarted that app as well):
2022-06-01T17:13:49.244 app[51437a33] ams [info] 17:13:49.243 [error] GenServer {Oban.Registry, {Oban, Oban.Peer}} terminating
2022-06-01T17:13:49.244 app[51437a33] ams [info] ** (DBConnection.ConnectionError) connection not available and request was dropped from queue after 1658ms. This means requests are coming in and your connection pool cannot serve them fast enough. You can address this by:
2022-06-01T17:13:49.244 app[51437a33] ams [info] 1. Ensuring your database is available and that you can connect to it
2022-06-01T17:13:49.244 app[51437a33] ams [info] 2. Tracking down slow queries and making sure they are running fast enough
2022-06-01T17:13:49.244 app[51437a33] ams [info] 3. Increasing the pool_size (although this increases resource consumption)
2022-06-01T17:13:49.244 app[51437a33] ams [info] 4. Allowing requests to wait longer by increasing :queue_target and :queue_interval
2022-06-01T17:13:49.244 app[51437a33] ams [info] See DBConnection.start_link/2 for more information
2022-06-01T17:13:49.244 app[51437a33] ams [info] (db_connection 2.4.2) lib/db_connection.ex:904: DBConnection.transaction/3
2022-06-01T17:13:49.244 app[51437a33] ams [info] (oban 2.11.3) lib/oban/peer.ex:147: Oban.Peer.handle_info/2
2022-06-01T17:13:49.244 app[51437a33] ams [info] (stdlib 3.16.1) gen_server.erl:695: :gen_server.try_dispatch/4
2022-06-01T17:13:49.244 app[51437a33] ams [info] (stdlib 3.16.1) gen_server.erl:437: :gen_server.loop/7
2022-06-01T17:13:49.244 app[51437a33] ams [info] (stdlib 3.16.1) proc_lib.erl:226: :proc_lib.init_p_do_apply/3
2022-06-01T17:13:49.244 app[51437a33] ams [info] Last message: {:continue, :start}
2022-06-01T17:13:50.372 app[51437a33] ams [info] 17:13:50.372 [error] Postgrex.Protocol (#PID<0.2883.0>) failed to connect: ** (DBConnection.ConnectionError) tcp recv (idle): closed