2021-04-26T15:21:18.716Z bb777e79 fra [info] sentinel | 2021-04-26T15:21:18.708Z ERROR cmd/sentinel.go:1893 failed to get proxies info {"error": "Unexpected response code: 429"}
2021-04-26T15:21:22.322Z fc43f8a6 iad [info] keeper | 2021-04-26T15:21:22.314Z ERROR cmd/keeper.go:839 failed to update keeper info {"error": "cannot set or renew session for ttl, unable to operate on sessions"}
2021-04-26T15:21:22.341Z fc43f8a6 iad [info] keeper | 2021-04-26T15:21:22.339Z ERROR cmd/keeper.go:1010 error retrieving cluster data {"error": "Unexpected response code: 429"}
2021-04-26T15:21:52.804Z 88e4cbb9 fra [info] keeper | 2021-04-26T15:21:52.796Z ERROR cmd/keeper.go:1010 error retrieving cluster data {"error": "Unexpected response code: 429"}
2021-04-26T15:21:53.002Z 88e4cbb9 fra [info] keeper | 2021-04-26T15:21:52.997Z ERROR cmd/keeper.go:839 failed to update keeper info {"error": "Unexpected response code: 429 (Your IP is issuing too many concurrent connections, please rate limit your calls\n)"}
2021-04-26T15:21:53.917Z bb777e79 fra [info] sentinel | 2021-04-26T15:21:53.913Z ERROR cmd/sentinel.go:1938 error saving clusterdata {"error": "Unexpected response code: 429"}
2021-04-26T15:21:59.543Z bb777e79 fra [info] sentinel | 2021-04-26T15:21:59.538Z WARN cmd/sentinel.go:276 no keeper info available {"db": "42709ddc", "keeper": "fdaa020e2a7b67018602"}
2021-04-26T17:08:01.693Z fc43f8a6 iad [info] sentinel | 2021-04-26T17:08:01.689Z ERROR cmd/sentinel.go:102 election loop error {"error": "failed to read lock: Get \"https://consul-na.fly-shared.net/v1/kv/there-db-emkp298pdwr1orx5/there-db/sentinel-leader?index=233827947&wait=15000ms\": dial tcp [2a09:8280:1:7b37:f8ac:2889:d480:fc91]:443: connect: connection refused"}
2021-04-26T17:08:03.225Z fc43f8a6 iad [info] keeper | 2021-04-26T17:08:03.223Z ERROR cmd/keeper.go:839 failed to update keeper info {"error": "cannot set or renew session for ttl, unable to operate on sessions"}
2021-04-26T17:08:04.552Z fc43f8a6 iad [info] sentinel | 2021-04-26T17:08:04.549Z ERROR cmd/sentinel.go:1843 error retrieving cluster data {"error": "Get \"https://consul-na.fly-shared.net/v1/kv/there-db-emkp298pdwr1orx5/there-db/clusterdata?consistent=&wait=5000ms\": dial tcp [2a09:8280:1:7b37:f8ac:2889:d480:fc91]:443: connect: connection refused"}
2021-04-26T17:08:05.096Z bb777e79 fra [info] sentinel | 2021-04-26T17:08:05.092Z WARN cmd/sentinel.go:276 no keeper info available {"db": "c4193646", "keeper": "fdaa020e2a7bab8018d12"}
2021-04-26T17:08:05.578Z fc43f8a6 iad [info] keeper | 2021-04-26T17:08:05.575Z ERROR cmd/keeper.go:1010 error retrieving cluster data {"error": "Get \"https://consul-na.fly-shared.net/v1/kv/there-db-emkp298pdwr1orx5/there-db/clusterdata?consistent=&wait=5000ms\": dial tcp [2a09:8280:1:7b37:f8ac:2889:d480:fc91]:443: connect: connection refused"}
2021-04-26T17:08:11.693Z fc43f8a6 iad [info] sentinel | 2021-04-26T17:08:11.690Z INFO cmd/sentinel.go:82 Trying to acquire sentinels leadership
2021-04-26T17:56:01.807Z bb777e79 fra [info] sentinel | 2021-04-26T17:56:01.802Z ERROR cmd/sentinel.go:1938 error saving clusterdata {"error": "Put \"https://consul-na.fly-shared.net/v1/kv/there-db-emkp298pdwr1orx5/there-db/clusterdata?cas=297677526&flags=3304740253564472344&wait=5000ms\": EOF"}
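The 429s above are Consul telling the client to slow down ("Your IP is issuing too many concurrent connections, please rate limit your calls"). The standard client-side remedy for this class of error is exponential backoff with jitter; this is a generic illustrative sketch, not stolon's actual retry code:

```python
import random
import time


class RateLimited(Exception):
    """Raised when the server answers HTTP 429."""


def with_backoff(call, max_retries=5, base_delay=0.5):
    """Retry `call` when it raises RateLimited, sleeping with
    exponential backoff plus random jitter between attempts."""
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimited:
            if attempt == max_retries - 1:
                raise  # give up after the final attempt
            # 0.5s, 1s, 2s, ... plus up to base_delay of jitter
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            time.sleep(delay)
```

The jitter matters here: several keepers and sentinels retrying on the same schedule would hit the rate limit in lockstep, while randomized delays spread the load out.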
No, we’ll look into it! Did these continue or clear up?
These didn’t continue. But another issue arose when I tried to add another instance to the database and it just broke the whole thing (I posted the logs in the other related thread: Postgres does not scale past the initial 2 - #6 by mo.rajbi)
Those logs continued indefinitely, and even after I removed the new region, the existing regions kept throwing this error:
hot standby is not possible because max_worker_processes = 1 is a lower setting than on the master server (its value was 8)
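That message is PostgreSQL's hot-standby admission check: a standby refuses to open for read-only queries if certain capacity settings are lower than the values the primary was running with (the primary's values are carried in the WAL and checked during recovery). On a self-managed setup the fix lives in the standby's `postgresql.conf`; a minimal sketch, assuming the primary's value of 8 from the error message:

```ini
# postgresql.conf on the standby — hot standby requires this to be at
# least the primary's value (8, per the error message above):
max_worker_processes = 8

# The same >=-the-primary rule applies to max_connections,
# max_prepared_transactions and max_locks_per_transaction.
```

On a managed Postgres cluster these settings are normally provisioned by the platform, so a mismatch like `1` vs `8` usually points at a misapplied config on the new instance rather than something to hand-edit.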
The DB became functional again after a few minutes. It was still acting weird, as I had changed the scale count just to bring it back up. But a simple unique SELECT from AMS to FRA, which is where the DB was, took anywhere from 300ms to 1.1s (compare that to just 6ms from within FRA!). So I sadly had to temporarily migrate our database to AWS until these issues are fixed.
Ah got it, I’ll keep the other topic updated about that problem.