2021-04-26T15:21:18.716Z bb777e79 fra [info] sentinel | 2021-04-26T15:21:18.708Z ERROR cmd/sentinel.go:1893 failed to get proxies info {"error": "Unexpected response code: 429"}
2021-04-26T15:21:22.322Z fc43f8a6 iad [info] keeper | 2021-04-26T15:21:22.314Z ERROR cmd/keeper.go:839 failed to update keeper info {"error": "cannot set or renew session for ttl, unable to operate on sessions"}
2021-04-26T15:21:22.341Z fc43f8a6 iad [info] keeper | 2021-04-26T15:21:22.339Z ERROR cmd/keeper.go:1010 error retrieving cluster data {"error": "Unexpected response code: 429"}
2021-04-26T15:21:52.804Z 88e4cbb9 fra [info] keeper | 2021-04-26T15:21:52.796Z ERROR cmd/keeper.go:1010 error retrieving cluster data {"error": "Unexpected response code: 429"}
2021-04-26T15:21:53.002Z 88e4cbb9 fra [info] keeper | 2021-04-26T15:21:52.997Z ERROR cmd/keeper.go:839 failed to update keeper info {"error": "Unexpected response code: 429 (Your IP is issuing too many concurrent connections, please rate limit your calls\n)"}
2021-04-26T15:21:53.917Z bb777e79 fra [info] sentinel | 2021-04-26T15:21:53.913Z ERROR cmd/sentinel.go:1938 error saving clusterdata {"error": "Unexpected response code: 429"}
2021-04-26T15:21:59.543Z bb777e79 fra [info] sentinel | 2021-04-26T15:21:59.538Z WARN cmd/sentinel.go:276 no keeper info available {"db": "42709ddc", "keeper": "fdaa020e2a7b67018602"}
2021-04-26T17:08:01.693Z fc43f8a6 iad [info] sentinel | 2021-04-26T17:08:01.689Z ERROR cmd/sentinel.go:102 election loop error {"error": "failed to read lock: Get \"https://consul-na.fly-shared.net/v1/kv/there-db-emkp298pdwr1orx5/there-db/sentinel-leader?index=233827947&wait=15000ms\": dial tcp [2a09:8280:1:7b37:f8ac:2889:d480:fc91]:443: connect: connection refused"}
2021-04-26T17:08:03.225Z fc43f8a6 iad [info] keeper | 2021-04-26T17:08:03.223Z ERROR cmd/keeper.go:839 failed to update keeper info {"error": "cannot set or renew session for ttl, unable to operate on sessions"}
2021-04-26T17:08:04.552Z fc43f8a6 iad [info] sentinel | 2021-04-26T17:08:04.549Z ERROR cmd/sentinel.go:1843 error retrieving cluster data {"error": "Get \"https://consul-na.fly-shared.net/v1/kv/there-db-emkp298pdwr1orx5/there-db/clusterdata?consistent=&wait=5000ms\": dial tcp [2a09:8280:1:7b37:f8ac:2889:d480:fc91]:443: connect: connection refused"}
2021-04-26T17:08:05.096Z bb777e79 fra [info] sentinel | 2021-04-26T17:08:05.092Z WARN cmd/sentinel.go:276 no keeper info available {"db": "c4193646", "keeper": "fdaa020e2a7bab8018d12"}
2021-04-26T17:08:05.578Z fc43f8a6 iad [info] keeper | 2021-04-26T17:08:05.575Z ERROR cmd/keeper.go:1010 error retrieving cluster data {"error": "Get \"https://consul-na.fly-shared.net/v1/kv/there-db-emkp298pdwr1orx5/there-db/clusterdata?consistent=&wait=5000ms\": dial tcp [2a09:8280:1:7b37:f8ac:2889:d480:fc91]:443: connect: connection refused"}
2021-04-26T17:08:11.693Z fc43f8a6 iad [info] sentinel | 2021-04-26T17:08:11.690Z INFO cmd/sentinel.go:82 Trying to acquire sentinels leadership
2021-04-26T17:56:01.807Z bb777e79 fra [info] sentinel | 2021-04-26T17:56:01.802Z ERROR cmd/sentinel.go:1938 error saving clusterdata {"error": "Put \"https://consul-na.fly-shared.net/v1/kv/there-db-emkp298pdwr1orx5/there-db/clusterdata?cas=297677526&flags=3304740253564472344&wait=5000ms\": EOF"}
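The 429s above are Consul telling the client to slow down ("Your IP is issuing too many concurrent connections, please rate limit your calls"). The standard client-side remedy for this class of error is exponential backoff with jitter; this is a generic illustrative sketch, not stolon's actual retry code:

```python
import random
import time


class RateLimited(Exception):
    """Raised when the server answers HTTP 429."""


def with_backoff(call, max_retries=5, base_delay=0.5):
    """Retry `call` when it raises RateLimited, sleeping with
    exponential backoff plus random jitter between attempts."""
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimited:
            if attempt == max_retries - 1:
                raise  # give up after the final attempt
            # 0.5s, 1s, 2s, ... plus up to base_delay of jitter
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            time.sleep(delay)
```

The jitter matters here: several keepers and sentinels retrying on the same schedule would hit the rate limit in lockstep, while randomized delays spread the load out.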
No, we’ll look into it! Did these continue or clear up?
These didn’t continue. But another issue arose when I tried to add another instance to the database and it just broke the whole thing (I posted the logs in the other related thread: Postgres does not scale past the initial 2 - #6 by mo.rajbi)
Those logs continued indefinitely, and even after I removed the new region, the existing regions kept throwing this error:
hot standby is not possible because max_worker_processes = 1 is a lower setting than on the master server (its value was 8)
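That message is PostgreSQL's hot-standby admission check: a standby refuses to open for read-only queries if certain capacity settings are lower than the values the primary was running with (the primary's values are carried in the WAL and checked during recovery). On a self-managed setup the fix lives in the standby's `postgresql.conf`; a minimal sketch, assuming the primary's value of 8 from the error message:

```ini
# postgresql.conf on the standby — hot standby requires this to be at
# least the primary's value (8, per the error message above):
max_worker_processes = 8

# The same >=-the-primary rule applies to max_connections,
# max_prepared_transactions and max_locks_per_transaction.
```

On a managed Postgres cluster these settings are normally provisioned by the platform, so a mismatch like `1` vs `8` usually points at a misapplied config on the new instance rather than something to hand-edit.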
The DB became functional again after a few minutes. It was still acting weird, as I had changed the scale count just to bring it back up. But a simple unique SELECT from AMS to FRA, which is where the DB was, took anywhere from 300ms to 1.1s (compare that to just 6ms from within FRA!). So I sadly had to temporarily migrate our database to AWS until these issues are fixed.
Ah got it, I’ll keep the other topic updated about that problem.