Health check for your postgres database has failed

Hey guys,

I noticed that my Postgres db instance (in the fra region) periodically logs this error:

health[9185950f40e198] fra [error] Health check for your postgres database has failed. Your database is malfunctioning.

Sometimes I also see this in the logs:

2023-03-29T18:05:08.818 app[9185950f40e198] fra [info] keeper | 2023-03-29T18:05:08.815Z ERROR cmd/keeper.go:870 failed to update keeper info {"error": "Unexpected response code: 500 (leadership lost while committing log)"}
2023-03-29T18:05:17.300 app[9185950f40e198] fra [info] sentinel | 2023-03-29T18:05:17.298Z ERROR cmd/sentinel.go:1852 error retrieving cluster data {"error": "Unexpected response code: 500"}
2023-03-29T18:05:32.031 app[9185950f40e198] fra [info] sentinel | 2023-03-29T18:05:32.028Z WARN cmd/sentinel.go:276 no keeper info available {"db": "088be137", "keeper": "caca1901f2"}
2023-03-29T18:05:44.336 app[9185950f40e198] fra [info] sentinel | 2023-03-29T18:05:44.335Z ERROR cmd/sentinel.go:1852 error retrieving cluster data {"error": "Unexpected response code: 500"}
2023-03-29T18:05:52.566 app[9185950f40e198] fra [info] sentinel | 2023-03-29T18:05:52.565Z ERROR cmd/sentinel.go:102 election loop error {"error": "Unexpected response code: 500 (Raft leader not found in server lookup mapping)"}
2023-03-29T18:05:54.539 app[9185950f40e198] fra [info] keeper | 2023-03-29T18:05:54.538Z ERROR cmd/keeper.go:1041 error retrieving cluster data {"error": "Unexpected response code: 500"}
2023-03-29T18:06:12.903 app[9185950f40e198] fra [info] sentinel | panic: close of closed channel
2023-03-29T18:06:12.903 app[9185950f40e198] fra [info] sentinel |
2023-03-29T18:06:12.909 app[9185950f40e198] fra [info] sentinel | goroutine 533191 [running]:
2023-03-29T18:06:12.910 app[9185950f40e198] fra [info] sentinel | github.com/superfly/leadership.(*Candidate).initLock(0xc000138000)
2023-03-29T18:06:12.910 app[9185950f40e198] fra [info] sentinel | /go/pkg/mod/github.com/superfly/leadership@v0.2.1/candidate.go:98 +0x2e
2023-03-29T18:06:12.910 app[9185950f40e198] fra [info] sentinel | github.com/superfly/leadership.(*Candidate).campaign(0xc000138000)
2023-03-29T18:06:12.910 app[9185950f40e198] fra [info] sentinel | /go/pkg/mod/github.com/superfly/leadership@v0.2.1/candidate.go:124 +0xc6
2023-03-29T18:06:12.910 app[9185950f40e198] fra [info] sentinel | created by github.com/superfly/leadership.(*Candidate).RunForElection
2023-03-29T18:06:12.911 app[9185950f40e198] fra [info] sentinel | /go/pkg/mod/github.com/superfly/leadership@v0.2.1/candidate.go:60 +0xc5
2023-03-29T18:06:12.916 app[9185950f40e198] fra [info] sentinel | exit status 2

The errors are probably intermittent, because right now the apps that use the db seem to work fine.

I'm not sure how long this has been going on, because I only started shipping logs to a dedicated observability service yesterday. There have been around 10 health check errors since then.
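
In case it helps with triage, here is a rough sketch of how these lines could be tallied from shipped logs. It assumes the line format pasted above (with an optional leading timestamp, since the health check line was copied without one). The `classify` and `tally` helpers and the category names are made up for illustration and are not part of any official tooling.

```python
import re
from collections import Counter

# Assumed line shape, based only on the log lines pasted in this post:
#   [timestamp] source[instance] region [level] message
# The timestamp is optional because the health check line has none.
LINE_RE = re.compile(
    r"^(?:(?P<ts>\S+)\s+)?(?P<source>\w+)\[(?P<instance>[0-9a-f]+)\]\s+"
    r"(?P<region>\w+)\s+\[(?P<level>\w+)\]\s*(?P<msg>.*)$"
)


def classify(line):
    """Return a hypothetical category name for a log line, or None."""
    m = LINE_RE.match(line.strip())
    if not m:
        return None
    msg = m.group("msg")
    if m.group("source") == "health" and "Health check" in msg and "has failed" in msg:
        return "health_check_failed"
    if "panic:" in msg:
        return "panic"
    # keeper/sentinel lines carry their own level inside the message text
    if " ERROR " in msg:
        return "component_error"
    return None


def tally(lines):
    """Count categorized lines; uncategorized lines are ignored."""
    counts = Counter()
    for line in lines:
        kind = classify(line)
        if kind:
            counts[kind] += 1
    return counts
```

Feeding it an exported log file, for example `tally(open("db.log"))` with a hypothetical file name, would give a per-category count, which makes it easier to see whether the health check failures line up with the sentinel panics.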

What can be done about this?

Thanks.
