Intermittent Postgres DB connection failures from app

We are facing the same issue: our app can't connect to the database, and it looks like the database is down. It has been like this for about an hour now. Any help, please? :pray:

App
  Name     = database          
  Owner    = paypack           
  Version  = 39                
  Status   = running           
  Hostname = database.fly.dev  

Instances
ID       TASK VERSION REGION DESIRED STATUS                 HEALTH CHECKS                  RESTARTS CREATED              
0430e45f app  39      lhr    run     running (failed to co) 3 total, 1 passing, 2 critical 6        2021-10-04T16:37:49Z 
64ff2b63 app  39      lhr    run     running (leader)       3 total, 2 passing, 1 critical 8        2021-10-03T17:31:54Z 

I updated this to use a new, nearby Consul cluster. We're working through DBs to get this applied in as non-disruptive a manner as possible.


Thanks a lot!

@kurt My app also can't connect to the Postgres database. The app is running in production, and this has been happening since morning. We are getting a lot of complaints from users, so please take a look and resolve this issue.

2022-10-21T06:15:51.694 app[6b06793b] maa [info] sentinel | goroutine 2086 [running]:

2022-10-21T06:15:51.694 app[6b06793b] maa [info] sentinel | github.com/superfly/leadership.(*Candidate).initLock(0xc000136c40)

2022-10-21T06:15:51.694 app[6b06793b] maa [info] sentinel | /go/pkg/mod/github.com/superfly/leadership@v0.2.1/candidate.go:98 +0x2e

2022-10-21T06:15:51.694 app[6b06793b] maa [info] sentinel | github.com/superfly/leadership.(*Candidate).campaign(0xc000136c40)

2022-10-21T06:15:51.694 app[6b06793b] maa [info] sentinel | /go/pkg/mod/github.com/superfly/leadership@v0.2.1/candidate.go:124 +0xc6

2022-10-21T06:15:51.694 app[6b06793b] maa [info] sentinel | created by github.com/superfly/leadership.(*Candidate).RunForElection

2022-10-21T06:15:51.694 app[6b06793b] maa [info] sentinel | /go/pkg/mod/github.com/superfly/leadership@v0.2.1/candidate.go:60 +0xc5

2022-10-21T06:15:51.696 app[6b06793b] maa [info] sentinel | exit status 2

2022-10-21T06:15:51.696 app[6b06793b] maa [info] sentinel | restarting in 3s [attempt 1]

2022-10-21T06:15:54.696 app[6b06793b] maa [info] sentinel | Running...

2022-10-21T06:24:03.689 app[a00f4f14] maa [info] sentinel | 2022-10-21T06:24:03.689Z ERROR cmd/sentinel.go:1852 error retrieving cluster data {"error": "Unexpected response code: 500"}

2022-10-21T06:24:03.697 app[6b06793b] maa [info] sentinel | 2022-10-21T06:24:03.697Z ERROR cmd/sentinel.go:1889 cannot update sentinel info {"error": "Unexpected response code: 500 (rpc error making call: node is not the leader)"}

2022-10-21T06:24:10.786 app[6b06793b] maa [info] keeper | 2022-10-21T06:24:10.786Z ERROR cmd/keeper.go:870 failed to update keeper info {"error": "Unexpected response code: 500 (No cluster leader)"}

2022-10-21T06:40:02.704 app[a00f4f14] maa [info] keeper | 2022-10-21T06:40:02.703Z ERROR cmd/keeper.go:870 failed to update keeper info {"error": "Unexpected response code: 500 (leadership lost while committing log)"}

2022-10-21T06:40:03.187 app[6b06793b] maa [info] keeper | 2022-10-21T06:40:03.186Z ERROR cmd/keeper.go:1041 error retrieving cluster data {"error": "Unexpected response code: 500"}

2022-10-21T06:40:03.231 app[6b06793b] maa [info] sentinel | 2022-10-21T06:40:03.229Z ERROR cmd/sentinel.go:1889 cannot update sentinel info {"error": "Unexpected response code: 500 (rpc error making call: leadership lost while committing log)"}

2022-10-21T06:40:03.310 app[6b06793b] maa [info] keeper | 2022-10-21T06:40:03.310Z ERROR cmd/keeper.go:870 failed to update keeper info {"error": "Unexpected response code: 500 (rpc error making call: leadership lost while committing log)"}

2022-10-21T07:01:23.312 app[6b06793b] maa [info] keeper | 2022-10-21T07:01:23.312Z ERROR cmd/keeper.go:1041 error retrieving cluster data {"error": "Unexpected response code: 500"}

2022-10-21T07:01:23.358 app[a00f4f14] maa [info] sentinel | 2022-10-21T07:01:23.358Z ERROR cmd/sentinel.go:1852 error retrieving cluster data {"error": "Unexpected response code: 500"}

2022-10-21T07:01:23.392 app[a00f4f14] maa [info] keeper | 2022-10-21T07:01:23.392Z ERROR cmd/keeper.go:870 failed to update keeper info {"error": "Unexpected response code: 500 (rpc error making call: node is not the leader)"}

2022-10-21T07:23:27.642 app[6b06793b] maa [info] keeper | 2022-10-21T07:23:27.642Z ERROR cmd/keeper.go:870 failed to update keeper info {"error": "Unexpected response code: 500 (rpc error making call: leadership lost while committing log)"}

2022-10-21T07:23:27.756 app[6b06793b] maa [info] sentinel | 2022-10-21T07:23:27.756Z ERROR cmd/sentinel.go:1947 error saving clusterdata {"error": "Unexpected response code: 500 (rpc error making call: leadership lost while committing log)"}

2022-10-21T07:53:53.110 app[6b06793b] maa [info] keeper | 2022-10-21T07:53:53.110Z ERROR cmd/keeper.go:870 failed to update keeper info {"error": "Unexpected response code: 500 (leadership lost while committing log)"}

2022-10-21T07:53:53.513 app[a00f4f14] maa [info] keeper | 2022-10-21T07:53:53.513Z ERROR cmd/keeper.go:1041 error retrieving cluster data {"error": "Unexpected response code: 500"}

2022-10-21T07:54:01.072 app[a00f4f14] maa [info] sentinel | 2022-10-21T07:54:01.071Z ERROR cmd/sentinel.go:102 election loop error {"error": "failed to read lock: Unexpected response code: 500"}

2022-10-21T08:43:01.102 app[6b06793b] maa [info] keeper | 2022-10-21T08:43:01.101Z ERROR cmd/keeper.go:1041 error retrieving cluster data {"error": "Unexpected response code: 500"}

2022-10-21T08:43:01.266 app[a00f4f14] maa [info] keeper | 2022-10-21T08:43:01.265Z ERROR cmd/keeper.go:1041 error retrieving cluster data {"error": "Unexpected response code: 500"}

2022-10-21T08:43:01.729 app[a00f4f14] maa [info] sentinel | 2022-10-21T08:43:01.729Z ERROR cmd/sentinel.go:1889 cannot update sentinel info {"error": "Unexpected response code: 500 (leadership lost while committing log)"}

2022-10-21T08:43:07.805 app[a00f4f14] maa [info] sentinel | 2022-10-21T08:43:07.805Z ERROR cmd/sentinel.go:102 election loop error {"error": "failed to read lock: Unexpected response code: 500"}

2022-10-21T08:43:09.009 app[6b06793b] maa [info] sentinel | 2022-10-21T08:43:09.008Z ERROR cmd/sentinel.go:102 election loop error {"error": "Unexpected response code: 500 (No cluster leader)"}

2022-10-21T08:43:33.066 app[a00f4f14] maa [info] sentinel | 2022-10-21T08:43:33.066Z ERROR cmd/sentinel.go:1947 error saving clusterdata {"error": "Unexpected response code: 500 (rpc error making call: leadership lost while committing log)"}

2022-10-21T08:43:33.198 app[a00f4f14] maa [info] keeper | 2022-10-21T08:43:33.197Z ERROR cmd/keeper.go:1041 error retrieving cluster data {"error": "Unexpected response code: 500"}

2022-10-21T08:43:33.378 app[6b06793b] maa [info] keeper | 2022-10-21T08:43:33.377Z ERROR cmd/keeper.go:1041 error retrieving cluster data {"error": "Unexpected response code: 500"}

2022-10-21T08:43:33.398 app[6b06793b] maa [info] sentinel | 2022-10-21T08:43:33.397Z ERROR cmd/sentinel.go:1852 error retrieving cluster data {"error": "Unexpected response code: 500"}

2022-10-21T08:43:34.710 app[a00f4f14] maa [info] sentinel | panic: close of closed channel

2022-10-21T08:43:34.710 app[a00f4f14] maa [info] sentinel |

2022-10-21T08:43:34.710 app[a00f4f14] maa [info] sentinel | goroutine 6819 [running]:

2022-10-21T08:43:34.710 app[a00f4f14] maa [info] sentinel | github.com/superfly/leadership.(*Candidate).initLock(0xc0000ae000)

2022-10-21T08:43:34.710 app[a00f4f14] maa [info] sentinel | /go/pkg/mod/github.com/superfly/leadership@v0.2.1/candidate.go:98 +0x2e

2022-10-21T08:43:34.710 app[a00f4f14] maa [info] sentinel | github.com/superfly/leadership.(*Candidate).campaign(0xc0000ae000)

2022-10-21T08:43:34.710 app[a00f4f14] maa [info] sentinel | /go/pkg/mod/github.com/superfly/leadership@v0.2.1/candidate.go:124 +0xc6

2022-10-21T08:43:34.710 app[a00f4f14] maa [info] sentinel | created by github.com/superfly/leadership.(*Candidate).RunForElection

2022-10-21T08:43:34.710 app[a00f4f14] maa [info] sentinel | /go/pkg/mod/github.com/superfly/leadership@v0.2.1/candidate.go:60 +0xc5

2022-10-21T08:43:34.711 app[a00f4f14] maa [info] sentinel | exit status 2

2022-10-21T08:43:34.711 app[a00f4f14] maa [info] sentinel | restarting in 3s [attempt 42]

2022-10-21T08:43:37.712 app[a00f4f14] maa [info] sentinel | Running...

2022-10-21T08:43:42.790 app[6b06793b] maa [info] sentinel | panic: close of closed channel

2022-10-21T08:43:42.790 app[6b06793b] maa [info] sentinel |

2022-10-21T08:43:42.790 app[6b06793b] maa [info] sentinel | goroutine 5778 [running]:

2022-10-21T08:43:42.790 app[6b06793b] maa [info] sentinel | github.com/superfly/leadership.(*Candidate).initLock(0xc0000ae000)

2022-10-21T08:43:42.790 app[6b06793b] maa [info] sentinel | /go/pkg/mod/github.com/superfly/leadership@v0.2.1/candidate.go:98 +0x2e

2022-10-21T08:43:42.790 app[6b06793b] maa [info] sentinel | github.com/superfly/leadership.(*Candidate).campaign(0xc0000ae000)

2022-10-21T08:43:42.790 app[6b06793b] maa [info] sentinel | /go/pkg/mod/github.com/superfly/leadership@v0.2.1/candidate.go:124 +0xc6

2022-10-21T08:43:42.790 app[6b06793b] maa [info] sentinel | created by github.com/superfly/leadership.(*Candidate).RunForElection

2022-10-21T08:43:42.790 app[6b06793b] maa [info] sentinel | /go/pkg/mod/github.com/superfly/leadership@v0.2.1/candidate.go:60 +0xc5

2022-10-21T08:43:42.792 app[6b06793b] maa [info] sentinel | exit status 2

2022-10-21T08:43:42.792 app[6b06793b] maa [info] sentinel | restarting in 3s [attempt 2]

2022-10-21T08:43:45.793 app[6b06793b] maa [info] sentinel | Running...

2022-10-21T09:48:26.669 app[a00f4f14] maa [info] sentinel | 2022-10-21T09:48:26.669Z WARN cmd/sentinel.go:276 no keeper info available {"db": "250e7205", "keeper": "14bf234272"}

2022-10-21T09:50:50.280 app[a00f4f14] maa [info] keeper | 2022-10-21T09:50:50.279Z ERROR cmd/keeper.go:870 failed to update keeper info {"error": "Unexpected response code: 500 (leadership lost while committing log)"}

2022-10-21T09:50:51.104 app[a00f4f14] maa [info] sentinel | 2022-10-21T09:50:51.104Z ERROR cmd/sentinel.go:1852 error retrieving cluster data {"error": "Unexpected response code: 500"}

2022-10-21T09:51:30.572 app[a00f4f14] maa [info] sentinel | 2022-10-21T09:51:30.572Z ERROR cmd/sentinel.go:1889 cannot update sentinel info {"error": "Unexpected response code: 500 (rpc error making call: leadership lost while committing log)"}

2022-10-21T09:51:37.486 app[6b06793b] maa [info] sentinel | 2022-10-21T09:51:37.485Z ERROR cmd/sentinel.go:102 election loop error {"error": "Unexpected response code: 500 (No cluster leader)"}

2022-10-21T09:53:34.554 app[6b06793b] maa [info] keeper | 2022-10-21T09:53:34.554Z ERROR cmd/keeper.go:870 failed to update keeper info {"error": "Unexpected response code: 500 (leadership lost while committing log)"}

2022-10-21T09:53:34.563 app[a00f4f14] maa [info] keeper | 2022-10-21T09:53:34.559Z ERROR cmd/keeper.go:870 failed to update keeper info {"error": "Unexpected response code: 500 (leadership lost while committing log)"}

2022-10-21T09:53:34.615 app[a00f4f14] maa [info] sentinel | 2022-10-21T09:53:34.615Z ERROR cmd/sentinel.go:1852 error retrieving cluster data {"error": "Unexpected response code: 500"}

2022-10-21T09:53:34.728 app[6b06793b] maa [info] sentinel | 2022-10-21T09:53:34.728Z ERROR cmd/sentinel.go:1852 error retrieving cluster data {"error": "Unexpected response code: 500"}

2022-10-21T09:53:34.741 app[6b06793b] maa [info] keeper | 2022-10-21T09:53:34.740Z ERROR cmd/keeper.go:1041 error retrieving cluster data {"error": "Unexpected response code: 500"}

2022-10-21T09:53:42.141 app[a00f4f14] maa [info] sentinel | 2022-10-21T09:53:42.141Z ERROR cmd/sentinel.go:102 election loop error {"error": "Unexpected response code: 500 (No cluster leader)"}

2022-10-21T09:54:01.543 app[6b06793b] maa [info] sentinel | panic: close of closed channel

2022-10-21T09:54:01.543 app[6b06793b] maa [info] sentinel |

2022-10-21T09:54:01.543 app[6b06793b] maa [info] sentinel | goroutine 3244 [running]:

2022-10-21T09:54:01.543 app[6b06793b] maa [info] sentinel | github.com/superfly/leadership.(*Candidate).initLock(0xc0000ae000)

2022-10-21T09:54:01.543 app[6b06793b] maa [info] sentinel | /go/pkg/mod/github.com/superfly/leadership@v0.2.1/candidate.go:98 +0x2e

2022-10-21T09:54:01.543 app[6b06793b] maa [info] sentinel | github.com/superfly/leadership.(*Candidate).campaign(0xc0000ae000)

2022-10-21T09:54:01.543 app[6b06793b] maa [info] sentinel | /go/pkg/mod/github.com/superfly/leadership@v0.2.1/candidate.go:124 +0xc6

2022-10-21T09:54:01.543 app[6b06793b] maa [info] sentinel | created by github.com/superfly/leadership.(*Candidate).RunForElection

2022-10-21T09:54:01.543 app[6b06793b] maa [info] sentinel | /go/pkg/mod/github.com/superfly/leadership@v0.2.1/candidate.go:60 +0xc5

2022-10-21T09:54:01.544 app[6b06793b] maa [info] sentinel | exit status 2

2022-10-21T09:54:01.544 app[6b06793b] maa [info] sentinel | restarting in 3s [attempt 3]

2022-10-21T09:54:04.547 app[6b06793b] maa [info] sentinel | Running...

2022-10-21T09:54:05.085 app[a00f4f14] maa [info] sentinel | panic: close of closed channel

2022-10-21T09:54:05.085 app[a00f4f14] maa [info] sentinel |

2022-10-21T09:54:05.085 app[a00f4f14] maa [info] sentinel | goroutine 3344 [running]:

2022-10-21T09:54:05.085 app[a00f4f14] maa [info] sentinel | github.com/superfly/leadership.(*Candidate).initLock(0xc0000afe30)

2022-10-21T09:54:05.085 app[a00f4f14] maa [info] sentinel | /go/pkg/mod/github.com/superfly/leadership@v0.2.1/candidate.go:98 +0x2e

2022-10-21T09:54:05.085 app[a00f4f14] maa [info] sentinel | github.com/superfly/leadership.(*Candidate).campaign(0xc0000afe30)

2022-10-21T09:54:05.085 app[a00f4f14] maa [info] sentinel | /go/pkg/mod/github.com/superfly/leadership@v0.2.1/candidate.go:124 +0xc6

2022-10-21T09:54:05.085 app[a00f4f14] maa [info] sentinel | created by github.com/superfly/leadership.(*Candidate).RunForElection

2022-10-21T09:54:05.085 app[a00f4f14] maa [info] sentinel | /go/pkg/mod/github.com/superfly/leadership@v0.2.1/candidate.go:60 +0xc5

2022-10-21T09:54:05.086 app[a00f4f14] maa [info] sentinel | exit status 2

2022-10-21T09:54:05.086 app[a00f4f14] maa [info] sentinel | restarting in 3s [attempt 43]

2022-10-21T09:54:08.086 app[a00f4f14] maa [info] sentinel | Running...

2022-10-21T10:00:26.075 app[a00f4f14] maa [info] sentinel | 2022-10-21T10:00:26.075Z ERROR cmd/sentinel.go:1889 cannot update sentinel info {"error": "Unexpected response code: 500 (rpc error making call: leadership lost while committing log)"}

2022-10-21T10:00:26.142 app[a00f4f14] maa [info] keeper | 2022-10-21T10:00:26.142Z ERROR cmd/keeper.go:1041 error retrieving cluster data {"error": "Unexpected response code: 500"}

2022-10-21T10:00:26.310 app[6b06793b] maa [info] sentinel | 2022-10-21T10:00:26.309Z ERROR cmd/sentinel.go:1889 cannot update sentinel info {"error": "Unexpected response code: 500 (rpc error making call: leadership lost while committing log)"}

2022-10-21T10:58:37.065 app[6b06793b] maa [info] sentinel | 2022-10-21T10:58:37.065Z ERROR cmd/sentinel.go:1889 cannot update sentinel info {"error": "Unexpected response code: 500 (rpc error making call: leadership lost while committing log)"}

2022-10-21T10:58:37.109 app[a00f4f14] maa [info] sentinel | 2022-10-21T10:58:37.108Z ERROR cmd/sentinel.go:1852 error retrieving cluster data {"error": "Unexpected response code: 500"}

2022-10-21T10:58:37.131 app[a00f4f14] maa [info] keeper | 2022-10-21T10:58:37.123Z ERROR cmd/keeper.go:1041 error retrieving cluster data {"error": "Unexpected response code: 500"}

2022-10-21T10:58:44.424 app[6b06793b] maa [info] keeper | 2022-10-21T10:58:44.422Z ERROR cmd/keeper.go:870 failed to update keeper info {"error": "Unexpected response code: 500 (No cluster leader)"}

2022-10-21T11:00:02.267 app[6b06793b] maa [info] sentinel | 2022-10-21T11:00:02.266Z WARN cmd/sentinel.go:276 no keeper info available {"db": "59b64c9d", "keeper": "1449234282"}

@kurt I am not a Fly user, but a fellow Stolon user.

Have you had a chance to read Stolon’s source code?

I found that in many cases, after "Stopping listening", Stolon will never listen again; there's a TODO in the code that hints at this, and I believe I found exactly how it hangs:

See this post and the follow-up posts.

Hey there, we're actually no longer using the Stolon proxy in our implementation. We ended up replacing it with HAProxy.
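For readers curious what that swap can look like in broad strokes: a hypothetical minimal HAProxy config that TCP-proxies Postgres and uses an HTTP health check to route to the current primary. All addresses, ports, and the `/primary` endpoint are illustrative assumptions, not Fly's actual configuration:

```haproxy
defaults
    mode tcp
    timeout connect 5s
    timeout client  30m
    timeout server  30m

frontend pg_in
    bind *:5432
    default_backend pg_primary

backend pg_primary
    # Health-check each node over HTTP (while proxying raw TCP)
    # so traffic only goes to whichever node reports itself primary.
    option httpchk GET /primary
    server pg1 10.0.0.1:5433 check port 8080
    server pg2 10.0.0.2:5433 check port 8080 backup
```

The appeal over the Stolon proxy is that HAProxy's health-check loop is independent of the cluster-store path that was failing in the logs above, so a Consul hiccup doesn't take the proxy down with it.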