My Postgres app somehow got into a state with two leaders

I had our app set up with a leader and a replica in the same region, and this worked fine for many months. I’m not sure exactly when the issue started, but I believe it was around 7 a.m. Pacific time, when both machines decided that they were the leader. This put our app into a read-only state.

Here’s the app status at the time of failure:

ID              STATE   ROLE    REGION  CHECKS                          IMAGE                           CREATED                 UPDATED
9e784575a09683  started leader  sjc     3 total, 3 passing              flyio/postgres:14.6 (v0.0.41)   2023-02-16T18:02:53Z    2023-07-29T05:09:00Z
9185d56b445e83  started leader  sjc     3 total, 2 passing, 1 critical  flyio/postgres:14.6 (v0.0.41)   2023-02-16T18:02:28Z    2023-08-31T16:04:36Z

I have since stopped one of the machines, so there is now only one leader and no failover replica, and the database is working again. I’ve emailed support directly, but thought I’d post here as well in case anyone else runs into the same issue or has any idea how this happened in the first place.
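
In case it helps anyone who wants to catch this earlier, here is a rough sketch of a check that scans status output like the table above and flags more than one leader. It assumes the column layout shown in my paste (ID, then STATE, then ROLE, separated by whitespace); the function name `find_leaders` is just something I made up for illustration, and how you capture the status text is up to you.

```python
def find_leaders(status_text):
    """Return the machine IDs whose ROLE column reads 'leader'.

    Assumes whitespace-separated columns in the order ID, STATE, ROLE, ...
    as in the status table pasted in this post.
    """
    leaders = []
    for line in status_text.splitlines():
        fields = line.split()
        # Skip blank lines and the header row; data rows have at least 3 fields.
        if len(fields) < 3 or fields[0] == "ID":
            continue
        if fields[2] == "leader":
            leaders.append(fields[0])
    return leaders


# The status output from this post, copied verbatim.
STATUS = """\
ID              STATE   ROLE    REGION  CHECKS                          IMAGE                           CREATED                 UPDATED
9e784575a09683  started leader  sjc     3 total, 3 passing              flyio/postgres:14.6 (v0.0.41)   2023-02-16T18:02:53Z    2023-07-29T05:09:00Z
9185d56b445e83  started leader  sjc     3 total, 2 passing, 1 critical  flyio/postgres:14.6 (v0.0.41)   2023-02-16T18:02:28Z    2023-08-31T16:04:36Z
"""

leaders = find_leaders(STATUS)
if len(leaders) > 1:
    # More than one machine claiming the leader role is the state I hit.
    print("multiple leaders reported:", ", ".join(leaders))
```

This only tells you that two machines claim the leader role at the same time; it does not explain why it happened or fix anything.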
