We have a Nomad-based PG cluster on Fly that has somehow gotten into a bad state.
- The Leader is functioning normally
- The replica instance is failing two health checks (but is still queryable)
- Scaling up new replicas results in 2/3 failed health checks as well
The replicas constantly output the following logs and never complete health checks:
- checking stolon status
- Error opening connection to database
Here is a screenshot of the logs and app status