Postgres DBs throwing alerts

Our Postgres DBs in two different apps keep flapping between alerting and OK. It started in both apps at the same time. The errors are:

HTTP GET http://<IPv4-ADDRESS>:5500/flycheck/pg: 500 Internal Server Error Output: "[✗] proxy: context deadline exceeded"

This is usually followed within a few seconds by a success:

HTTP GET http://<IPv4-ADDRESS>:5500/flycheck/pg: 200 OK Output: "[✓] replication: currently leader
[✓] proxy check: [<IPv6-ADDRESS>]:5432 connected
[✓] connections: 15 used, 3 reserved, 300 max"

It’s been happening every few minutes for about the last hour and a half.
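For anyone who wants to track which check is flapping, here is a minimal Python sketch that parses output in the format shown above. The function names `parse_check_output` and `failing_checks` are made up for illustration, and the line format is inferred only from the two samples in this post, so treat it as an assumption rather than a documented format.

```python
import re

# One check line, e.g. "[✓] replication: currently leader".
# Format inferred from the sample output above (assumption, not a spec).
CHECK_LINE = re.compile(r"^\[(?P<mark>[✓✗])\]\s*(?P<name>[^:]+):\s*(?P<detail>.*)$")


def parse_check_output(output):
    """Return one dict per recognised check line in the output text."""
    results = []
    for raw in output.splitlines():
        # Drop surrounding whitespace and the quotes that wrap the whole output.
        line = raw.strip().strip('"')
        match = CHECK_LINE.match(line)
        if match:
            results.append({
                "name": match["name"].strip(),
                "ok": match["mark"] == "✓",
                "detail": match["detail"].strip(),
            })
    return results


def failing_checks(output):
    """Return the names of checks marked with ✗."""
    return [r["name"] for r in parse_check_output(output) if not r["ok"]]
```

Feeding it the failing sample reports `proxy` as the failing check, while the passing sample reports none.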

Is there something I need to do to remedy this, or is it an issue on Fly’s end?

I’ve been seeing regular Postgres errors for at least the past hour as well. TCP connections get force-closed, then I get connection refused for a few seconds, then it comes back, as if the server is cycling regularly.
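While outages are this short, a client-side retry with exponential backoff can ride them out. Below is a rough Python sketch, not a fix for the underlying problem. `retry_with_backoff` and its parameters are hypothetical names, and `connect` stands in for whatever call opens your DB connection. Most drivers raise their own exception type rather than `OSError`, so pass that in via `retry_on`.

```python
import random
import time


def retry_with_backoff(connect, attempts=5, base_delay=0.5, max_delay=8.0,
                       retry_on=(OSError,), sleep=time.sleep):
    """Call connect() until it succeeds or the attempts run out.

    The delay doubles after each failure, is capped at max_delay, and gets
    random jitter so that many clients do not reconnect in lockstep.
    """
    for attempt in range(attempts):
        try:
            return connect()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
```

With the defaults, the waits before jitter are 0.5s, 1s, 2s, and 4s, which covers the few seconds of connection refused described above.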

These both look like rate-limiting issues when connecting to the shared Consul service. What regions are your DBs running in? We’re investigating.

yyz here

ord for me

See if it’s better now? We increased the rate limits on Consul. It seems like they were near the cusp, but Consul is otherwise super happy.

I saw a round of errors about 15 minutes ago. I’ll keep an eye on it and let you know.

That fits. Here’s what the response graph looks like for the Consul your DBs are using:

The burst of errors there was from when we updated the config; traffic then flattened back out (traffic to this should be really flat).

Seems to be all good here now. Thank you.

Yep, been stable for the last hour here as well. Thanks.

This seems to be back, though not as severe. I’ve seen 15+ instances of the error so far today on two different apps in ORD. @kurt is this the same issue, or something new?