Hello, our database’s leader has been spending way too much time in a critical state.
It is being restarted more than 10 times a day, which is more than it was restarted in the whole of last month.
Now, some of the blame is on us, since we have to move the events thing out of the DB ASAP, but it was stable not long ago.
So we need to know what has changed.
Also, this is the error message being logged:
Description
HTTP GET http://172.19.0.82:5500/flycheck/vm: 500 Internal Server Error Output: "[✗] system spent 5.5 of the last 10 seconds waiting for cpu
[✓] 8.93 GB (91.4%!)(MISSING) free space on /data/
[✓] load averages: 0.15 0.23 0.04
[✓] memory: 0.0s waiting over the last 60s
[✓] io: 0.0s waiting over the last 60s"
Status
Critical
Entity
database-8330c4ac
Check
vm
It looks like it was restarting because that CPU time health check kept failing. We just deployed a new CPU health check to your database; it’s a little less aggressive (it looks at the last minute instead of the last 10 seconds). If that fixes things, I think you’re good. If you see more CPU wait time check failures, it likely means you need to upgrade your VM to dedicated-cpu-2x.
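To illustrate why a one-minute window is less trigger-happy than a 10-second one, here is a rough sketch. It assumes the check reads Linux pressure stall information (the format of `/proc/pressure/cpu`), which may not be how the real check is implemented. The function names, the 50% threshold, and the sample numbers are all made up for illustration.

```python
def parse_psi(text):
    """Parse PSI-style text into {kind: {field: float}}."""
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        result[kind] = {k: float(v) for k, v in (f.split("=", 1) for f in fields)}
    return result


def cpu_wait_check(psi_text, window, threshold_pct=50.0):
    """Return (ok, message) for the given averaging window in seconds.

    window must be 10, 60, or 300 (the averages PSI exposes).
    threshold_pct is a hypothetical cutoff, not the real check's value.
    """
    stalled_pct = parse_psi(psi_text)["some"][f"avg{window}"]
    seconds = stalled_pct / 100.0 * window
    ok = stalled_pct < threshold_pct
    message = f"system spent {seconds:.1f} of the last {window} seconds waiting for cpu"
    return ok, message


# Made-up sample: a short CPU spike that is high over 10s but modest over 60s.
# On a real Linux host you could read open("/proc/pressure/cpu").read() instead.
SAMPLE = "some avg10=55.00 avg60=12.00 avg300=3.00 total=123456\n"

if __name__ == "__main__":
    for window in (10, 60):
        ok, message = cpu_wait_check(SAMPLE, window)
        print("[✓]" if ok else "[✗]", message)
```

With this sample, the 10-second window fails while the 60-second window passes, because the same brief spike is averaged over a longer period. If the 60-second check still fails regularly, the CPU wait is sustained rather than bursty, which is the case where a bigger VM would help.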