Database erroring way more often than usual?

Hello, our database’s leader has been spending way too much time in a critical state.
It is being restarted more than 10 times a day, which is more than in the whole of last month.
Some of the blame is on us, since we have to move the events thing out of the db asap, but it was stable not long ago.
So we need to know: what has changed?

Also, this is the error message from the logs:

HTTP GET 500 Internal Server Error Output: "[✗] system spent 5.5 of the last 10 seconds waiting for cpu
[✓] 8.93 GB (91.4%!)(MISSING) free space on /data/
[✓] load averages: 0.15 0.23 0.04
[✓] memory: 0.0s waiting over the last 60s
[✓] io: 0.0s waiting over the last 60s"


That many restarts probably means the DB is overloaded. Have you upgraded RAM or VM size?

  1. Check current RAM with fly vm scale
  2. Run fly vm status <id> for the VM that’s restarting to see a few more details on the restarts
  3. fly logs -i <id> will show you the most recent logs; normally when it restarts there will be some errors in the logs

Thanks, but it doesn’t make sense: we don’t have a lot of traffic and we have upgraded too. Here is our current RAM:

VM Resources for database
        VM Size: dedicated-cpu-1x
      VM Memory: 2 GB
          Count: 3

Any help?

It looks like it was restarting because that CPU time health check kept failing. We just deployed a new CPU health check to your database; it’s a little less aggressive (it looks at the last minute instead of the last 10s). If that fixes things, I think you’re good. If you see more CPU wait time check failures, it likely means you need to upgrade your VM to dedicated-cpu-2x.
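
To illustrate why a one-minute window is less aggressive than a 10-second one, here is a rough sketch. It assumes the check is derived from Linux pressure stall information (the format of /proc/pressure/cpu), which this thread does not confirm. The sample numbers, the function names, and the 50% threshold are all made up for illustration and are not the real check's values.

```python
# Hypothetical sketch: compare a 10 s and a 60 s CPU-wait window.
# Assumes Linux PSI text as input; the real health check may work differently.

# Made-up sample in the /proc/pressure/cpu format: a short burst of CPU
# contention shows up strongly in avg10 but is diluted in avg60.
SAMPLE = """some avg10=55.00 avg60=12.00 avg300=3.00 total=123456789
full avg10=0.00 avg60=0.00 avg300=0.00 total=0"""


def parse_pressure(text):
    """Parse PSI text into {"some": {"avg10": ..., ...}, "full": {...}}."""
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        result[kind] = {k: float(v) for k, v in (f.split("=") for f in fields)}
    return result


def seconds_waiting(pressure, window):
    """Turn the 'some' percentage for a window (10, 60 or 300) into seconds."""
    percent = pressure["some"][f"avg{window}"]
    return percent * window / 100.0


def check_passes(pressure, window, max_fraction):
    """Pass if the stalled share of the window is at most max_fraction."""
    return pressure["some"][f"avg{window}"] / 100.0 <= max_fraction


if __name__ == "__main__":
    p = parse_pressure(SAMPLE)
    for window in (10, 60):
        waited = seconds_waiting(p, window)
        ok = check_passes(p, window, max_fraction=0.5)  # hypothetical threshold
        print(f"last {window}s: waited {waited:.1f}s, passes={ok}")
```

With these sample numbers the 10-second window reports 5.5 of 10 seconds waiting and fails, while the 60-second window smooths the same burst out and passes. On a real VM you could feed it the contents of /proc/pressure/cpu, if the kernel exposes that file.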