CPU usage went up for no reason after a problem with a deploy

Hi, on the 11th we deployed a new version of our app. It was nothing special: mostly design changes, and nothing that touches the database or anything else that could cause the problem I'm about to describe.

But while the deployment was running, we got errors from Fly saying that the server returned 500, among other things. The Fly dashboard said that the server was running, but we couldn't reach it. When I checked monitoring on the dashboard, the console for the server said “could not find a good candidate within 90 attempts at load balancing.” When I tried restarting or stopping it from the terminal, it said that it couldn't find the machine, and after a while, when we tried to deploy again, it said that it couldn't find a lease.

The only way I was finally able to fix it was by scaling our app to 2 machines; then all of a sudden our first machine worked again. I then removed the 2nd machine, but I noticed right away that our site was slower. When we check Grafana, we can see that after the release, both load average and CPU usage have gone up on our database (which we made no changes to) and on our app server as well.

Does anyone have an idea what could have happened, and what we can try to fix it? We can clearly see in the attached images that there was about an hour of downtime on the 11th, and when it was up again, the CPU usage and load average went up. We have of course tried restarting everything, without any luck.

Our database:


And our app machine:


It might be related to this problem as well: New build - machine stuck
