Machine silently stuck in "failed" state

butaca · September 19, 2023, 4:22pm

I have an app with one machine that responds to some webhooks, with scale-to-zero enabled. Very light stuff; the app uses little memory. Suddenly it stopped working, and when I checked the logs I saw that it had failed to start because of “insufficient memory”. After a few attempts the machine got stuck in a “failed” state. I fixed the issue by destroying the machine and creating a new one (fly scale count 0 and then fly scale count 1).

The thing is, I didn’t change anything in the app. It is the same deploy that was working properly; it stopped working and then started working again after I recreated the machine, with no other changes.

I know that I should have at least 2 machines per app but regardless, I have a few questions/concerns:

Why did this happen? It is the same deploy, with the same memory consumption and the same config, that was working properly and suddenly got stuck in a “failed” state.
Can I be notified somehow if a machine is stuck in a failed state, without constantly checking the monitoring page?
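
For the second question, one possible approach is a small script run on a schedule from outside the app. This is only a rough sketch, not an official Fly.io recipe. It assumes the Machines API lists an app's machines at https://api.machines.dev/v1/apps/<app>/machines with a bearer token, and that each machine object carries "id" and "state" fields. The environment variable names FLY_APP and FLY_API_TOKEN, and the helper names, are made up for illustration.

```python
import json
import os
import urllib.request

# Assumed base URL of the Fly.io Machines API.
API_BASE = "https://api.machines.dev/v1"


def failed_machines(machines, bad_states=("failed",)):
    """Return (id, state) pairs for machines whose state is in bad_states."""
    return [
        (m.get("id"), m.get("state"))
        for m in machines
        if m.get("state") in bad_states
    ]


def list_machines(app_name, token):
    """Fetch the machine list for an app (assumed endpoint and auth scheme)."""
    req = urllib.request.Request(
        f"{API_BASE}/apps/{app_name}/machines",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def main():
    # Hypothetical environment variable names; adjust to your setup.
    app = os.environ["FLY_APP"]
    token = os.environ["FLY_API_TOKEN"]
    stuck = failed_machines(list_machines(app, token))
    for machine_id, state in stuck:
        print(f"machine {machine_id} is in state {state!r}")
    # A non-zero exit code lets cron or a CI job raise the alert.
    return 1 if stuck else 0
```

Calling main() from a cron job or a scheduled CI workflow and alerting on a non-zero return value would avoid having to watch the monitoring page by hand.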

Thanks!

system · September 26, 2023, 4:22pm

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.
