[SOLVED] machines API returned an error: "machine still attempting to start"

I posted this one a while back: Machine still attempting to start. That topic is closed now, but I wanted to post an update where the community can find it, since I seem to have found a culprit (there may be other reasons why you would get this error message). I’ve also seen many others post about the same kind of issue, and those threads don’t seem to get resolved either.

For me the culprit was machine suspension. Setting fly.toml to have:

auto_stop_machines = "stop"

and not using suspend seemed to do the trick. My web apps that have volumes are now a bit slower to cold-boot, but at least they’re reliable. Going forward, I would keep suspension in the back of my head as a likely culprit for “unexplainable, hard to debug” issues :thinking:.

Or rather, I should say that I haven’t experienced the failure since, and it’s been several weeks now, whereas before the issue was almost guaranteed to happen within 24 hours. So I’m fairly confident this “solves” the issue.
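
For anyone wondering where that line goes, here is a minimal sketch of a fly.toml fragment. It assumes the app is served through an `[http_service]` section; the app name, the port, and the other values are placeholders for illustration, not taken from my actual config.

```toml
# fly.toml (fragment). "my-app" and the port are hypothetical placeholders.
app = "my-app"

[http_service]
  internal_port = 8080          # assumption: the port your app listens on
  auto_stop_machines = "stop"   # previously "suspend"; "stop" means a full cold boot on wake
  auto_start_machines = true    # let the proxy start the machine on incoming requests
  min_machines_running = 0      # allow all machines to stop when idle
```

The trade-off is the one described above: with `"stop"` the machine boots from scratch instead of resuming from a snapshot, so the first request after idle takes longer.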

Although I’m happy that I found a fix/workaround, I’m not happy with how I got there: I essentially guessed it :sweat_smile: (I was running out of ideas to try). I couldn’t find anything (that made sense to me) in the logs from the Machines API. It would have been very cool to see even a single line like “waking machine from suspension” in the logs, at least if suspension is a “different enough flow of execution” from non-suspension to warrant it. I know it’s supposed to be an invisible “behind the scenes” feature from the user’s perspective, but I think a line like that would help when debugging. I had also honestly forgotten about the setting, since it seemed to work a while back, or maybe it worked before I made some other change?

I know logs are a balancing act too: if you put too much stuff in them, I would get overloaded and blind as well. :sweat_smile:

I also want to add that I’m not by any means certain of anything… it just seems like suspension was relevant to my issue. There was a correlation.
