Failed: error running release_command machine: failed to wait for VM {{machine_id}} in stopped state: machine not found

I’ve been experiencing the error below for the past 3 days or so, in about 90% of deploys:

> Waiting for 6839034f793768 to have state: stopped
✖ Failed: error running release_command machine: failed to wait for VM 6839034f793768 in stopped state: machine not found
Error: release command failed - aborting deployment. error running release_command machine: failed to wait for VM 6839034f793768 in stopped state: machine not found (Request ID: 01K42NNT6B728BYA0YNREHF5XX-lax) (Trace ID: 27ef2f8992d4e704111492f2c3d66fee)

I moved regions, and the deploy worked immediately, then the very next one failed with the same error.

  • Deleting the machine with fly machine destroy {{machine_id}} --force didn’t help
  • Deleting the app builder machine, then changing regions of the app builder in https://fly.io/dashboard/xxxxx-xxxxx/settings didn’t help
  • Switching to registry registry2.fly.io seems to have worked for some, but it’s not clear to me how to do that, and doesn’t seem like a permanent fix

Which setting will fix this once and for all?

4 Likes

Just posting here to say I’m getting the exact same issue and it’s causing havoc in my CI/CD :frowning:

1 Like

Same issue. I see it says “> Created release_command machine” and the command that it shows above it looks like it is the one to wait for it to be in the stopped state. I don’t see an api call for the release command if there is one. And then it says “Waiting for e286232f604938 to have state: stopped” which it fails to do after 5 seconds.

DEBUG → GET https://api.machines.dev/v1/apps/app/machines/e286232f604938/wait?instance_id=01K431M8PGRQCJDFN53HY6GW8K&state=stopped&timeout=60

Created release_command machine e286232f604938

Waiting for e286232f604938 to have state: stopped

DEBUG ← 404 https://api.machines.dev/v1/apps/app/machines/e286232f604938/wait?instance_id=01K431M8PGRQCJDFN53HY6GW8K&state=stopped&timeout=60 (5.03s)

DEBUG {
“error”: “machine not found”
}

DEBUG → POST GraphQL Playground

:multiply: Failed: error running release_command machine: failed to wait for VM e286232f604938 in stopped state: machine not found

1 Like

Same here, since Saturday most of our deploys (CI/CD) are failing in multiple apps with the same error.

1 Like

Is there any progress here? Every deployment fails due to this bug

1 Like

Same here

1 Like

Hi :waving_hand: , it looks like yall have been affected by an ongoing incremental rollout of an internal change. I’ve rolled back the change for now and your release_commands should start to work again. Proper fix will go out soon before the incremental rollout is started again.

2 Likes

I can confirm the CD is working again as usual. Thanks for the fix.

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.