Machine not found

Hello, I sometimes get this error when running flyctl deploy:

release command failed - aborting deployment. error finding the release_command machine
17815dedbexxxx exit event: error getting machine 17815dedbxxxx from api: failed to get VM
17815dedbexxxx: Machine not found (Request ID: 01HTSYMS56QPJ2XS2DTDV8XXXX-iad)

It is not always reproducible. Do you have any idea what causes it? I haven’t changed anything in my code or config. Thanks!

I just got something similar. I’m setting up my GitLab CI/CD pipeline, and flyctl deploy failed with the same message.

Failed: error finding the release_command machine 80e441f66e0228 exit event: error getting machine 80e441f66e0228 from api: failed to get VM 80e441f66e0228: Machine not found
Checking DNS configuration for my-app.fly.dev
Error: release command failed - aborting deployment. error finding the release_command machine 80e441f66e0228 exit event: error getting machine 80e441f66e0228 from api: failed to get VM 80e441f66e0228: Machine not found (Request ID: 01HTYZ722RVC4PF3T9ZD73HXEF-iad)

From the corresponding live logs in Fly:

2024-04-08T13:43:57.589 runner[80e441f66e0228] ord [info] Pulling container image registry.fly.io/my-app:deployment-01HTYZ22MKTASR0EMXH0Z0X14S
2024-04-08T13:44:01.933 runner[80e441f66e0228] ord [info] Successfully prepared image registry.fly.io/my-app:deployment-01HTYZ22MKTASR0EMXH0Z0X14S (4.343501929s)
2024-04-08T13:44:02.812 runner[80e441f66e0228] ord [info] Configuring firecracker
2024-04-08T13:44:02.965 app[80e441f66e0228] ord [info] [ 0.038294] PCI: Fatal: No config space access function found
2024-04-08T13:44:03.189 app[80e441f66e0228] ord [info] INFO Starting init (commit: 5b8fb02)...
2024-04-08T13:44:03.212 app[80e441f66e0228] ord [info] INFO Preparing to run: `/app/bin/migrate` as nobody
2024-04-08T13:44:03.222 app[80e441f66e0228] ord [info] INFO [fly api proxy] listening at /.fly/api
2024-04-08T13:44:03.230 app[80e441f66e0228] ord [info] 2024/04/08 13:44:03 listening on [fdaa:4:f22:a7b:110:4097:3eb3:2]:22 (DNS: [fdaa::3]:53)
2024-04-08T13:44:03.234 runner[80e441f66e0228] ord [info] Machine created and started in 5.648s
2024-04-08T13:44:05.338 app[80e441f66e0228] ord [info] 13:44:05.336 [info] Migrations already up
2024-04-08T13:44:06.224 app[80e441f66e0228] ord [info] INFO Main child exited normally with code: 0
2024-04-08T13:44:06.225 app[80e441f66e0228] ord [info] WARN Reaped child process with pid: 356 and signal: SIGUSR1, core dumped? false
2024-04-08T13:44:06.238 app[80e441f66e0228] ord [info] INFO Starting clean up.
2024-04-08T13:44:06.247 app[80e441f66e0228] ord [info] [ 3.316265] reboot: Restarting system 

Not sure what’s wrong, but the “PCI: Fatal: No config space access function found” line doesn’t sound great to me.

Edit: fly deploy works on my local dev machine, just not on the CI runner. I’m new to Elixir/Phoenix and Fly, so I could have set something up wrong. My deploy job in GitLab is just this script:

apt-get update && apt-get install -y curl
curl -L https://fly.io/install.sh | sh
export FLYCTL_INSTALL="/root/.fly"
export PATH="$FLYCTL_INSTALL/bin:$PATH"
flyctl deploy

Hey y’all, about these release_command errors, does your fly.toml declare a release_command under [deploy]? That error can mean the file associated with the configured release command is missing or doesn’t have executable permissions, afaik. The error message probably could be more descriptive here :sweat_smile:

@astoria_brian, that “PCI: Fatal:” message is disconcerting but is apparently benign. It happens because the VM being used here doesn’t have a legacy PCI bus. See the thread “Getting PCI: Fatal: No config space access function found” (post #3 by catflydotio).

Are those logs you shared what you get when running fly deploy on your local machine? That looks to me like the release command executing and exiting successfully, although I’m not entirely sure.

Happening to me as well, intermittently failing in CI.

:heavy_multiplication_x: [1/3] Failed to create canary machine: smoke checks for 286512dce70998 failed: error getting machine 286512dce70998 from api: failed to get VM 286512dce70998: Machine not found

A local fly deploy hangs on “Checking DNS configuration” for a while but still completes. I also have a release_command, and the logs indicate that it ran successfully.

I am facing exactly the same issue with canary deployments at the moment. Did you manage to find a solution?

Yes, my release command in my fly.toml is release_command = '/app/bin/migrate'

fly ssh console --pty -C /bin/bash
root@148e442b290258:/app# ls -l /app/bin/migrate
-rwxrwxrwx 1 nobody root 98 Apr  8 13:43 /app/bin/migrate

It appears to be there and executable. The logs I posted above correspond to the failed deploy from my GitLab CI runner. You’re right, it does look like it worked based on those logs.
The lines “[info] Migrations already up” and “INFO Main child exited normally with code: 0” make it seem like the script ran fine (there were no migrations for that commit). But the CI runner still reported the “Machine not found” error.

I haven’t made any progress on this yet. For now, I’m just making the job retry until the deploy succeeds.
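
In case it helps anyone else, the retry workaround could look roughly like the sketch below. The `retry` helper, the attempt count, and the delay are my own illustrative choices, not anything provided by flyctl, and this only papers over the intermittent failure. It also assumes the release command is safe to run more than once (mine just reports “Migrations already up”).

```shell
#!/bin/sh
# Hypothetical helper: run a command up to MAX_ATTEMPTS times,
# sleeping DELAY_SECONDS between failed attempts.
# Usage: retry MAX_ATTEMPTS DELAY_SECONDS command [args...]
retry() {
  max_attempts=$1
  delay_seconds=$2
  shift 2
  attempt=1
  while true; do
    # Run the command; stop as soon as it succeeds.
    if "$@"; then
      return 0
    fi
    # Give up once the attempt budget is exhausted.
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Command failed after $attempt attempts: $*" >&2
      return 1
    fi
    echo "Attempt $attempt failed; retrying in ${delay_seconds}s..." >&2
    sleep "$delay_seconds"
    attempt=$((attempt + 1))
  done
}

# In the CI script, the bare `flyctl deploy` line would become, for example:
#   retry 3 15 flyctl deploy
```

Capping the attempts means a real failure (for example, a broken migration) still fails the job instead of looping forever.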

Hi, any news on this issue? There seem to be quite a few people in the same situation. Thanks!

Hey @BrickInTheWall, it looks like this is, in fact, a bug in our platform. We’re pretty sure we’ve identified the underlying issue, and we’re working on a fix. I can’t make any promises with regard to timelines, but I’ll write back here with updates when I can.

Hello everyone. We’ve rolled out a mitigation for the issue while we look at a better long-term solution. The mitigation should make the issue less noticeable in the most common case.
