Cannot deploy app since yesterday

I’m trying to deploy my app but it keeps hanging and then timing out with this error:

[1/4] Waiting for 91857597a96483 [worker] to have state: started
Error: timeout reached waiting for machine to started failed to wait for VM 91857597a96483 in started state: Get "https://api.machines.dev/v1/apps/*****/machines/91857597a96483/wait?instance_id=01H7AQYFWCY8MXT8115ZDS11V3&state=started&timeout=60": net/http: request canceled
note: you can change this timeout with the --wait-timeout flag

This problem started yesterday. On the monitoring logs screen on the dashboard it’s hanging at this line:

2023-08-08T14:23:51.869 runner[91857597a96483] sjc [info] Configuring firecracker

I’ve been getting the same all day - trying to deploy into ams.

I finally got deploying to work by destroying all of my standby machines, and making sure all of the auto-scaling machines had started.

But then trying to deploy again results in the same thing:

2023-08-08T17:27:51.569 runner[148e270a3562e8] ams [info] Successfully prepared image registry.fly.io/*****:deployment-01H7B30NSN0RY3XQMBA22TER13 (10.70389854s)
2023-08-08T17:27:51.940 runner[148e270a3562e8] ams [info] Configuring firecracker

Same issue, just started last night for us.

Updating existing machines in '****' with rolling strategy
  [1/4] Waiting for 683dde2a77e618 [worker] to have state: started
Error: timeout reached waiting for machine to started failed to wait for VM 683dde2a77e618 in started state: Get "https://api.machines.dev/v1/apps/****/machines/683dde2a77e618/wait?instance_id=01H7BBZKNQKS8NY24X0S177FVT&state=started&timeout=60": net/http: request canceled
note: you can change this timeout with the --wait-timeout flag

same here :frowning:

I resolved this by destroying the machine that got stuck, and then redeploying.

restarting the specific machine that froze also works
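
For anyone who wants to try the two workarounds mentioned above, here is a rough sketch of them as shell helpers. It assumes you run it from the app directory (so flyctl can find fly.toml) and that you pass the machine ID shown in the "Waiting for ..." line. The function names are made up for illustration, and the --force flag is an assumption that the stuck machine can't be stopped cleanly first. Other posters in this thread found that these steps did not help them, so treat this as something to try rather than a fix.

```shell
#!/bin/sh
# Sketch of the two workarounds reported above for a machine stuck during deploy.
# Both helpers take the stuck machine's ID as their only argument.
# The helper names are hypothetical; the fly subcommands are the stock ones.

# Workaround 1: restart only the machine that froze.
restart_stuck() {
    fly machine restart "$1"
}

# Workaround 2: destroy the stuck machine, then redeploy.
# --force is an assumption here: it removes the machine even if it is not stopped.
destroy_and_redeploy() {
    fly machine destroy "$1" --force && fly deploy
}
```

Usage would be something like restart_stuck 91857597a96483, using the ID from your own deploy output.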

London:
Error: failed to fetch an image or build from source: error rendering push status stream: received unexpected HTTP status: 500 Internal Server Error

I’m brand new. I think I’m experiencing the same issue. I was able to redeploy by destroying all machines, but the issue returns each time I try to deploy. It’s always the sidekiq one, with the message “Waiting for XXXX [sidekiq] to have state: started”.

I’ll see if I can figure out how to restart that one machine as someone suggested above.

I have the same issue as you. Mine also stops when deploying the sidekiq app.

Still happening for me this morning. Starting or restarting the “stuck” app doesn’t help (I have 8 machines across 4 processes, I’ve tried doing that with all of them.)

I’m going to sign up to the Starter plan and try their email support and see if the support is better there - this is my first month running something live with Fly, it’ll be a real shame to have to move elsewhere, but not being able to deploy for 24 hours is kinda a blocker …

Still can’t deploy. And no reply from support.

fly deploy --strategy immediate does seem to work - although be sure about what you’re deploying.

been religiously following this thread :slightly_smiling_face:

Hi all, we identified a bug in fly deploy introduced a couple days ago (starting in v0.1.72) causing deploys to incorrectly wait on an app’s standby machines to enter the started state (which will never happen, resulting in a timeout error), when they should properly be skipped.

Update: a new version of flyctl (v0.1.74) has been released with a fix for this issue.

If you can’t upgrade yet, you can manually downgrade to v0.1.71 (after disabling autoupdate first) to work around this issue:

flyctl settings autoupdate disable
curl -L https://fly.io/install.sh | sh -s -- v0.1.71

Sorry for the inconvenience! I hope this resolves your issues.

We just released flyctl v0.1.74 with a fix for this issue. Flyctl will update automatically, or run fly version upgrade to upgrade manually.
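
If you're not sure whether your installed flyctl falls in the affected range (v0.1.72 up to, but not including, v0.1.74, going by the posts in this thread), a small check along these lines might help. This is only a sketch: is_affected is a made-up helper name, it relies on sort -V (available in GNU coreutils and recent BSD sort), and you pass it the version number yourself rather than having it parse flyctl's output.

```shell
#!/bin/sh
# Succeeds (exit status 0) when the given flyctl version is in the range
# this thread reports as affected: v0.1.72 <= version < v0.1.74.
# Assumes a sort that supports -V (version sort).
is_affected() {
    v="${1#v}"    # strip a leading "v" if present
    # Smaller of (first affected version, v): equals 0.1.72 when v >= 0.1.72.
    min_with_first_bad=$(printf '%s\n%s\n' "0.1.72" "$v" | sort -V | head -n 1)
    # Smaller of (fixed version, v): equals v when v <= 0.1.74.
    min_with_fixed=$(printf '%s\n%s\n' "0.1.74" "$v" | sort -V | head -n 1)
    [ "$min_with_first_bad" = "0.1.72" ] \
        && [ "$min_with_fixed" = "$v" ] \
        && [ "$v" != "0.1.74" ]
}
```

For example, is_affected v0.1.73 && echo "upgrade or downgrade flyctl" would print the message, while the same check with v0.1.71 or v0.1.74 would print nothing.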

Thanks for your reply, I’ll try it out.