Machine stuck in replacing state

Have you tried cloning the machine(s) and destroying the old one(s)?

Cloning gets stuck at:

Cloning machine 148ed192ae1708 into region sin
Provisioning a new machine with image registry.fly.io/5m-vm@sha256:ffe2d6ea1f4b8f075303e49e959112cf9b440f8b53e19a4b68e535ba80b4460e…
Machine 568344ea64958e has been created…
Waiting for machine 568344ea64958e to start…

Well, until this has been resolved, no matter how many times you destroy and re-create the machine, you’ll eventually hit this again. Waiting anywhere from minutes to an hour for a deployment to complete is a deal breaker for fast-moving projects. I now have doubts about hosting our production application on Fly.io because of this experience.

+1 on this one. It fails intermittently.

The information on the status page and dashboard is inaccurate, which adds to the frustration.

Additional notes:
I am also hosting in the SIN region.
My Docker image is below 500MB, which shouldn’t cause slow pulling.

It’s true, I have had many problems deploying on fly.io.

I have observed that during deployment via GitHub Actions, the build process may fail due to the timeout configuration on GitHub’s platform, even though the deployment to the Fly machine is still in progress. This delay often occurs due to the extended time taken for image pulling. While GitHub Actions might display a failed status, it is possible that your application has been successfully deployed on Fly. To accurately ascertain the deployment status, monitoring the logs on the Fly dashboard proves to be the most reliable reference… for now.
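
To illustrate the point that a CI timeout and the real deployment status are separate things, here is a minimal Python sketch of a poller that keeps checking a machine's state on its own deadline. Everything in it is an assumption for illustration: `wait_for_state` and `get_state` are hypothetical names, `get_state` stands in for however you look up the state (dashboard, CLI, API), and the target state name "started" is assumed, with "created" taken from the posts in this thread.

```python
import time
from typing import Callable


def wait_for_state(
    get_state: Callable[[], str],
    target: str = "started",          # assumed name of the healthy state
    timeout_s: float = 3600.0,        # our own deadline, independent of CI
    interval_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll get_state() until it returns `target` or the deadline passes.

    Returns True if the target state was seen, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if get_state() == target:
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

With a poller like this, a failed CI job only means the CI runner stopped waiting; whether the machine eventually started is a separate question you still have to check.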

I’m not deploying via GitHub and I had the same issues.

Anything? People can’t deploy their apps, and neither can I! What can we do? Every deploy takes the app down. I need to clone the old machines and destroy the previous ones, but I can’t deploy a new version of the app! 4 days have already passed and it’s still not working!

Do you have any solution? How can I deploy my app now?

I deleted the entire app and recreated it from scratch, and cloning, restarting, etc. only work intermittently. I guess something is messed up in my Node app. Luckily, it’s a brand new app, but if this happens again, I will consider moving.

I don’t think it’s a problem with the app (I have a Node.js app too). Right now I can only deploy using a local build with the timeout flag set to 3600… The deploy takes 20 minutes, but it works.
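
For anyone wanting to try the same workaround from a script, here is a rough Python sketch that only builds the command line. The post above does not name the exact flags, so treating "local build" as `--local-only` and "timeout flag" as `--wait-timeout` is my assumption, and `build_deploy_command` is a hypothetical helper name; check `fly deploy --help` for your flyctl version before relying on it.

```python
import shlex
from typing import List


def build_deploy_command(wait_timeout_s: int = 3600) -> List[str]:
    """Build a `fly deploy` command that builds locally and waits longer.

    Assumption: `--local-only` and `--wait-timeout` are the flags the
    poster meant; the post itself only says "local build" and "timeout".
    """
    return [
        "fly", "deploy",
        "--local-only",                        # build the image on this machine
        "--wait-timeout", str(wait_timeout_s)  # seconds to wait for machines
    ]


# Render the command as a single shell-safe string for copy/paste.
command_line = shlex.join(build_deploy_command())
```

The list form can be handed to `subprocess.run(..., check=True)` if you want to run it from Python rather than paste it into a shell.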

One of my websites is down because of this issue. I tried to clone the machine, but the cloned machine is also stuck in the ‘created’ state. Very frustrating.

I am having a similar issue, and my apps are hosted in the sin region.

Same here. The sin region has been so broken lately, it’s quite frustrating :frowning:

Not sure how useful this is, but here’s the log:

error.message="machine is in a non-startable state: created" 2023-11-02T14:42:00Z proxy[32874977b90e85] sin [error]request.method="GET" request.id="01HE87Q22M49MH0VD2B61XSK6V-sin"

and another one:

proxy sin [error]could not find a good candidate within 90 attempts at load balancing. last error: no known healthy instances found for route tcp/80. (hint: is your app shut down? is there an ongoing deployment with a volume or are you using the 'immediate' strategy? have your app's instances all reached their hard limit?)
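
If you are collecting these log lines to report them, a small Python sketch like the one below can pull out the `key="value"` fields and tag the two symptoms quoted above. This is only an illustration: `parse_pairs`, `classify`, and the label names are made up here, and the field format is inferred from the two lines in this post, not from any documented log schema.

```python
import re
from typing import Dict, Optional

# Matches fields such as error.message="..." or request.id="..."
PAIR_RE = re.compile(r'([\w.]+)="([^"]*)"')

# Substrings taken from the two log lines quoted in this thread,
# mapped to hypothetical labels.
KNOWN_SYMPTOMS = {
    "machine is in a non-startable state": "machine_not_startable",
    "no known healthy instances found": "no_healthy_instances",
}


def parse_pairs(line: str) -> Dict[str, str]:
    """Return every key="value" field found in a log line."""
    return dict(PAIR_RE.findall(line))


def classify(line: str) -> Optional[str]:
    """Return a label if the line contains a known symptom, else None."""
    for needle, label in KNOWN_SYMPTOMS.items():
        if needle in line:
            return label
    return None
```

Running `classify` over an exported log makes it easier to count how often each symptom shows up and when.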

hey @benbjohnson

I’m experiencing this stuck-in-replacing issue again:

 2023-11-07T14:56:50.740 runner[17811610b403e8] sin [info] Pulling container image registry.fly.io/terbike:deployment-01HEN4HFMBZQPF0JJWJ0BDH0HW 

Fly’s status page detects no issue. I think Fly needs to add this issue to its metrics.

Thanks!

hey @JP_Phillips @benbjohnson

I am still experiencing the replacing issue after a few deployments on this machine, 17811610b403e8. Is there any plan to fix this, especially for the sin region?

Thanks!

It looks like I am having the same issue. My app can’t deploy because it is stuck on Pulling container image registry.fly.io/.... I am trying to deploy to the waw region, though.

I am having this issue today.
Very upsetting, as my website is now down.
I tried cloning the machine; this did not work and gave the same result the other posters saw.
I might just delete the app and then redeploy from scratch (it is a good thing my database is a separate app and my secrets are configured to redeploy with the app).

Update: I deleted the app successfully. Then I tried to fly launch an app with the same name, and I’m getting “Error server returned a non-200 status code: 504”.
I am currently trying to deploy in Chicago. I will now try to deploy to another location instead.

Update: I tried to deploy in Texas, and I got the same 504.
It looks like I’m SOL and I’ll try again tomorrow.

Update: They resolved the issue within a few hours and I was able to re-deploy. Now I must wait 1h for my DNS server to propagate the IP address change (from deleting the app, making a new one) and then I suppose things will be working as normal.

Same issue here! Seems to be stuck replacing O_O