We’re seeing cases where deploying with flyctl leaves an app in a mixed-version state: flyctl seems unaware of the app’s existing machines, creates new ones, and leaves the old ones running. We have to reconcile manually each time. We even tried scaling down to 1, and it destroyed the machine on the newer SHA. As far as we can tell, flyctl deploys are completely broken right now. Anyone else? Absolutely astonished!
Same issue here. Two existing (10-month-old) machines aren’t being detected on deployment, so the deploy creates two new machines. This happened both via a GH Action and via a local deploy. The existing machines (and the newly deployed ones) are in IAD.
Ours are in IAD too!
Support suggested removing processes = [ "app" ] from fly.toml and redeploying. Trying that now and will report back!
That fixed it!
Thanks for the tip, did they explain why that resolves it?
Having the same issue!
Also having the same issue. After each release I’ve been manually killing the old machines so I don’t have multiple versions of my app running.
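For anyone else reconciling by hand in the meantime, here is a rough sketch of how you might spot the stale machines before killing them. It assumes you have already turned whatever fly machines list reports into a list of dicts. The field names ("id", "region", "image"), the machine IDs, and the image tags below are all made up for illustration and are not flyctl’s actual output format.

```python
from collections import defaultdict


def group_by_image(machines):
    """Group machine IDs by the image each machine is running."""
    groups = defaultdict(list)
    for machine in machines:
        groups[machine["image"]].append(machine["id"])
    return dict(groups)


def stale_machines(machines, current_image):
    """Return the IDs of machines that are NOT running current_image."""
    return [m["id"] for m in machines if m["image"] != current_image]


# Hypothetical data: two old machines left behind, two created by the new deploy.
machines = [
    {"id": "m-old-1", "region": "iad", "image": "registry.fly.io/my-app:sha-old"},
    {"id": "m-old-2", "region": "iad", "image": "registry.fly.io/my-app:sha-old"},
    {"id": "m-new-1", "region": "iad", "image": "registry.fly.io/my-app:sha-new"},
    {"id": "m-new-2", "region": "iad", "image": "registry.fly.io/my-app:sha-new"},
]

# More than one key here means the app is running mixed versions.
print(group_by_image(machines))

# These are the candidates to stop or destroy manually.
print(stale_machines(machines, "registry.fly.io/my-app:sha-new"))
```

Comparing against the image you just deployed, rather than against machine age, avoids removing the wrong machine, which is what happened in the scale-down case in the first post.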
I’m also having the same issue. We have an app in three regions but machines are only being created in IAD and the old machines aren’t being removed. This is blocking us from deploying. I’m a little disappointed that no Fly team member has responded here or updated the platform status.
Nope. The response was quite literally only what I told you, albeit VERY quick (kudos to support for that).
Thanks for flagging this, everyone. We pinpointed a recent change that introduced this bug and have since reverted it. Our status page is updated to reflect this: Fly.io Status - flyctl deploy creating new app instances.
Deployments should once again be updating your existing Machines, not creating new ones (unless you’re using the bluegreen deployment strategy, where this would be expected behaviour).
We’ll take a closer look if anyone’s still seeing the same problem.
As a small side note, the official postmortem for this is now available…
https://fly.io/infra-log/machines-api-duplicates/
(These typically arrive ~7 days after the incident.)