Deployment and rollback failed with: Failed due to unhealthy allocations

fkrauthan · March 1, 2023, 8:48pm

Hey,

Our app currently fails to deploy with the error “Failed due to unhealthy allocations”. The same error occurred when it tried to roll back. Can someone look into this ASAP? I assume this is a platform issue?

It’s the *-dev-backend machine.

Thanks,
Florian

fkrauthan · March 1, 2023, 9:54pm

Based on some more looking around, this very much looks like previously reported issues where a VM might not have been shut down correctly and, because of the volume, the new VM can’t start up?

I definitely need someone from Fly to take a look at this, as it is currently blocking dev deployments, which in turn blocks important prod deployments…

fkrauthan · March 1, 2023, 10:09pm

New update (not sure if someone from Fly did something): after multiple revert attempts, it finally seems to be up and running again.

There is definitely something strange going on here.

kurt · March 1, 2023, 10:18pm

This looks like a delay in scheduling caused by temporary capacity issues. The host your volume is attached to had a burst of usage. When you deployed, the deploy stopped the previous VM but then couldn’t reserve space to start the new VM. After some time, the capacity pressure cleared and Nomad was able to start a new VM.

This is a rough edge case for Nomad apps. There are two ways you may be able to work around this problem:

- Run two VMs + Volumes at all times. If you care about uptime for your application, you should run 2+ instances. If you can tolerate issues like this, one instance is fine.
- Run a Machine-based app instead. The way Machines are architected mitigates this a little. Updating an existing Machine doesn’t do the whole capacity dance, it just restarts. In this particular situation, a Machine would have updated just fine. That’s not always true, though, so the first bullet is still the most reliable.
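
To make the difference between the two update paths concrete, here is a toy model of the behavior described above. It is only a sketch: `Host`, `nomad_style_deploy`, `machine_style_update`, and the `burst` parameter are hypothetical names invented for illustration, and the model assumes a host is nothing more than a fixed number of VM slots. It is not how Nomad or Fly actually schedule VMs.

```python
from dataclasses import dataclass


@dataclass
class Host:
    """Toy host with a fixed number of VM slots (hypothetical model)."""
    capacity: int
    used: int = 0

    def reserve(self) -> bool:
        """Try to reserve one slot; fail if the host is already full."""
        if self.used >= self.capacity:
            return False
        self.used += 1
        return True

    def release(self) -> None:
        """Give one slot back to the host."""
        self.used -= 1


def nomad_style_deploy(host: Host, burst: int) -> bool:
    """Stop the old VM, then compete for a fresh reservation.

    `burst` is the number of slots other workloads grab in between.
    """
    host.release()  # old VM stopped, its slot is freed
    host.used = min(host.capacity, host.used + burst)  # others take slots
    return host.reserve()  # can fail under capacity pressure


def machine_style_update(host: Host, burst: int) -> bool:
    """Restart in place: the existing reservation is never given up."""
    host.used = min(host.capacity, host.used + burst)  # burst still happens
    return True  # our slot was held the whole time


# Our VM holds one of the 3 used slots on a 4-slot host.
print(nomad_style_deploy(Host(capacity=4, used=3), burst=2))    # False
print(machine_style_update(Host(capacity=4, used=3), burst=2))  # True
```

With no burst (`burst=0`), the Nomad-style deploy in this model succeeds too, which matches the deploy eventually going through once the capacity pressure cleared.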

As an aside, when you need help with a specific app, the forums may not work well. We don’t see every thread here. For support for apps you care about, the launch plan + email support will work a lot better.

fkrauthan · March 1, 2023, 10:22pm

Is there an easy way to migrate a Nomad app to a Machine app? Or would I have to delete the Nomad app and then re-create it as a new app?

It also looks like the issue is not gone? We tried to redeploy the new version and are again stuck with the same error, by the looks of it.
