Could not reserve resource for machine: insufficient CPUs available to fulfill request (WAW)

For several days now I have been getting this error most of the time when I try to deploy the app:

Error: failed to update VM 3287961a143685: aborted: could not reserve resource for machine: insufficient CPUs available to fulfill request

This is getting ridiculous, to be honest. These problems keep appearing constantly. Right now we are only using it for DEV envs, and we are already planning a move to DO because of the reliability problems. This is not something we want to deal with in production deployments.

Hi @rafales yes this sucks; sorry. We have a few hosts that are at capacity and if you have a stopped Machine on one of those, when you deploy there may not be the resources to start it up. We’re working on several fronts to improve this experience.

If I can ask: are you using auto-stop on your app? If you are hitting this when you are simply trying to update an image on a Machine that is already running, this is a problem we don’t know about.

Hey. No, I am not. But the default deployment strategy for apps seems to keep old machines around for the next deployment, unpausing them during that deployment and then pausing the previous ones, and so on. So there are plenty of old machines there.

I also noticed that v2 apps do not re-create destroyed machines. It doesn’t really do what I would expect which is making sure that the app is running.

I’m having the same issue with a fresh app when redeploying (same region, WAW).

@rafales @mkozak Thanks! I appreciate the responses. Can you confirm whether your fly.toml contains a section like

[deploy]
  strategy = "bluegreen"

(or “rolling”, or “canary”?)

And that your defined services don’t include a line like the following:

[[services]]
  ...
  auto_stop_machines = true
  ...

@catflydotio
I do not have the bluegreen strategy and I do have auto_stop_machines set to true. Should I change both?

@mkozak Don’t change your deployment strategy (unless you want to – but today may not be the best day). I asked in order to help me understand under what circumstances you might hit the error.

Normally you shouldn’t have a problem regardless of these settings. This won’t happen at all on most hosts, even now!

Right now it seems that if you have Machines on an overloaded host and they have auto-stopped (due to low load), then when your deployment tries to start them up again there may not be enough resources to do so. This is not good, but it makes some sense.
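
To illustrate the situation described above, here is a toy model of it. It is only a sketch: the `Host` class, its methods, and the assumption that a stopped Machine gives up its CPU reservation are all hypothetical and are not taken from Fly.io's actual scheduler.

```python
from dataclasses import dataclass, field


class InsufficientCapacity(Exception):
    """Raised when a host cannot reserve the CPUs a Machine asks for."""


@dataclass
class Host:
    # Hypothetical model of one physical host with a fixed CPU budget.
    total_cpus: int
    reserved: dict = field(default_factory=dict)  # machine id -> CPUs held

    def free_cpus(self) -> int:
        return self.total_cpus - sum(self.reserved.values())

    def start(self, machine_id: str, cpus: int) -> None:
        # Starting a Machine needs a fresh reservation on this same host.
        if cpus > self.free_cpus():
            raise InsufficientCapacity(
                "insufficient CPUs available to fulfill request"
            )
        self.reserved[machine_id] = cpus

    def stop(self, machine_id: str) -> None:
        # Assumption: a stopped Machine releases its CPUs to other tenants.
        self.reserved.pop(machine_id, None)


host = Host(total_cpus=4)
host.start("old", 2)          # our Machine is running
host.stop("old")              # it auto-stops due to low load
host.start("neighbour-1", 2)  # meanwhile other Machines fill the host
host.start("neighbour-2", 2)

try:
    host.start("old", 2)      # the deployment tries to start it again
except InsufficientCapacity as err:
    print(f"could not reserve resource for machine: {err}")
```

In this model the restart fails only because the stopped Machine is tied to a host that filled up in the meantime, which matches the point that most hosts are unaffected.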

Was this the first deployment?

Actually my config has no strategy set, which I believe means it will default to rolling.

I see that I have auto_rollback set, which maybe is a reason why old machines are kept around.

No, this wasn’t the first deployment, I do deploy multiple times a day sometimes.

I guess these issues will eventually go away when you add more capacity in this region? I had similar issues many months ago and stopped using Fly, but I see there are still the same issues 🙁

This isn’t a regional capacity issue. The one upside of that is that solving it doesn’t primarily involve waiting for new hosts to be delivered to a datacentre.

There are several layers to making this experience better, and one of them was applied today:

Updating to flyctl v0.1.172 should unstick deployment for apps without volume storage. Now, if a host doesn’t have capacity to update a Machine in place on deployment, flyctl will attempt to replace that Machine with a new one on a different host.
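
The fallback described above can be sketched roughly as follows. This is not flyctl's code: `Machine`, `plan_update`, and the `free_cpus` mapping are hypothetical names, and treating a volume as pinning a Machine to its host is an assumption used to explain why only apps without volume storage are covered.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    # Hypothetical description of a Machine for this sketch.
    id: str
    host: str
    cpus: int
    has_volume: bool = False


def plan_update(machine: Machine, free_cpus: dict) -> str:
    """Return the host the updated Machine should run on.

    free_cpus maps host name -> CPUs currently available there.
    Raises RuntimeError when the update cannot proceed.
    """
    # First choice: update in place on the Machine's current host.
    if free_cpus.get(machine.host, 0) >= machine.cpus:
        return machine.host

    # Assumption: a Machine with a volume cannot be moved to another host.
    if machine.has_volume:
        raise RuntimeError("insufficient CPUs available to fulfill request")

    # Fallback: replace the Machine with a new one on a different host.
    for host, free in free_cpus.items():
        if host != machine.host and free >= machine.cpus:
            return host

    raise RuntimeError("no host has capacity for this Machine")


stuck = Machine(id="m1", host="host-a", cpus=2)
print(plan_update(stuck, {"host-a": 0, "host-b": 4}))
```

With `host-a` full, the Machine is planned onto `host-b`; the same call with `has_volume=True` raises instead, which is the case the update does not unstick.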
