'fly deploy' not working as expected

Hello Fly Community :wave: I use fly.io to deploy a NodeJS app. The app uses 2 performance-2x machines. I find it that the fly deploy command is not working as it should most of the time. When invoking the command only one machine seems to get updated while the execution gets stuck in the terminal on the first machine. Below are 2 images to better explain the problem.

  1. The execution of the fly deploy command gets halted and stuck in the terminal. Never getting to the second machine.

  2. The Fly Dashboard is successful at showing that the first machine was re-deployed, but nothing is happening to the second machine and it seems that the re-deploy status did not reach you guys because the terminal is stuck.

Would love to get some feedback on this :grin: sometimes the deploys work as expected, but most of the times they get stuck in this state…

I see a similar problem; it hangs waiting for an instance to become healthy and eventually the process times out:

Run flyctl deploy --config fly.prod.toml --image registry.fly.io/prereview:752001716461960d10a933f3deb2d1cc14f15eff
==> Verifying app config
--> Verified app config
Validating fly.prod.toml
Platform: machines
✓ Configuration is valid
==> Building image
Searching for image 'registry.fly.io/prereview:752001716461960d10a933f3deb2d1cc14f15eff' locally...
Searching for image 'registry.fly.io/prereview:752001716461960d10a933f3deb2d1cc14f15eff' remotely...
image found: img_8y6w4zlzdl0p7rn3

Watch your deployment at https://fly.io/apps/prereview/monitoring

Updating existing machines in 'prereview' with rolling strategy
  [1/4] Updating 148e426cd51589 [app]
  [1/4] Waiting for 148e426cd51589 [app] to have state: started
  Machine 148e426cd51589 [app] has state: started
  [1/4] Checking that 148e426cd51589 [app] is up and running
  [1/4] Waiting for 148e426cd51589 [app] to become healthy: 1/1
  [1/4] Machine 148e426cd51589 [app] update finished: success

  [2/4] Updating 3d8d3d3b309689 [app]
  [2/4] Waiting for 3d8d3d3b309689 [app] to have state: started
  Machine 3d8d3d3b309689 [app] has state: started
  [2/4] Checking that 3d8d3d3b309689 [app] is up and running
  [2/4] Waiting for 3d8d3d3b309689 [app] to become healthy: 0/1
Error: The operation was canceled.

However, I can see in the dashboard that it became healthy, leaving multiple versions running (in this case, the first two instances updated successfully, but the remaining two hadn’t).

A redeploy/new deploy then works fine.

In my case I usually first restart both machines and then do the re-deploy again. However again only 1 machine (usually the one which was not deployed on the first try) gets deployed, leaving the other stuck again. This makes it so that the machines are actually not on the same deploy version :sweat_smile:

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.