Bluegreen deploy error messages are slightly more helpful!

matthewlehner · May 24, 2024, 5:29pm

We’re working on making our bluegreen deployment strategy work more reliably!

TL;DR

When you’re using a bluegreen deployment strategy and have multiple images running, the error message now highlights the following flyctl commands to help you figure out which machines are unwanted:

fly machines list displays the running machines and the docker images they’re running
fly releases --image displays a list of releases with the image tags available

Full story

Right now, we know that bluegreen deploys will occasionally fail and leave your app in a strange state where you may have two releases running at the same time. Fixing this is a primary focus of what we’re working on right now, but in the meantime, we want you to know how to get your app back into a consistent state, even if it does require manual intervention.

When a deploy fails, subsequent deploys will also fail, as we don’t know which machines to replace.

This outputs the following error (now with a bit more info on how to find the issues and get your app into a deployable state):

Updating existing machines in 'app-name' with bluegreen strategy
Verifying if app can be safely deployed
  Found 2 different images in your app (for bluegreen to work, all machines need to run a single image)
    [x] app-name:deployment-ORNHTAIBZGAUHFEA3B00T4A76M - 22 machine(s) (…machine ids)
    [x] app-name:deployment-BXLOKBEWYREU42JUT77KTXYOZ5 - 21 machine(s) (…machine ids)

  Here's how to fix your app so deployments can go through:
    1. Find all the unwanted image versions from the list above.
       Use 'fly machines list' and 'fly releases --image' to help determine unwanted images.
    2. For each unwanted image version, run 'fly machines destroy --force --image=<insert-image-version>'
    3. Retry the deployment with 'fly deploy'
Deployment failed after error: found multiple image versions

You can check what images are running using fly machines list, and what images are available with fly releases --image.
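If you want to script the first step, here is a minimal Python sketch that pulls the image tags and machine counts out of the error output shown above. It assumes the `[x] <image> - N machine(s)` line format stays exactly as printed in this post; the function names are made up for illustration, and nothing here runs a flyctl command. It only builds the destroy command string the error message suggests.

```python
from __future__ import annotations

import re

# Matches lines such as:
#   [x] app-name:deployment-ORNHTAIBZGAUHFEA3B00T4A76M - 22 machine(s) (…machine ids)
# Assumption: the line format is the one shown in the error output in this post.
IMAGE_LINE = re.compile(
    r"\[x\]\s+(?P<image>\S+)\s+-\s+(?P<count>\d+)\s+machine\(s\)"
)


def images_in_error(output: str) -> dict[str, int]:
    """Map each image tag reported in the error to its machine count."""
    return {m["image"]: int(m["count"]) for m in IMAGE_LINE.finditer(output)}


def destroy_command(image: str) -> str:
    """Build (but do not run) the destroy command the error message suggests."""
    return f"fly machines destroy --force --image={image}"


SAMPLE = """\
  Found 2 different images in your app (for bluegreen to work, all machines need to run a single image)
    [x] app-name:deployment-ORNHTAIBZGAUHFEA3B00T4A76M - 22 machine(s) (…machine ids)
    [x] app-name:deployment-BXLOKBEWYREU42JUT77KTXYOZ5 - 21 machine(s) (…machine ids)
"""

if __name__ == "__main__":
    for image, count in images_in_error(SAMPLE).items():
        print(f"{image}: {count} machine(s)")
```

Deciding which of the listed images is the unwanted one is still up to you; check 'fly machines list' and 'fly releases --image' before destroying anything.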

Generally I approach this by removing the machines running the image associated with the most recent release (ensuring we’ve still got a healthy release running with the previous one), and then re-running the deploy. Most times, this Just Works™.
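As a rough illustration of that rule of thumb, the sketch below picks the image to remove from two inputs you would gather by hand: the set of images currently running (from 'fly machines list') and the release images ordered newest first (from 'fly releases --image'). Both inputs and the function name are hypothetical, and the code cannot tell whether the remaining release is healthy, so that check stays with you.

```python
from __future__ import annotations


def pick_unwanted_image(
    running: set[str], releases_newest_first: list[str]
) -> str | None:
    """Return the running image that belongs to the most recent release.

    Returns None when there is nothing to pick: fewer than two images are
    running, or no running image matches a known release.
    """
    if len(running) < 2:
        return None  # only one image running, so the app is already consistent
    for image in releases_newest_first:
        if image in running:
            return image  # newest release that still has machines
    return None


if __name__ == "__main__":
    running = {"app-name:deployment-OLD", "app-name:deployment-NEW"}
    releases = ["app-name:deployment-NEW", "app-name:deployment-OLD"]
    print(pick_unwanted_image(running, releases))
```

Picking the newest release rather than the oldest mirrors the approach described above: the failed deploy is the suspect one, and the previous release is the one assumed to be healthy.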

Are you using bluegreen deployments?

If you’re using or have used the bluegreen strategy, what problems have you encountered? How can we make the experience better for you?

alan7sage · November 8, 2024, 10:13pm

We recently switched to blue/green deploys and are encountering this on our end. Are there any updates? I also emailed your support directly with more information. Thanks!

franzwarning · November 16, 2024, 12:04am

Same here, seeing this very frequently. Bluegreen works like 1/2 the time for me. This gets triggered when you deploy super frequently (twice within the same 2 minutes, for example).
