Bluegreen deploy error messages are slightly more helpful!

We’re working on making our bluegreen deployment strategy work more reliably!

TL;DR

When you’re using a bluegreen deployment strategy and your app has machines running more than one image, the error message now points you to the following flyctl commands to help you figure out which machines are unwanted:

  • fly machines list displays the running machines and the docker images they’re running
  • fly releases --image displays a list of releases with the image tags available

Full story

Right now, we know that bluegreen deployments will occasionally fail and leave your app in a strange state where you may have two releases running at the same time. Fixing this is a primary focus of what we’re working on right now, but in the meantime, we want you to know how to get your app back into a consistent state, even if it does require manual intervention.

When a deploy fails, subsequent deploys will also fail because we don’t know which machines to replace.

This outputs the following error (now with a bit more info on how to find the issues and get your app into a deployable state):

Updating existing machines in 'app-name' with bluegreen strategy
Verifying if app can be safely deployed
  Found 2 different images in your app (for bluegreen to work, all machines need to run a single image)
    [x] app-name:deployment-ORNHTAIBZGAUHFEA3B00T4A76M - 22 machine(s) (…machine ids)
    [x] app-name:deployment-BXLOKBEWYREU42JUT77KTXYOZ5 - 21 machine(s) (…machine ids)

  Here's how to fix your app so deployments can go through:
    1. Find all the unwanted image versions from the list above.
       Use 'fly machines list' and 'fly releases --image' to help determine unwanted images.
    2. For each unwanted image version, run 'fly machines destroy --force --image=<insert-image-version>'
    3. Retry the deployment with 'fly deploy'
Deployment failed after error: found multiple image versions

You can check what images are running using fly machines list, and what images are available with fly releases --image.

Generally I approach this by removing the machines running the image from the most recent release (after making sure the previous release is still running and healthy), and then re-running the deploy. Most times, this Just Works™.
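If you'd rather not copy image versions out of the error by hand, here is a minimal Python sketch that reads the error text and prints the destroy command from step 2 for every image except the one you want to keep. It assumes the line format matches the sample output above, and that the image version can be passed to --image exactly as printed; the helper names (images_in_error, cleanup_commands) are made up for illustration and are not part of flyctl. It only builds the command strings, it does not run them.

```python
import re

# Line format assumed from the sample error output in this post, e.g.:
#   [x] app-name:deployment-XXXX - 22 machine(s) (...machine ids)
IMAGE_LINE = re.compile(
    r"\[x\]\s+(?P<image>\S+)\s+-\s+(?P<count>\d+)\s+machine\(s\)"
)

SAMPLE = """\
  Found 2 different images in your app (for bluegreen to work, all machines need to run a single image)
    [x] app-name:deployment-ORNHTAIBZGAUHFEA3B00T4A76M - 22 machine(s) (…machine ids)
    [x] app-name:deployment-BXLOKBEWYREU42JUT77KTXYOZ5 - 21 machine(s) (…machine ids)
"""


def images_in_error(output: str) -> dict[str, int]:
    """Map each image version named in the error output to its machine count."""
    return {m["image"]: int(m["count"]) for m in IMAGE_LINE.finditer(output)}


def cleanup_commands(output: str, keep: str) -> list[str]:
    """Build one destroy command per image version other than `keep`."""
    images = images_in_error(output)
    if keep not in images:
        # Refuse to suggest destroying everything because of a typo in `keep`.
        raise ValueError(f"{keep!r} does not appear in the error output")
    return [
        f"fly machines destroy --force --image={image}"
        for image in images
        if image != keep
    ]


if __name__ == "__main__":
    # Which image to keep is your call; the script cannot tell which release is healthy.
    for command in cleanup_commands(
        SAMPLE, keep="app-name:deployment-ORNHTAIBZGAUHFEA3B00T4A76M"
    ):
        print(command)
```

Deciding which image to keep is still a manual step: check fly machines list and fly releases --image first, then pass the image you trust as keep.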

Are you using bluegreen deployments?

If you’re using or have used the bluegreen strategy, what problems have you encountered? How can we make the experience better for you?

We recently switched to blue/green deploys and are encountering this on our end. Are there any updates? I also emailed your support directly with more information. Thanks!

Same here, seeing this very frequently. Bluegreen works about half the time for me. This gets triggered when you deploy very frequently (twice within the same 2 minutes, for example).