Flaky deployment fails

nickluger · May 23, 2022, 9:11am

Every few deployments, the remote deployment (--remote-only) fails with the following error:

Error failed to fetch an image or build from source: error connecting to docker: failed building options: Validation failed: Name has already been taken

Deleting the remote worker VM and forcing Fly to create a new one solves this problem, but this behavior impairs a smooth CI/CD workflow.

We are in the fra region.

jsierles · May 23, 2022, 11:21am

Hmm, this should not be related to running VMs. It looks more likely to be an issue with wireguard peers, which are generated anew each time you deploy from a CI service. Which one are you using?

nickluger · May 25, 2022, 12:17pm

We’re using SemaphoreCI, because it runs on quite powerful machines.

But the deployment command is just something like:

fly deploy . --config services/nlp/fly.toml --remote-only --build-arg TURBO_TEAM=$TURBO_TEAM --build-arg TURBO_TOKEN=$TURBO_TOKEN

I found out that it fails if one or more previous deployments failed and you retry without manually deleting the remote worker beforehand.

nickluger · June 2, 2022, 10:38am

Destroying the builder app before every deployment is a general workaround that prevents flaky deployments caused by this error or by the “volume-out-of-space” issues.

On CI this can be done with a script:

#!/bin/bash
# List all apps and keep only the remote builder apps (named fly-builder-*).
FLY_BUILDERS=$(fly apps list | grep -i 'fly-builder-')

if [ -z "$FLY_BUILDERS" ]; then
  echo "No Fly builders found to destroy. Exiting."
  exit 0
fi

# The app name is the first whitespace-separated column of each line.
while read -r BUILDER _; do
  echo "Destroying \"$BUILDER\"..."
  fly apps destroy "$BUILDER" --yes
done <<<"$FLY_BUILDERS"
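Since the failure only seems to show up when retrying after a failed deployment, a possible variant is to destroy the builders only when a deploy fails and then retry once. This is a rough sketch, not something tested against every Fly setup: the function names builder_names, destroy_builders and deploy_with_retry are made up for illustration, and it assumes the app name is the first column of the `fly apps list` output, as the script above does.

```shell
#!/bin/bash
# Sketch only: the function names are hypothetical; the fly subcommands
# are the same ones used in the script above.

# Read `fly apps list` output on stdin and print the names of builder apps.
# Assumes the app name is the first whitespace-separated column.
builder_names() {
  awk '$1 ~ /^fly-builder-/ { print $1 }'
}

# Destroy every builder app that is currently listed.
destroy_builders() {
  local name
  fly apps list | builder_names | while read -r name; do
    echo "Destroying \"$name\"..."
    # stdin is redirected so the command cannot consume the piped list.
    fly apps destroy "$name" --yes < /dev/null
  done
}

# Run `fly deploy` with the given arguments. If it fails, destroy the
# builders and retry exactly once; the exit status is that of the last try.
deploy_with_retry() {
  if fly deploy "$@"; then
    return 0
  fi
  echo "Deploy failed; destroying builders and retrying once." >&2
  destroy_builders
  fly deploy "$@"
}
```

On CI it would be called with the same arguments as the plain deploy, for example: deploy_with_retry . --config services/nlp/fly.toml --remote-only. Unlike the grep in the script above, the awk filter only matches names that start with fly-builder- and is case-sensitive.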

If there’s a more elegant solution, please tell me.
