Deploying to Fly via GitHub Action failing

I kicked off my deploy action again and got the same result: fix login issues when out of primary region · kentcdodds/kentcdodds.com@f49314a · GitHub

I was able to get a needed deploy out using --local-only as @nick-brevity suggested, so that’s good 🙂 Back to the slopes 🏂 Godspeed!

Just tried again and I’m still getting the same issue: the image can’t be found in the Fly registry.

For me it really appears like the image is not published after being uploaded.

It appears like the image is correctly being pushed.

But it cannot be resolved thereafter.

I did my best to compare names, tags, and IDs, and everything looked to be correctly written out.

We’re still working on this. One thing we’re investigating is whether the “current” Ubuntu runner generates images differently. We’ve managed to replicate some of the errors with different buildkit/oci flags, so it’s possible a default changed.

We are using the GitHub runners to build and push the image, i.e. the image should be fully addressable from its URL regardless of Fly runners.

We may have found the issue and a workaround. It looks like sometime earlier today GitHub updated the ubuntu runner which bumped the version of BuildKit from 0.9.1 to 0.10.0. Images built with this version of BuildKit are returning a 404 when our backend attempts to fetch the image manifest from the target registry before a deployment. This is happening for both registry.fly.io and Docker Hub.

Now, I don’t see anything in the BuildKit 0.10.0 release notes that looks related, but something changed and we need to investigate further to figure out and fix our registry client. Until then, reverting to BuildKit 0.9.1 solves the problem.

Workaround

You can either disable BuildKit or specify a working version to get back on track.

Either comment this out:

    steps:
      -
        name: Set up QEMU
        uses: docker/setup-qemu-action@v2
      -
        name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2

or change docker/setup-buildx-action@v2 to pass a version like this:

      -
        name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
        with:
          version: v0.9.1

We’ll keep ya posted.

I can confirm this workaround worked for me.

Thanks for the investigation! Will stay tuned for when we can remove the workaround 👍

EDIT: Unfortunately I don’t think it did work. So I was able to start the deploy, but the logs of the actual deploy gave me:

2023-01-20T01:00:49Z runner[bb5a374b] den [info]Pull failed, retrying (attempt #1)
2023-01-20T01:00:50Z runner[bb5a374b] den [info]Pull failed, retrying (attempt #2)
2023-01-20T01:00:50Z runner[bb5a374b] den [info]Pulling image failed

This resulted in a rollback.

Oh now that’s interesting. I was testing that our registry client could actually see the images. And I was able to confirm that dockerd could pull them down. We run containerd on workers which might be having a problem. Thanks for letting us know!

Could you give it another go? We aren’t 100% sure that error isn’t a one-off.

I re-ran the last workflow and wasn’t watching the logs so I’m not sure exactly what happened, but it got a rollback: use workaround for github actions on fly · kentcdodds/kentcdodds.com@90b9d4c · GitHub

That change to specifying the buildkit version worked great for me. Thanks @michael

The only change that might look suspicious in the buildkit changes was this one:

It adds labels to the published Docker image. Would that affect Fly in any way?

The deploy eventually went through, but it looks like it took longer than expected in one region (maa) which is why the deploy was marked as failed. The maa VM did start eventually though. That’s a separate issue, but at least the image pull worked. Thanks for checking for us!

Great to hear. And thanks for the pointer, we’re looking into it.

Hi I can confirm it worked for me by specifying the version! Thanks a lot for your investigation.

Also confirming that this worked for me!

The fix also worked for me. Should this be added to the Remix Stacks now to prevent others from running into the same issue @michael @kentcdodds ?

Also works for me - just deployed 5 minutes ago

My own investigation led me to the same change that @mikeglazer identified. For me, adding provenance: false under with: did the trick and allowed me to continue using the newest version of the action.
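
For anyone wanting to see the two steps side by side, here is a rough sketch of what that setup could look like. The app name `my-app`, the tag, and the build context are placeholders I made up, not values from this thread, and the registry login step is left out. As far as I understand it, `provenance: false` just tells build-push-action not to attach the provenance attestation that newer buildx versions add by default.

```yaml
# Sketch only: `my-app`, the tag, and the context are hypothetical placeholders.
steps:
  - name: Set up Docker Buildx
    uses: docker/setup-buildx-action@v2   # no `version:` pin here

  - name: Docker build
    uses: docker/build-push-action@v3
    with:
      context: .
      push: true
      tags: registry.fly.io/my-app:${{ github.sha }}
      # Do not attach a provenance attestation to the pushed image
      provenance: false
```

If your action version doesn’t recognize the `provenance` input, updating to a newer v3 release of build-push-action may be needed.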

This fix worked for us before, but it stopped working today with:

Download and install buildx
  Error: Cannot find buildx v0.9.1 release

Which is weird because it definitely wasn’t unpublished.

Had the same problem today. I was following @jeyj0’s hint and stumbled on this thread.

What I changed was build-push-action’s config, adding provenance: false:

    - name: 🐳 Docker build
      uses: docker/build-push-action@v3
      with:
        ...
        provenance: false

I don’t really understand the details, but at least I’m unblocked now.
