Deploying infrastructure

We use Fly in a way where we usually deploy several projects at the same time, and it often fails. The remote builder seems to be what fails most often, which suggests that Fly really isn’t suited to deploying more than a single app at a time.

Are there any tricks that I overlooked? Or is it just time for us to move on as the number of applications increases?

Currently the failure is: Error failed to fetch an image or build from source: error connecting to docker: failed building options: Validation failed: Pubkey has already been taken, Name has already been taken

GitHub Actions requires a lot of network setup to use our builders. What you’re seeing is the times when it’s slow to create a WireGuard peer (flyctl needs a WireGuard peer if it has never connected before). Since each GitHub Actions run is a new flyctl instance, it needs to create a new peer.

That error looks like a race when creating new peers, which is an interesting problem that I haven’t seen before. I’m guessing multiple simultaneous GitHub Actions runs are the cause.

The best way to use GitHub Actions is actually to use their Docker daemon + Docker caching. They make this hard to set up, but it will work very well for what you’re doing. Here’s an example of how to configure it: indie-stack/deploy.yml at main · remix-run/indie-stack · GitHub
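For anyone scripting this rather than wiring it up in workflow YAML, here is a minimal Python sketch of the same idea: build with the runner's own Docker daemon and deploy the apps one at a time, so that parallel runs do not race each other. It assumes flyctl is on the PATH, that each app lives in its own directory with a fly.toml, and that `flyctl deploy --local-only` is the right way to pick the local daemon in your flyctl version. The function names and directory names are hypothetical, and this is not the linked workflow.

```python
import subprocess
from typing import Callable, Sequence


def build_deploy_command() -> list[str]:
    # Assumption: --local-only tells flyctl to build with the local Docker
    # daemon instead of a remote builder.
    return ["flyctl", "deploy", "--local-only"]


def deploy_all(app_dirs: Sequence[str], run: Callable = subprocess.run) -> list[str]:
    """Deploy each app directory in order and stop at the first failure.

    `run` is injectable so the logic can be exercised without calling flyctl.
    Returns the directories that deployed successfully.
    """
    deployed: list[str] = []
    for app_dir in app_dirs:
        # Run flyctl from inside the app directory so it finds that app's fly.toml.
        result = run(build_deploy_command(), cwd=app_dir)
        if result.returncode != 0:
            raise RuntimeError(f"deploy failed for {app_dir}; deployed so far: {deployed}")
        deployed.append(app_dir)
    return deployed
```

Calling `deploy_all(["apps/api", "apps/web"])` (hypothetical paths) would deploy the two apps in sequence. Running them serially is slower than running them in parallel, but it avoids several fresh flyctl instances starting at the same moment.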


Ah, we were also experiencing API slowness this morning. Apparently we have seen this error before, and it happens when our API is under heavy load.

That GitHub Actions Docker setup I linked is still a very good idea, but if you give it a few minutes your builds might work again.
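If waiting a few minutes is usually enough, a retry with a growing delay is one way to automate that in a deploy script. This is a minimal Python sketch rather than anything built into flyctl. The helper name, the number of attempts, and the 30-second starting delay are all assumptions to tune for your setup.

```python
import time
from typing import Callable


def retry_with_backoff(
    action: Callable[[], bool],
    attempts: int = 4,
    base_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `action` until it returns True or `attempts` is exhausted.

    The wait doubles after every failure: base_delay, 2 * base_delay, and so on.
    `sleep` is injectable so the helper can be tested without real waiting.
    """
    for attempt in range(attempts):
        if action():
            return True
        # Only wait if another attempt is still coming.
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return False
```

Here `action` would be a small function that runs one deploy and returns whether it succeeded, for example by checking the exit code of a `flyctl deploy` subprocess.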

Thanks! I will try that instead.

Yeah, the degraded service is also a thing. It happens quite often too, which again makes it hard to manage bigger infrastructure projects where we want somewhat atomic deploys.

Anyhow, it doesn’t seem like Fly is the solution to more than a couple of applications. Again, thanks!

Can you post when you get deploy failures here please? I understand why they would have been failing in the last few hours, but I’d like to know what other issues you’re seeing. We have people deploying apps thousands of times per hour so I don’t think we can chalk it up to “not great for more than a few apps”.