Fly deploy from the Fly image registry stalling for 20+ minutes

Here is a screenshot of some deploys hanging in CircleCI for 25 minutes. This doesn't happen every time, which makes me think something funky is going on with deployments.

Not only do we need at least medium machines to use the docker push, it's also crushing our CircleCI credit usage :disappointed_relieved:

(The third deploy, the one that took 4m 17s, was also a fly deploy with an image of the same size.)

- run:
    name: Deploy to Fly.io
    command: |
      # fetch flyctl, authenticate Docker against the Fly registry,
      # push the image built earlier in the job, then deploy it
      wget -qO- 'https://getfly.fly.dev/linux-x86-64/flyctl.tgz' | tar xz
      ./flyctl auth docker
      docker push registry.fly.io/<< parameters.fly_app_name >>:$CIRCLE_BUILD_NUM
      ./flyctl deploy -i registry.fly.io/<< parameters.fly_app_name >>:$CIRCLE_BUILD_NUM -a << parameters.fly_app_name >> --config << parameters.config_path >> --detach
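(For context: this step assumes the image was already built and tagged earlier in the job. A hypothetical sketch of that preceding step, with the tag matching the push shown above; the repo-root Dockerfile path is an assumption:)

- setup_remote_docker
- run:
    name: Build image
    command: |
      # build and tag so the push/deploy step can find the image
      docker build -t registry.fly.io/<< parameters.fly_app_name >>:$CIRCLE_BUILD_NUM .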

Will get this checked out, thanks for letting us know.

Thanks. It seems it's taking a very long time to upload, and from the logs it appears to halt for a good amount of time mid-upload before continuing again.

It just took over 6 minutes to upload 575 MB while I watched the logs.

I also just deployed about 6 apps from the same repo, all with the same image size: some took 1 minute, and most took 8-20 minutes.

Yeah, unfortunately this is probably one of those problems that depends on circumstances, like the surrounding network and disk usage, and likely the network hops between CircleCI and Fly as well, so there are lots of moving parts here.

I’ll keep an eye on this and maybe write a status script that keeps the registry honest. But just to set expectations: even if the Fly registry is running at full speed, you might not always see peak performance, because bandwidth depends on both systems (CircleCI and Fly) and the network conditions between them.
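(A rough sketch of what such a status script might look like; entirely hypothetical, with the probe image name as a placeholder:)

#!/bin/sh
# hypothetical probe: time a push of an already-built test image to the
# Fly registry, to log how throughput varies over time
IMAGE="registry.fly.io/my-probe-app:latest"  # placeholder; must exist locally, with docker already authenticated
START=$(date +%s)
docker push "$IMAGE"
echo "$(date -u) push took $(( $(date +%s) - START ))s"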

Gotcha. I wish the Fly remote builders were more stable and faster to build on so we could move some of the final build steps onto the local Fly network, but whenever we use the Fly builders things get much worse and deployment success is pretty spotty.

Yeah, we are looking into improving remote builder stability as well. My personal trick is to put the builder app in a corner of the world that’s relatively quiet (I use MAA), but of course you shouldn’t have to do this yourself.

Interesting. We are also working on getting our image sizes down, but I wanted to bring this up regardless.

We also recently moved from node:alpine to node:latest “just to get it working” around internal DNS failures connecting to Postgres, which bloats the image quite a bit.

Any suggestions for the best image for running Node apps on Fly that need to connect to internal Fly apps like Postgres?

Let me ask around. Alpine seems to have known bugs on Fly, so that’s definitely out, but plain :latest does tend to be gigantic and bloated. With Node specifically you can’t do a separate build step and copy a single executable into a scratch-type minimal image, so this is a little difficult. The -slim image is a bit smaller, but will definitely need testing. It might work well if none of your node_modules do native compilation that depends on a package that’s been stripped out.
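(One quick way to check for native compilation, as a suggestion: compiled addons end up as .node files, so you can scan an installed node_modules tree for them:)

find node_modules -name '*.node' -type f

If that prints nothing, the app is likely pure JavaScript and a -slim base should be a safer bet.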

Sounds good, that would be great.

Also, we are currently doing mostly “everything” on CircleCI before deploying to Fly (i.e. building TypeScript, generating the Prisma client, etc.), then finally building the Docker image and pushing it to your registry before deploying that image. Should we look into moving these build commands into the Dockerfile using multi-stage builds (build vs. runtime)?

Multi-stage builds can be a huge time and space saver, depending on how often the content of each stage changes. If your Prisma client changes only once a month, for example, a multi-stage build will cache that step automatically without repeating it on every build. A carefully layered single-stage build would also cache this, but with multi-stage you wouldn’t even have the Prisma generator on the final container, or any of your development node_modules, which I’m guessing you probably do now.
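(To make the caching concrete, a sketch of build-stage ordering; this is illustrative and assumes prisma is in package.json with its schema under prisma/:)

FROM node:12 AS deps
WORKDIR /usr/src/app
COPY package.json yarn.lock ./
RUN yarn --frozen-lockfile      # cached until the lockfile changes
COPY prisma ./prisma
RUN yarn prisma generate        # cached until the schema changes
COPY . .
RUN yarn build                  # re-runs on any source change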

Right! I am working on this at the moment:

## Builder Image ##
FROM node:12-alpine AS BUILD_IMAGE

RUN apk update && apk add curl bash python make g++ && rm -rf /var/cache/apk/*

RUN curl -sfL https://install.goreleaser.com/github.com/tj/node-prune.sh | bash -s -- -b /usr/local/bin

WORKDIR /usr/src/app

COPY package.json yarn.lock ./

RUN yarn --frozen-lockfile

COPY . .

# the npm package "tsc" is a deprecated stub; the compiler ships as "typescript"
RUN yarn global add typescript prisma

RUN prisma generate
RUN yarn build

RUN npm prune --production

RUN /usr/local/bin/node-prune

## Runtime Image ##
# pin to the same Node major version as the build stage
FROM node:12-slim

WORKDIR /app

COPY --from=BUILD_IMAGE /usr/src/app/dist ./dist
COPY --from=BUILD_IMAGE /usr/src/app/node_modules ./node_modules

CMD ["node", "./dist/rest.js"]

Thoughts?

The build image can be as big as it needs to be, so I think using Alpine there doesn’t serve much purpose and may cause issues later with missing or different packages. You might not need to add as many packages up front, either: having them already present in the normal :latest image will be much faster than installing them with apk after pulling Alpine.

The rest of it looks about right, though. Generate code, prune, copy to the run container.
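(Putting that advice together with the Dockerfile above might look something like this; a sketch, keeping the same layout and pinning both stages to the same Node major version:)

## Builder Image ##
# Debian-based node:12 already ships curl, bash, python, make, and g++
FROM node:12 AS BUILD_IMAGE

RUN curl -sfL https://install.goreleaser.com/github.com/tj/node-prune.sh | bash -s -- -b /usr/local/bin

WORKDIR /usr/src/app

COPY package.json yarn.lock ./
RUN yarn --frozen-lockfile

COPY . .

RUN yarn global add typescript prisma
RUN prisma generate
RUN yarn build

RUN npm prune --production
RUN /usr/local/bin/node-prune

## Runtime Image ##
FROM node:12-slim

WORKDIR /app

COPY --from=BUILD_IMAGE /usr/src/app/dist ./dist
COPY --from=BUILD_IMAGE /usr/src/app/node_modules ./node_modules

CMD ["node", "./dist/rest.js"]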
