Timeout error with Jekyll multi-stage build

Hello,

I recently stumbled across fly.io from a HN post and decided to give it a try. It’s actually pretty awesome! I’ve been hoping someone would make it easy to spin up container images and do things with them on the fly.

Anyways, I started toying with a personal project for fun. It’s just a static site. I am using a multi-stage container image that builds what is practically the Jekyll “Hello World” site and serves the output with an Nginx image.

Unfortunately, this hits a timeout after 5 minutes. I assume that’s because the image I tested was based on Fedora 35, and running dnf upgrade -y && dnf install -y rubygem-jekyll probably took up a significant portion of the available time.

I’m still early in the process of understanding your infrastructure but was able to get a little further by discarding the build stage and running that step locally.

Out of curiosity, is there a way to get a log of the build? That might help with debugging issues.

Failing Dockerfile:

# Build Stage
FROM registry.fedoraproject.org/fedora:35 AS builder

RUN dnf upgrade -y && dnf install -y rubygem-jekyll

WORKDIR /tmp/src
COPY ./src/* .

RUN jekyll build

# Container to Run
FROM nginx:1.21-alpine

COPY --from=builder /tmp/src/_site /usr/share/nginx/html

Thanks a lot!

Edit: Cleaned up the Dockerfile

Update:

I’m not sure if the image was cached, or what, but I am now seeing more progress with the multi-stage build posted above. I also see output from the build failure. Unfortunately, it is failing on a dnf upgrade -y which is unusual to me.

This is unrelated to the original issue, but I figure I will post it in case anyone attempts to reproduce the original issue :)

==> Validating app configuration
--> Validating app configuration done
Services
TCP 80/443 ⇢ 8080
Remote builder fly-builder-bold-dream-6013 ready
==> Creating build context
--> Creating build context done
==> Building image with Docker
--> docker host: 20.10.12 linux x86_64
Sending build context to Docker daemon  40.17kB
[+] Building 0.4s (7/12)                                                                                                                                  
 => [internal] load remote build context                                                                                                             0.0s
 => copy /context /                                                                                                                                  0.0s
 => [internal] load metadata for docker.io/library/nginx:1.21-alpine                                                                                 0.1s
 => [internal] load metadata for registry.fedoraproject.org/fedora:35                                                                                0.0s
 => CACHED [builder 1/6] FROM registry.fedoraproject.org/fedora:35@sha256:2d697a06d17691e87212cf248f499dd47db2e275dfe642ffca5975353ea89887           0.0s
 => CACHED [stage-1 1/2] FROM docker.io/library/nginx:1.21-alpine@sha256:eb05700fe7baa6890b74278e39b66b2ed1326831f9ec3ed4bdc6361a4ac2f333            0.0s
 => ERROR [builder 2/6] RUN dnf upgrade -y                                                                                                           0.2s
------
 > [builder 2/6] RUN dnf upgrade -y:
------
Error error building: executor failed running [/bin/sh -c dnf upgrade -y]: exit code: 139

The first build was when you ran fly deploy, right? It was probably using our remote builder; you should see a log in your console while that’s happening.
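Since the build log is only printed to the console, one way to keep it for later debugging is to tee it into a file. This is a minimal sketch: run_logged and build.log are made-up names, and the commented example assumes flyctl is installed and the app is already configured.

```shell
#!/bin/sh
# Run a command and keep a copy of everything it prints (stdout and stderr).
# run_logged is a hypothetical helper name, not part of flyctl.
run_logged() {
    log_file=$1
    shift
    # Merge stderr into stdout so build errors end up in the log as well.
    "$@" 2>&1 | tee "$log_file"
}

# Example (assumption: flyctl is installed and fly.toml exists):
# run_logged build.log fly deploy --remote-only
```

Note that the pipeline returns tee's exit status, not the build's, so check the log itself for errors.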

A 5 min timeout makes me think it hung. I’m not super familiar with dnf, does it give you any kind of indication of retries on package downloads or similar?

Can you share your whole Dockerfile by chance? That exit code in your last build is a segfault, which is weird. :) I don’t know what would be segfaulting there.
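For anyone wondering how 139 maps to a segfault: shells report a process killed by a signal as 128 plus the signal number, and 139 - 128 = 11, which is SIGSEGV on Linux. A small sketch that does the lookup (signal_for_status is a made-up helper name):

```shell
#!/bin/sh
# Translate an exit status above 128 into the signal that ended the process.
# signal_for_status is a hypothetical helper name.
signal_for_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        # kill -l <number> prints the signal name without the SIG prefix.
        echo "SIG$(kill -l $((status - 128)))"
    else
        echo "none"
    fi
}

signal_for_status 139   # prints SIGSEGV on Linux
```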

Yep, you’ve got it.

It may have hung but usually there is pretty verbose output. It’s just Fedora’s packaging system; pretty similar to what you might see out of apt-get update && apt-get upgrade or similar.

Haha, the snippet in my original post is actually the whole Dockerfile :) I have deleted it locally and am giving Hugo a shot since it’s much leaner on dependencies.

From a quick search, I also saw that the exit code was related to segfaults. Very weird! I also saw some mentions specific to SELinux, if that helps at all. I wouldn’t worry too much. I’m not personally blocked at the moment or relying on it for anything critical. Just wanted to share my experience so far.

Thanks a lot!

Hi, while trying to port a Dockerfile that uses Fedora from local to remote builds on Fly, I stumbled over the same “exit code: 139” error that @Kurtis hit. In my case it already fails at “RUN echo ‘Hi!’” in the simple Dockerfile below.

It turns out that this error occurs with Fedora 35 and newer only. But the same Dockerfile builds fine with Fedora 34 or older!

My minimal Dockerfile, extracted from a multi-stage build:

#ARG BUILDER_IMAGE="amd64/fedora:34"
#ARG BUILDER_IMAGE="amd64/fedora:35"

# https://community.fly.io/t/timeout-error-with-jekyll-multi-stage-build/3746
ARG BUILDER_IMAGE="registry.fedoraproject.org/fedora:34"
#ARG BUILDER_IMAGE="registry.fedoraproject.org/fedora:35"

FROM scratch
FROM ${BUILDER_IMAGE} AS builder

RUN echo "Hi!"

RUN dnf -y update
RUN dnf -y upgrade
#RUN apt-get -y update
#RUN apt-get -y upgrade

RUN echo "Bye!"

This Dockerfile builds fine with Fedora 34 and older:

$ fly deploy --dockerfile Dockerfile.fedora 
==> Verifying app config
--> Verified app config
==> Building image
Remote builder fly-builder-snowy-water-8948 ready
==> Creating build context
--> Creating build context done
==> Building image with Docker
--> docker host: 20.10.12 linux x86_64
Sending build context to Docker daemon  16.62kB
[+] Building 63.7s (9/9) FINISHED                                                                                  
 => [internal] load remote build context                                                                      0.0s
 => copy /context /                                                                                           0.0s
 => [internal] load metadata for registry.fedoraproject.org/fedora:34                                         1.5s
 => [builder 1/5] FROM registry.fedoraproject.org/fedora:34@sha256:c7398ad5453edb06975b9b2f8e1b52c4f93c43715  2.3s
 => => resolve registry.fedoraproject.org/fedora:34@sha256:c7398ad5453edb06975b9b2f8e1b52c4f93c437155f3356e4  0.0s
 => => sha256:c7398ad5453edb06975b9b2f8e1b52c4f93c437155f3356e4ecf6140b6c69921 955B / 955B                    0.0s
 => => sha256:9b7fb207e4fa68ff86744b4963a39ae195c1b1c0948f668b36dafd73487702de 429B / 429B                    0.0s
 => => sha256:307241a058e3f161a7772c8587238f2e1c093c207ea9afaafff3d1a4d1deaf0e 1.32kB / 1.32kB                0.0s
 => => sha256:878a8677dead0e70225056b0624c816077ad74129e66097e7bcbcbea27842778 70.81MB / 70.81MB              0.6s
 => => extracting sha256:878a8677dead0e70225056b0624c816077ad74129e66097e7bcbcbea27842778                     1.6s
 => [builder 2/5] RUN echo "Hi!"                                                                              0.4s
 => [builder 3/5] RUN dnf -y update                                                                          55.1s 
 => [builder 4/5] RUN dnf -y upgrade                                                                          1.8s
 => [builder 5/5] RUN echo "Bye!"                                                                             0.3s 
 => exporting to image                                                                                        2.0s 
 => => exporting layers                                                                                       2.0s 
 => => writing image sha256:81d92d88561d21af23b0883085dc0ea99874d163b9eef3db490735ea8448c873                  0.0s 
 => => naming to registry.fly.io/fedoraNN:deployment-1651561453                                                  0.0s 
--> Building image done
==> Pushing image to fly
The push refers to repository [registry.fly.io/fedoraNN]
5f70bf18a086: Layer already exists 
a49ca4aa5c99: Pushed 
4bf4902510fc: Pushed 
883e787c00d4: Pushed 
deployment-1651561453: digest: sha256:b394b6b075d81f3015462c8b877ac35d362898d80630ca41185357099017c573 size: 1364
--> Pushing image done
image: registry.fly.io/fedoraNN:deployment-1651561453
image size: 554 MB
==> Creating release
--> release v2 created

--> You can detach the terminal anytime without stopping the deployment
==> Monitoring deployment
...

The same Dockerfile fails to build on Fly when using Fedora 35 and newer, already at the trivial “RUN echo…” line:

$ fly deploy --dockerfile Dockerfile.fedora 
==> Verifying app config
--> Verified app config
==> Building image
Remote builder fly-builder-snowy-water-8948 ready
==> Creating build context
--> Creating build context done
==> Building image with Docker
--> docker host: 20.10.12 linux x86_64
Sending build context to Docker daemon  16.62kB
[+] Building 1.6s (5/8)                                                                                            
 => [internal] load remote build context                                                                      0.0s
 => copy /context /                                                                                           0.0s
 => [internal] load metadata for registry.fedoraproject.org/fedora:35                                         1.2s
 => CACHED [builder 1/5] FROM registry.fedoraproject.org/fedora:35@sha256:633c6b70becd9a880ecd91b14f02a22151  0.0s
 => ERROR [builder 2/5] RUN echo "Hi!"                                                                        0.3s
------
 > [builder 2/5] RUN echo "Hi!":
------
Error failed to fetch an image or build from source: error building: executor failed running [/bin/sh -c echo "Hi!"]: exit code: 139

$ 
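Because BUILDER_IMAGE is declared as an ARG before the first FROM, it can be overridden at build time instead of commenting lines in and out. The sketch below prints one local docker build command per Fedora release, which may help tell whether the segfault is specific to the remote builder. The fedora-test tag and the build_cmd helper are made up for illustration, and DRY_RUN defaults to printing only. As far as I know, fly deploy accepts a --build-arg flag of the same form.

```shell
#!/bin/sh
# Print (or run) one docker build per Fedora release, overriding the
# BUILDER_IMAGE build argument instead of editing the Dockerfile.
# build_cmd and the fedora-test tag are hypothetical names.
build_cmd() {
    # $1 = Fedora release number, e.g. 34
    echo "docker build --build-arg BUILDER_IMAGE=registry.fedoraproject.org/fedora:$1 -f Dockerfile.fedora -t fedora-test:$1 ."
}

for release in 34 35; do
    cmd=$(build_cmd "$release")
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Default: only show what would be run.
        echo "$cmd"
    else
        sh -c "$cmd" || echo "build failed for fedora:$release"
    fi
done
```

Run it with DRY_RUN=0 to actually build (this assumes a local Docker daemon).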

A Dockerfile’s ENTRYPOINT or CMD needs to exec a process that runs forever (it shouldn’t just quit, except on errors or interrupts). If it doesn’t, Fly’s init process, which drives the Firecracker-managed VM, will quit too.

All of them address the first rule of programs running in Fly.app: when your entrypoint program exits, our init kills the VM and we start a new one. So at the end of the day, something has to keep running “in the foreground”.

From: Running Multiple Processes Inside A Fly.io App

Ref: docker is exited immediately when runs with error code 139 - Stack Overflow

Good catch, thank you. In the meantime, I have added the example fly-apps/hello-static as a second stage to the multi-stage Dockerfile below.

This confirms that the build phase actually works, e.g. https://fedoraNN.fly.dev/os-release returns the file that was copied from the build stage to the web server in the second stage.

The web server does not exit, but the issue persists with Fedora 35 and newer, as shown earlier with the shorter Dockerfile above.

#ARG BUILDER_IMAGE="registry.fedoraproject.org/fedora:34"
#ARG BUILDER_IMAGE="registry.fedoraproject.org/fedora:35"

ARG BUILDER_IMAGE="amd64/fedora:34"
#ARG BUILDER_IMAGE="amd64/fedora:35"
ARG RUNNER_IMAGE="pierrezemb/gostatic"

FROM scratch
FROM ${BUILDER_IMAGE} AS builder

RUN echo "Hi!"
RUN dnf -y update
RUN dnf -y upgrade
RUN echo "Build done."

RUN echo "check https://fedoraNN.fly.dev/[x|os-release]  Bye."

FROM ${RUNNER_IMAGE} AS webServer
COPY ./public/ /srv/http/
# proof in the pudding :-)
COPY --from=builder /etc/os-release /srv/http/

Note: pierrezemb/gostatic listens on port 8043, so in fly.toml change internal_port from 8080 to 8043 and re-run fly launch --dockerfile Dockerfile.fedora!
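If you would rather script the port change than edit fly.toml by hand, something like the following should work. It is only a sketch: set_internal_port is a made-up helper, and it assumes internal_port appears on its own line with a numeric value.

```shell
#!/bin/sh
# Rewrite the internal_port value in a fly.toml-style file.
# set_internal_port is a hypothetical helper name.
set_internal_port() {
    file=$1
    port=$2
    tmp=$(mktemp) || return 1
    # Keep the "internal_port = " prefix (\1) and swap only the number.
    sed -E "s/^([[:space:]]*internal_port[[:space:]]*=[[:space:]]*)[0-9]+/\1${port}/" "$file" > "$tmp" &&
        cat "$tmp" > "$file"   # write back in place, keeping file permissions
    rm -f "$tmp"
}

# Example: set_internal_port fly.toml 8043
```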

Ah, I see. So the builder never runs even a trivial echo with fedora:35 but does with fedora:34… that’s a curious error.

Are you able to view builder logs to see if there are any hints about the failure?

LOG_LEVEL=debug flyctl logs -a fly-builder-snowy-water-8948

Other than that, I think the docker-engine version (v20.10.12) that Fly’s builder uses may have issues with amd64/fedora:35, though that’s far-fetched since neither Docker’s issue tracker nor its changelog contains anything related.

Btw, fedora:35 is missing /bin/su according to github/fedora-cloud/docker/issues/101.