Elixir/Phoenix Docker builder times out after 10 minutes with an EOF error

Dear Fly.io team, I followed the steps to deploy Elixir/Phoenix using the generated Dockerfile. The initial fly launch worked, but the following command fails:

fly deploy --remote-only
==> Verifying app config
--> Verified app config
==> Building image
Remote builder fly-builder-cold-sunset-7939 ready
==> Creating build context
--> Creating build context done
==> Building image with Docker
--> docker host: 20.10.12 linux x86_64
[+] Building 598.0s (0/1)
ERRO[0603] Can't add file /Users/rsas/elixir/sixemojistories/.elixir_ls/build/_test/lib/ecto/ebin/Elixir.Ecto.UUID.beam to tar: io: read/write on closed pipe
ERRO[0603] Can't close tar writer: io: read/write on closed pipe
 => [internal] load remote build context                                   598.0s
Error failed to fetch an image or build from source: error building: unexpected EOF

I can reproduce the failure every time I run the command.

The missing file is a development artifact and probably shouldn’t be part of the deployment.

Can you post your Dockerfile? Perhaps something there is causing the .elixir_ls dir to be included in the build and can be removed to fix the remote build problem.

# Find eligible builder and runner images on Docker Hub. We use Ubuntu/Debian instead of
# Alpine to avoid DNS resolution issues in production.
#
# https://hub.docker.com/r/hexpm/elixir/tags?page=1&name=ubuntu
# https://hub.docker.com/_/ubuntu?tab=tags
#
#
# This file is based on these images:
#
#   - https://hub.docker.com/r/hexpm/elixir/tags - for the build image
#   - https://hub.docker.com/_/debian?tab=tags&page=1&name=bullseye-20210902-slim - for the release image
#   - https://pkgs.org/ - resource for finding needed packages
#   - Ex: hexpm/elixir:1.13.1-erlang-24.2-debian-bullseye-20210902-slim
#
ARG BUILDER_IMAGE="hexpm/elixir:1.13.1-erlang-24.2-debian-bullseye-20210902-slim"
ARG RUNNER_IMAGE="debian:bullseye-20210902-slim"

FROM ${BUILDER_IMAGE} as builder

# install build dependencies
RUN apt-get update -y && apt-get install -y build-essential git \
    && apt-get clean && rm -f /var/lib/apt/lists/*_*

# prepare build dir
WORKDIR /app

# install hex + rebar
RUN mix local.hex --force && \
    mix local.rebar --force

# set build ENV
ENV MIX_ENV="prod"

# install mix dependencies
COPY mix.exs mix.lock ./
RUN mix deps.get --only $MIX_ENV
RUN mkdir config

# copy compile-time config files before we compile dependencies
# to ensure any relevant config change will trigger the dependencies
# to be re-compiled.
COPY config/config.exs config/${MIX_ENV}.exs config/
RUN mix deps.compile

COPY priv priv

# note: if your project uses a tool like https://purgecss.com/,
# which customizes asset compilation based on what it finds in
# your Elixir templates, you will need to move the asset compilation
# step down so that `lib` is available.
COPY assets assets

# compile assets
RUN mix assets.deploy

# Compile the release
COPY lib lib

RUN mix compile

# Changes to config/runtime.exs don't require recompiling the code
COPY config/runtime.exs config/

COPY rel rel
RUN mix release

# start a new build stage so that the final image will only contain
# the compiled release and other runtime necessities
FROM ${RUNNER_IMAGE}

RUN apt-get update -y && apt-get install -y libstdc++6 openssl libncurses5 locales \
  && apt-get clean && rm -f /var/lib/apt/lists/*_*

# Set the locale
RUN sed -i '/en_US.UTF-8/s/^# //g' /etc/locale.gen && locale-gen

ENV LANG en_US.UTF-8
ENV LANGUAGE en_US:en
ENV LC_ALL en_US.UTF-8

WORKDIR "/app"
RUN chown nobody /app

# Only copy the final release from the build stage
COPY --from=builder --chown=nobody:root /app/_build/prod/rel/sixemojistories ./

USER nobody

CMD /app/bin/server
# Appended by flyctl
ENV ECTO_IPV6 true
ENV ERL_AFLAGS "-proto_dist inet6_tcp"

Two ideas come to mind:

  1. Add .elixir_ls to a .dockerignore file.
  2. Confirm the path in the final COPY command is valid when doing the remote build. (edit: I think it’s fine.)
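
A minimal sketch of the first idea, assuming the project root is the build context and that the Dockerfile needs none of the listed directories (the one posted above only copies mix.exs, mix.lock, config, priv, assets, lib and rel). The helper name add_dockerignore_entries is made up for illustration, and every entry other than .elixir_ls/ is a suggestion rather than something confirmed in this thread:

```shell
#!/usr/bin/env bash
# Hypothetical helper: append entries to ./.dockerignore, skipping any
# entry that is already listed so the script can be re-run safely.
add_dockerignore_entries() {
  local file=".dockerignore" entry
  touch "$file"
  # If the file is non-empty and lacks a trailing newline, add one first
  # so the next entry does not get glued onto the last existing line.
  if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
    printf '\n' >> "$file"
  fi
  for entry in "$@"; do
    grep -qxF -- "$entry" "$file" || printf '%s\n' "$entry" >> "$file"
  done
}

# .elixir_ls/ is the directory named in the tar error; the rest are
# assumptions about what a Phoenix project can leave out of the context.
add_dockerignore_entries ".elixir_ls/" "_build/" "deps/" ".git/"
```

This only keeps editor and build artifacts out of the upload. As the following posts show, it may not be enough on its own if the connection to the remote builder is the bottleneck.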

I updated the .dockerignore file. Now I’m getting the timeout, but no additional information:

➜ sixemojistories git:(main) ✗ fly launch
An existing fly.toml file was found for app sixemojistories
App is not running, deploy...
Deploying sixemojistories
==> Validating app configuration
--> Validating app configuration done
Services
TCP 80/443 ⇢ 8080
Remote builder fly-builder-cold-sunset-7939 ready
==> Creating build context
--> Creating build context done
==> Building image with Docker
--> docker host: 20.10.12 linux x86_64
[+] Building 0.0s (0/1)
[+] Building 598.0s (0/1)
 => [internal] load remote build context                                   598.0s
Error error building: unexpected EOF

I had the same problem: When you use ElxirLS the deployment fails by just running the getting started guide by davidvanleeuwen · Pull Request #43 · superfly/docs · GitHub

It’s not ElixirLS, but it has to do with Apple M1. @rsas are you running this on an Apple M1?

The workaround for me is to just let GitHub Actions deploy it.

cc @jsierles

Thanks @noegip! Yes, I am on Apple M1 and your problem looks identical to mine.

Is the GitHub Actions workaround documented somewhere? Thanks a lot!

It’s a lame workaround fyi, but it allows you to continue working on your project: Continuous Deployment with Fly and GitHub Actions

I have no clue what the problem is, but hopefully someone smarter than me can look into it.

Don’t know if this’ll help but in the thread referred to in the GH link, it suggests using
$ DOCKER_BUILDKIT=1 fly deploy ...
Do you think it’s worth trying with that, if it’s applicable?

Didn’t work for me :(

Didn’t work for me either. I experimented with all possible options for more than 4 hours (different regions, fresh apps from scratch, etc.). About one time in 50 the --remote-only option starts working; in all other cases nothing happens and there is no log output, only the build time increasing.

Here is the usual output when nothing happens. It would be great to have some sort of timeout flag; otherwise I have to nuke the hanging build with kill -9:

➜ sixemojistories git:(main) ✗ fly launch
An existing fly.toml file was found for app crimson-darkness-7079
? Would you like to copy its configuration to the new app? Yes
Creating app in /Users/rsas/elixir/sixemojistories
Scanning source code
Detected a Dockerfile app
? App Name (leave blank to use an auto-generated name):
Automatically selected personal organization: Raimondas Sasnauskas
? Select region: fra (Frankfurt, Germany)
Created app broken-surf-6799 in organization personal
Wrote config file fly.toml
? Would you like to deploy now? No
Your app is ready. Deploy with `flyctl deploy`
➜ sixemojistories git:(main) ✗ LOG_LEVEL=debug DOCKER_BUILDKIT=1 fly deploy --remote-only
DEBUG Loaded flyctl config from/Users/rsas/.fly/config.yml
DEBUG determined hostname: "MacBook"
DEBUG determined working directory: "/Users/rsas/elixir/sixemojistories"
DEBUG determined user home directory: "/Users/rsas"
DEBUG determined config directory: "/Users/rsas/.fly"
DEBUG cache loaded.
DEBUG config initialized.
DEBUG initialized task manager.
DEBUG skipped querying for new release
DEBUG client initialized.
DEBUG app config loaded from /Users/rsas/elixir/sixemojistories/fly.toml
==> Verifying app config
--> Verified app config
==> Building image
DEBUG trying remote docker daemon
DEBUG Reporting buildDEBUG --> POST https://api.fly.io/graphql {{"query":"mutation($input: StartSourceBuildInput!) { startSourceBuild(input: $input) { sourceBuild { id } } }","variables":{"input":{"appId":"broken-surf-6799"}}}
}
DEBUG <-- 200 https://api.fly.io/graphql (1.27s) {"errors":[{"message":"StartSourceBuildInput isn't a defined input type (on $input)","locations":[{"line":1,"column":10}],"path":["mutation"],"extensions":{"code":"variableRequiresValidType","typeName":"StartSourceBuildInput","variableName":"input"}},{"message":"Field 'startSourceBuild' doesn't exist on type 'Mutations'","locations":[{"line":1,"column":44}],"path":["mutation","startSourceBuild"],"extensions":{"code":"undefinedField","typeName":"Mutations","fieldName":"startSourceBuild"}},{"message":"Variable $input is declared by anonymous mutation but not used","locations":[{"line":1,"column":1}],"path":["mutation"],"extensions":{"code":"variableNotUsed","variableName":"input"}}]}
DEBUG Failed storing buildDEBUG Trying 'Buildpacks' strategy
DEBUG no buildpack builder configured, skipping
DEBUG result image:<nil> error:<nil>
DEBUG Trying 'Dockerfile' strategy
DEBUG --> POST https://api.fly.io/graphql {{"query":"mutation($input: EnsureMachineRemoteBuilderInput!) { ensureMachineRemoteBuilder(input: $input) { machine { id state ips { nodes { family kind ip } } }, app { name organization { slug } } } }","variables":{"input":{"appName":"broken-surf-6799","organizationId":null}}}
}
DEBUG <-- 200 https://api.fly.io/graphql (2.15s) {"data":{"ensureMachineRemoteBuilder":{"machine":{"id":"6733ae29","state":"starting","ips":{"nodes":[{"family":"v6","kind":"privatenet","ip":"fdaa:0:43a4:a7b:23c4:0:7cb4:2"},{"family":"v6","kind":"public","ip":"2604:1380:4091:3601::7cb4:3"},{"family":"v4","kind":"private","ip":"172.19.2.162"}]}},"app":{"name":"fly-builder-twilight-star-9444","organization":{"slug":"personal"}}}}}
DEBUG checking ip &{Family:v6 Kind:privatenet IP:fdaa:0:43a4:a7b:23c4:0:7cb4:2 MaskSize:0}
Waiting for remote builder fly-builder-twilight-star-9444... connecting ⣾ DEBUG --> POST https://api.fly.io/graphql {{"query":"query ($appName: String!) { app(name: $appName) { id name hostname deployed status version appUrl config { definition } organization { id slug } services { description protocol internalPort ports { port handlers } } ipAddresses { nodes { id address type createdAt } } imageDetails { repository version } } }","variables":{"appName":"broken-surf-6799"}}
}
Waiting for remote builder fly-builder-twilight-star-9444... connecting ⣻ DEBUG <-- 200 https://api.fly.io/graphql (610.69ms) {"data":{"app":{"id":"broken-surf-6799","name":"broken-surf-6799","hostname":"broken-surf-6799.fly.dev","deployed":false,"status":"pending","version":0,"appUrl":null,"config":{"definition":{"kill_timeout":5,"kill_signal":"SIGINT","processes":[],"experimental":{"allowed_public_ports":[],"auto_rollback":true},"services":[{"processes":["app"],"protocol":"tcp","internal_port":8080,"concurrency":{"soft_limit":20,"hard_limit":25,"type":"connections"},"ports":[{"port":80,"handlers":["http"]},{"port":443,"handlers":["tls","http"]}],"tcp_checks":[{"interval":"15s","timeout":"2s","grace_period":"1s","restart_limit":0}],"http_checks":[],"script_checks":[]}],"env":{}}},"organization":{"id":"G1B7JG6wPDBq0I8b8yRl4q9RlQUPkK","slug":"personal"},"services":[{"description":"TCP 80/443 ⇢ 8080","protocol":"TCP","internalPort":8080,"ports":[{"port":80,"handlers":["HTTP"]},{"port":443,"handlers":["TLS","HTTP"]}]}],"ipAddresses":{"nodes":[]},"imageDetails":{"repository":"unknown","version":"unknown"}}}}
DEBUG --> POST https://api.fly.io/graphql {{"query":"mutation($input: ValidateWireGuardPeersInput!) { validateWireGuardPeers(input: $input) { invalidPeerIps } }","variables":{"input":{"peerIps":["fdaa:0:43a4:a7b:163a:0:a:2"]}}}
}
DEBUG <-- 200 https://api.fly.io/graphql (131.13ms) {"data":{"validateWireGuardPeers":{"invalidPeerIps":[]}}}
Waiting for remote builder fly-builder-twilight-star-9444... connecting ⡿ DEBUG Remote builder available, but pinging again in 200ms to be sure
Waiting for remote builder fly-builder-twilight-star-9444... connecting ⣟ DEBUG Remote builder available, but pinging again in 234.601696ms to be sure
Waiting for remote builder fly-builder-twilight-star-9444... connecting ⣯ DEBUG Remote builder available, but pinging again in 268.734918ms to be sure
Waiting for remote builder fly-builder-twilight-star-9444... connecting ⣷ DEBUG Remote builder available, but pinging again in 224.582463ms to be sure
Waiting for remote builder fly-builder-twilight-star-9444... connecting ⣾ DEBUG Remote builder is ready to build!
Remote builder fly-builder-twilight-star-9444 ready
==> Creating build context
--> Creating build context done
DEBUG fetching docker server info
==> Building image with Docker
--> docker host: 20.10.12 linux x86_64
DEBUG buildkitEnabled%!!(MISSING)(EXTRA bool=true)Sending build context to Docker daemon  324.6kB
[+] Building 15.7s (0/1)
 => [internal] load remote build context                                                                                                                         15.7s
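
Until there is a timeout flag, a hanging build like the one above can be bounded from the shell. This is only a sketch: run_with_timeout is a hypothetical helper, not part of flyctl, and the five-second grace period before SIGKILL is an arbitrary choice. It sends SIGTERM once the limit is reached and escalates to SIGKILL, since this thread reports needing kill -9:

```shell
#!/usr/bin/env bash
# run_with_timeout SECONDS COMMAND [ARGS...]
# Runs COMMAND in the background, sends SIGTERM after SECONDS and
# SIGKILL five seconds later if the process is still alive.
# Returns COMMAND's exit status (143 after SIGTERM, 137 after SIGKILL).
run_with_timeout() {
  local limit="$1"; shift
  local cmd_pid watchdog_pid status

  "$@" &
  cmd_pid=$!

  # Watchdog subshell: output is discarded so it never holds the
  # caller's stdout or stderr open.
  (
    sleep "$limit"
    kill -TERM "$cmd_pid" 2>/dev/null
    sleep 5
    kill -KILL "$cmd_pid" 2>/dev/null
  ) >/dev/null 2>&1 &
  watchdog_pid=$!

  wait "$cmd_pid"
  status=$?

  # The command has finished (or was killed); stop the watchdog so it
  # cannot signal an unrelated process later.
  kill "$watchdog_pid" 2>/dev/null
  return "$status"
}

# Example (15-minute limit, an assumed value):
#   run_with_timeout 900 fly deploy --remote-only
```

Because the command runs as a background job, it cannot read interactive prompts from the terminal, so this suits fly deploy better than fly launch.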

You might have better luck with Remote Docker. This looks like it might be a network issue.

If you install Docker, you can run fly deploy --local-only to force it to use local Docker. Would you mind giving that a try?

Thanks @kurt, the problem with --local-only is a known M1 qemu bug in Docker:

➜ sixemojistories git:(main) ✗ fly deploy --local-only
==> Verifying app config
--> Verified app config
==> Building image
==> Creating build context
--> Creating build context done
==> Building image with Docker
--> docker host: 20.10.12 linux aarch64
[+] Building 0.0s (0/1)
[+] Building 1.9s (13/28)
 => CACHED [internal] load remote build context                                                                                                                   0.0s
 => CACHED copy /context /                                                                                                                                        0.0s
 => [internal] load metadata for docker.io/library/debian:bullseye-20210902-slim                                                                                  1.2s
 => [internal] load metadata for docker.io/hexpm/elixir:1.13.2-erlang-24.2-debian-bullseye-20210902-slim                                                          1.2s
 => [builder  1/17] FROM docker.io/hexpm/elixir:1.13.2-erlang-24.2-debian-bullseye-20210902-slim@sha256:02bb3440cc47588433765e1f76cea97c389771a96145df1ead00c639  0.0s
 => [stage-1 1/6] FROM docker.io/library/debian:bullseye-20210902-slim@sha256:e3ed4be20c22a1358020358331d177aa2860632f25b21681d79204ace20455a6                    0.0s
 => CACHED [stage-1 2/6] RUN apt-get update -y && apt-get install -y libstdc++6 openssl libncurses5 locales   && apt-get clean && rm -f /var/lib/apt/lists/*_*    0.0s
 => CACHED [stage-1 3/6] RUN sed -i '/en_US.UTF-8/s/^# //g' /etc/locale.gen && locale-gen                                                                         0.0s
 => CACHED [stage-1 4/6] WORKDIR /app                                                                                                                             0.0s
 => CACHED [stage-1 5/6] RUN chown nobody /app                                                                                                                    0.0s
 => CACHED [builder  2/17] RUN apt-get update -y && apt-get install -y build-essential git     && apt-get clean && rm -f /var/lib/apt/lists/*_*                   0.0s
 => CACHED [builder  3/17] WORKDIR /app                                                                                                                           0.0s
 => ERROR [builder  4/17] RUN mix local.hex --force &&     mix local.rebar --force                                                                                0.6s
------
 > [builder  4/17] RUN mix local.hex --force &&     mix local.rebar --force:
#13 0.615 qemu: uncaught target signal 11 (Segmentation fault) - core dumped
#13 0.621 Segmentation fault
------
Error failed to fetch an image or build from source: error building: executor failed running [/bin/sh -c mix local.hex --force &&     mix local.rebar --force]: exit code: 139

Until this is fixed, the only option I see is to install a Linux VM on my Mac and run Docker locally from there.

It would be nice to understand what prevents Remote Docker from building, because it worked a few times.

It looks like network timeouts between you and the Docker image. These are incredibly difficult to debug. What I would do is:

Look at ~/.fly/config.yml and see which region we’re using to connect you to wireguard. You’ll see something like this:

wire_guard_state:
    fizz:
        org: fizz
        name: interactive-Kurts-MacBook-Pro-kurt-fly-io-606
        region: sea

Is that region close to you? If it’s not, that will hurt the performance. We’ve seen some instances of people getting regions that are wildly far away from them.
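
One way to pull just the region out of that file, as a rough sketch: show_wireguard_regions is a hypothetical helper, and it assumes the config keeps the layout shown above (a top-level wire_guard_state key with indented region lines underneath).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every "region:" value found under the
# top-level wire_guard_state key of a flyctl config file.
show_wireguard_regions() {
  local config="$1"
  if [ ! -f "$config" ]; then
    echo "config file not found: $config" >&2
    return 1
  fi
  awk '
    /^wire_guard_state:/     { in_wg = 1; next }  # entering the section
    /^[^[:space:]]/          { in_wg = 0 }        # any other top-level key ends it
    in_wg && /^[[:space:]]+region:/ { print $2 }
  ' "$config"
}

# Do not fail the whole script if flyctl has not written a config yet.
show_wireguard_regions "$HOME/.fly/config.yml" || true
```

For the example config above this would print sea.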

Also check your directory size with du -sh. You can work around some of this by sending less data to Docker.
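
A small sketch of that check, which also lists the largest top-level entries (hidden ones such as .elixir_ls included) so it is easier to see what to exclude. build_context_report is a made-up helper name. Note that du measures the directory on disk; the "Sending build context" figure should be smaller once .dockerignore exclusions apply.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the total size of a directory, then its ten
# largest top-level entries in kilobytes, largest first.
build_context_report() {
  local dir="${1:-.}"
  du -sh "$dir"
  # -k gives plain kilobyte counts so sort -n orders them correctly on
  # both macOS and Linux; find includes dot-directories.
  find "$dir" -mindepth 1 -maxdepth 1 -exec du -sk {} + | sort -rn | head -n 10
}

build_context_report .
```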

The config shows fra with endpointip fra1.gateway.6pn.dev, which is ~100 miles away from my place.

I deleted the .fly folder and reinstalled flyctl from scratch. Same behaviour; even the connection to the builder takes a while. The directory size is 75M, but the output shows only a few kB being sent:

Sending build context to Docker daemon  324.6kB
[+] Building 29.4s (0/1)
 => [internal] load remote build context

I’m not sure whether it’s only me or whether all users close to the fra region experience this with the Remote Builders.

Is there a way to force a different region for building, say France or Amsterdam?

I have the ams region, which is in a different part of the EU from me, and I have the same kind of issue on my MacBook M1. I added .elixir_ls to .dockerignore because the error mentioned that folder, but now I see this:

fly deploy
==> Verifying app config
--> Verified app config
==> Building image
Remote builder fly-builder-black-lake-6794 ready
==> Creating build context
--> Creating build context done
==> Building image with Docker
--> docker host: 20.10.12 linux x86_64
[+] Building 597.6s (0/1)
 => [internal] load remote build context                                                                                                                                        597.6s
[+] Building 597.7s (0/1)
 => [internal] load remote build context                                                                                                                                        597.7s
Error failed to fetch an image or build from source: error building: unexpected EOF

I also don’t understand why the build is so slow: it’s a simple app, yet it spends almost 600 seconds building.

The code is here; it’s a simple app for a meetup presentation.

GitHub Actions passed when I updated that .dockerignore.

UPDATE:
I switched my connection to mobile 5G and it works 10x faster, even from the M1 MacBook. I also moved the app from the AMS location to FRA.