Deployment issues

I have been unable to deploy today. The following errors occurred just now, even though the Status page says Fly systems are operational again.

With --local-only:

Update available 0.0.429 -> 0.0.438.
Run "flyctl version update" to upgrade.
==> Verifying app config
--> Verified app config
==> Building image
==> Creating build context
--> Creating build context done
==> Building image with Docker
--> docker host: 20.10.20 linux aarch64
Sending build context to Docker daemon    256kB
[+] Building 1.3s (13/26)
 => [internal] load remote build context                                                                            0.0s
 => copy /context /                                                                                                 0.1s
 => [internal] load metadata for docker.io/library/debian:bullseye-20220801-slim                                    0.3s
 => [internal] load metadata for docker.io/hexpm/elixir:1.14.0-erlang-25.0.4-debian-bullseye-20220801-slim          0.3s
 => [builder  1/15] FROM docker.io/hexpm/elixir:1.14.0-erlang-25.0.4-debian-bullseye-20220801-slim@sha256:cf8adcc0  0.0s
 => [stage-1 1/6] FROM docker.io/library/debian:bullseye-20220801-slim@sha256:a811e62769a642241b168ac34f615fb02da8  0.0s
 => CACHED [stage-1 2/6] RUN apt-get update -y && apt-get install -y libstdc++6 openssl libncurses5 locales   && a  0.0s
 => CACHED [stage-1 3/6] RUN sed -i '/en_US.UTF-8/s/^# //g' /etc/locale.gen && locale-gen                           0.0s
 => CACHED [stage-1 4/6] WORKDIR /app                                                                               0.0s
 => CACHED [stage-1 5/6] RUN chown nobody /app                                                                      0.0s
 => CACHED [builder  2/15] RUN apt-get update -y && apt-get install -y build-essential git     && apt-get clean &&  0.0s
 => CACHED [builder  3/15] WORKDIR /app                                                                             0.0s
 => ERROR [builder  4/15] RUN mix local.hex --force &&     mix local.rebar --force                                  0.8s
------
 > [builder  4/15] RUN mix local.hex --force &&     mix local.rebar --force:
#13 0.798 qemu: uncaught target signal 11 (Segmentation fault) - core dumped
#13 0.815 Segmentation fault
------
Error failed to fetch an image or build from source: error building: executor failed running [/bin/sh -c mix local.hex --force &&     mix local.rebar --force]: exit code: 139


FAIL: 1

Using remote builder:

Update available 0.0.429 -> 0.0.438.
Run "flyctl version update" to upgrade.
==> Verifying app config
--> Verified app config
==> Building image
Remote builder fly-builder-patient-dawn-2568 ready
==> Creating build context
--> Creating build context done
==> Building image with Docker
--> docker host: 20.10.12 linux x86_64
Sending build context to Docker daemon  58.08kB
[+] Building 936.6s (1/1) FINISHED
 => ERROR [internal] load remote build context                                                                    936.6s
------
 > [internal] load remote build context:
------
Error failed to fetch an image or build from source: error building: error during connect: Post "http://[fdaa:0:76d6:a7b:93:acaa:3cbd:2]:2375/v1.41/build?buildargs=null&buildid=upload-request%3Aba0d4bcb5aa838bfe664b0ead9e0a37f7ba63095328731c8d687b0b49e0ad087&cachefrom=null&cgroupparent=&cpuperiod=0&cpuquota=0&cpusetcpus=&cpusetmems=&cpushares=0&dockerfile=&labels=null&memory=0&memswap=0&networkmode=&rm=0&shmsize=0&target=&ulimits=null&version=2": EOF


FAIL: 1

Please note that the errors persist even after I updated flyctl to v0.0.439.

It’s been almost a week. Can anyone from the Fly team provide an update?

That first message is QEMU crashing on your local system. Probably because you’re emulating x86 on an ARM Mac?
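One way to confirm that emulation is involved is to compare the Docker host architecture from the build log (`linux aarch64` above) with the architecture the image is built for. Here is a rough sketch, assuming the target is amd64 (the log does not state this, so treat it as a guess). The function names `normalize_arch` and `needs_emulation` are made up for illustration.

```shell
# Map common architecture aliases to a single spelling.
normalize_arch() {
  case "$1" in
    aarch64|arm64) echo arm64 ;;
    x86_64|amd64)  echo amd64 ;;
    *)             echo "$1" ;;
  esac
}

# Print "yes" when building for the target on this host would need emulation.
# $1 = host architecture, $2 = target architecture (assumed, not taken from the log).
needs_emulation() {
  host=$(normalize_arch "$1")
  target=$(normalize_arch "$2")
  if [ "$host" = "$target" ]; then
    echo no
  else
    echo yes
  fi
}
```

For example, `needs_emulation "$(uname -m)" amd64` prints `yes` on an ARM machine, which would be consistent with the QEMU segfault in the first log.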

The second looks like a timeout connecting to the remote Docker daemon. If it takes a while to fail, it’s likely just slow internet. If it fails quickly, it might be an issue with your WireGuard connection.
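That rule of thumb can be applied to a saved build log by reading the elapsed time from the `[+] Building ...s` summary line. This is only a sketch: the 60-second cutoff is an arbitrary assumption, and `build_seconds`, `classify_failure` and `THRESHOLD` are made-up names, not part of flyctl.

```shell
# Extract the elapsed seconds from a summary line such as
# "[+] Building 936.6s (1/1) FINISHED" (reads the log on stdin).
build_seconds() {
  sed -n 's/^\[+] Building \([0-9][0-9.]*\)s.*/\1/p' | tail -n 1
}

# Classify a failed build by how long it ran before failing.
# THRESHOLD (seconds) is an illustrative cutoff, not an official value.
classify_failure() {
  secs=$(build_seconds)
  if [ -z "$secs" ]; then
    echo unknown
    return
  fi
  awk -v s="$secs" -v t="${THRESHOLD:-60}" \
    'BEGIN { if (s + 0 > t + 0) print "slow-upload"; else print "check-wireguard" }'
}
```

Piping the remote-builder log above through `classify_failure` would report `slow-upload`, since that build ran for 936.6 seconds before the EOF.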

Try running fly doctor -a <app> and see if everything is healthy. Then you can try fly wg reset <org> to get a fresh private network connection going.
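If you want to script that check, the `fly doctor` output can be scanned for anything other than `PASSED`. A minimal sketch, assuming the output keeps the `... PASSED` line format shown in this thread; `doctor_healthy` is a hypothetical helper name.

```shell
# Read `fly doctor` output on stdin. Succeed only if every check line
# ("<description>... <result>") ends in PASSED and the generic
# "Oops, something went wrong" message does not appear.
doctor_healthy() {
  awk '
    /\.\.\. / && !/PASSED$/       { bad++ }
    /^Oops, something went wrong/ { bad++ }
    END { exit (bad > 0 ? 1 : 0) }
  '
}
```

Usage would look something like `fly doctor -a <app> | doctor_healthy || fly wg reset <org>`, which only resets the WireGuard connection when a check did not pass.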

fly logs -a fly-builder-patient-dawn-2568 may help, too. In rare circumstances Docker builds can crash the remote daemon. This will sometimes log fast enough to see what happened, but not always.

$ fly doctor -a ims-api-prod                                         [12:03:58]
Testing authentication token... PASSED
Testing flyctl agent... PASSED
Testing local Docker instance... PASSED
Pinging WireGuard gateway (give us a sec)... PASSED

App specific checks for ims-api-prod:
Checking that app has ip addresses allocated... PASSED
Checking A record for ims-api-prod.fly.dev... PASSED
Checking AAAA record for ims-api-prod.fly.dev... PASSED
Oops, something went wrong! Could you try that again?
FAIL: 3

I will try again after resetting WireGuard.

I’m seeing this too. We have a deployment that’s been running for 47 minutes.

$ fly doctor -a production
Testing authentication token... PASSED
Testing flyctl agent... PASSED
Testing local Docker instance... PASSED
Pinging WireGuard gateway (give us a sec)... PASSED

App specific checks for production:
Checking that app has ip addresses allocated... PASSED
Checking AAAA record for production.fly.dev... PASSED
Oops, something went wrong! Could you try that again?

Also our logs are full of:

2022-12-19T17:20:38Z app[cb604423] iad [info]17:20:38.366 [warning] [libcluster:fly6pn] unable to connect to :"APP@IP"

The “something went wrong” error should be fixed in v0.0.441.

As for the rest, I’m not sure yet.

$ fly version
fly v0.0.441 darwin/arm64 Commit: 7a9ec7be BuildDate: 2022-12-16T17:47:35Z
$ fly doctor -a production
Testing authentication token... PASSED
Testing flyctl agent... PASSED
Testing local Docker instance... PASSED
Pinging WireGuard gateway (give us a sec)... PASSED

App specific checks for production:
Checking that app has ip addresses allocated... PASSED
Checking AAAA record for production.fly.dev... PASSED
Oops, something went wrong! Could you try that again?

After resetting WireGuard, I’m seeing the same results.

I was tailing the logs for the builder and just see the following over and over again as it takes FOREVER to build:

2022-12-19T18:46:34Z app[6e82e07a735487] iad [info]time="2022-12-19T18:46:34.628132967Z" level=debug msg="checking docker activity"
2022-12-19T18:46:34Z app[6e82e07a735487] iad [info]time="2022-12-19T18:46:34.628503447Z" level=debug msg="Calling GET /v1.41/containers/json?filters=%7B%22status%22%3A%7B%22running%22%3Atrue%7D%7D&limit=0"
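Since those debug lines repeat constantly, it can help to filter them out while tailing so that anything else stands out. A small sketch, assuming the daemon keeps tagging these lines with `level=debug` as in the output above; `filter_builder_logs` is a made-up name.

```shell
# Drop the Docker daemon's level=debug lines from log output on stdin,
# leaving warnings, errors and anything without a level tag.
# Note: grep exits non-zero when every line was filtered out.
filter_builder_logs() {
  grep -v 'level=debug'
}
```

For example: `fly logs -a fly-builder-patient-dawn-2568 | filter_builder_logs`.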

My issue was solved by doing a combination of fly wg reset helium (to get a fresh private network connection going) and fly wg websockets enable (apparently helps with some firewall issues I may be having). Hope this helps others!
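For anyone who wants to repeat those two steps, here is a small sketch that wraps them with a dry-run switch. The `run` and `refresh_wireguard` helpers and the `DRY_RUN` variable are made up for illustration; only the two `fly wg` commands come from the post above, and the organization name is whatever yours is (`helium` in my case).

```shell
# Print each command, and execute it unless DRY_RUN=1.
run() {
  echo "+ $*"
  [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# Reset the WireGuard peer for the given organization, then switch the
# connection to websockets. $1 = organization slug (placeholder).
refresh_wireguard() {
  org=$1
  run fly wg reset "$org" &&
    run fly wg websockets enable
}
```

Running `DRY_RUN=1 refresh_wireguard helium` only prints the two commands. Without `DRY_RUN` they are executed in order, and the second is skipped if the reset fails.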