Hi everyone,
One of my environments failed to deploy with the message "Failed due to unhealthy allocations". I read in other threads that this can be caused by a port misconfiguration, but I doubt that's the case here: I used the same configuration as my other environments, and they deploy fine.
It seems more related to a DNS access error, but I'm not sure how to resolve it, and the error message isn't very clear:
2022-08-18T07:19:57Z [info]07:19:57.199 [warning] [libcluster:fly6pn] unable to connect to :"encheres-immo-beta@fdaa:0:33f5:a7b:b9b8:70b9:e805:2"
2022-08-18T07:20:02Z [info]07:20:02.200 [debug] [libcluster:fly6pn] polling dns for 'encheres-immo-beta.internal'
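For context, clustering is set up with libcluster's DNS polling strategy. Here is roughly what it looks like in config/runtime.exs (a minimal sketch reconstructed from the [libcluster:fly6pn] tag and the encheres-immo-beta.internal query in the logs; the polling interval and the FLY_APP_NAME fallback are assumptions):

app_name = System.get_env("FLY_APP_NAME") || "encheres-immo-beta"

config :libcluster,
  topologies: [
    fly6pn: [
      strategy: Cluster.Strategy.DNSPoll,
      config: [
        # how often to re-resolve the .internal name (assumed default)
        polling_interval: 5_000,
        # resolves to the 6PN addresses of all instances of the app
        query: "#{app_name}.internal",
        # node basename, i.e. the part before the @ in the node name
        node_basename: app_name
      ]
    ]
  ]

If I read the logs correctly, the DNS polling itself succeeds; it's the distribution connection to encheres-immo-beta@fdaa:0:33f5:a7b:b9b8:70b9:e805:2 (the node on the old instance 70b9e805, still running v9) that keeps failing.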
Here are the full logs:
==> Verifying app config
--> Verified app config
==> Building image
WARN Error connecting to local docker daemon: Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/_ping": dial unix /var/run/docker.sock: connect: permission denied
Remote builder fly-builder-restless-haze-8424 ready
==> Creating build context
--> Creating build context done
==> Building image with Docker
--> docker host: 20.10.12 linux x86_64
[+] Building 10.1s (0/1)
[+] Building 1.8s (28/28) FINISHED
=> CACHED [internal] load remote build context 0.0s
=> CACHED copy /context / 0.0s
=> [internal] load metadata for docker.io/library/alpine:3.15.3 1.7s
=> [internal] load metadata for docker.io/hexpm/elixir:1.13.4-erlang-24.3.4-alpine 1.7s
=> [build 1/17] FROM docker.io/hexpm/elixir:1.13.4-erlang-24.3.4-alpine-3.15.3@sh 0.0s
=> [app 1/6] FROM docker.io/library/alpine:3.15.3@sha256:f22945d45ee2eb4dd463ed5a4 0.0s
=> CACHED [app 2/6] RUN apk add --no-cache libstdc++ openssl ncurses-libs imagemag 0.0s
=> CACHED [app 3/6] RUN apk --no-cache add msttcorefonts-installer fontconfig && 0.0s
=> CACHED [app 4/6] WORKDIR /app 0.0s
=> CACHED [app 5/6] RUN chown nobody:nobody /app 0.0s
=> CACHED [build 2/17] RUN apk add --no-cache build-base npm git 0.0s
=> CACHED [build 3/17] WORKDIR /app 0.0s
=> CACHED [build 4/17] RUN mix local.hex --force && mix local.rebar --force 0.0s
=> CACHED [build 5/17] COPY mix.exs mix.lock ./ 0.0s
=> CACHED [build 6/17] COPY config/config.exs config/prod.exs config/ 0.0s
=> CACHED [build 7/17] RUN mix deps.get --only prod && mix deps.compile 0.0s
=> CACHED [build 8/17] COPY assets/package.json assets/package-lock.json ./assets 0.0s
=> CACHED [build 9/17] RUN npm --prefix ./assets ci --progress=false --no-audit - 0.0s
=> CACHED [build 10/17] COPY priv priv 0.0s
=> CACHED [build 11/17] COPY lib lib 0.0s
=> CACHED [build 12/17] COPY assets assets 0.0s
=> CACHED [build 13/17] RUN mix assets.deploy 0.0s
=> CACHED [build 14/17] RUN mix compile 0.0s
=> CACHED [build 15/17] COPY config/runtime.exs config/ 0.0s
=> CACHED [build 16/17] COPY rel rel 0.0s
=> CACHED [build 17/17] RUN mix release 0.0s
=> CACHED [app 6/6] COPY --from=build --chown=nobody:nobody /app/_build/prod/rel/e 0.0s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:7f6cb87fc04d56300b4030a494711654f5ebeeeec87f71856c6edaa 0.0s
=> => naming to registry.fly.io/encheres-immo-beta:deployment-1660807007 0.0s
--> Building image done
==> Pushing image to fly
The push refers to repository [registry.fly.io/encheres-immo-beta]
55132ca85db4: Layer already exists
f4a292c3c3d6: Layer already exists
10dc836ba99a: Layer already exists
922c23e78fac: Layer already exists
a5ef1b54ee81: Layer already exists
a1c01e366b99: Layer already exists
deployment-1660807007: digest: sha256:875364c51e687cdb769d3fd3b36b6fd55bf5bad1dc2a727d4cbef131c4cb610a size: 1577
--> Pushing image done
image: registry.fly.io/encheres-immo-beta:deployment-1660807007
image size: 234 MB
==> Creating release
--> release v14 created
--> You can detach the terminal anytime without stopping the deployment
==> Release command detected: /app/bin/migrate
--> This release will not be available until the release command succeeds.
Starting instance
Configuring virtual machine
Pulling container image
Unpacking image
Starting instance
Configuring virtual machine
Pulling container image
Unpacking image
Starting instance
Configuring virtual machine
Pulling container image
Unpacking image
Preparing kernel init
Configuring firecracker
Starting virtual machine
Preparing kernel init
Configuring firecracker
Starting virtual machine
Preparing kernel init
Configuring firecracker
Starting virtual machine
Starting init (commit: 9b0a951)...
UUID=d636a499-882d-4cfd-aabe-f02454a95bf1
Starting init (commit: 9b0a951)...
Setting up swapspace version 1, size = 536866816 bytes
UUID=d636a499-882d-4cfd-aabe-f02454a95bf1
Preparing to run: `/app/bin/migrate` as nobody
2022/08/18 07:15:13 listening on [fdaa:0:33f5:a7b:cbb7:6fc3:8343:2]:22 (DNS: [fdaa::3]:53)
07:15:14.362 [warning] Description: 'Authenticity is not established by certificate path validation'
Reason: 'Option {verify, verify_peer} and cacertfile/cacerts is missing'
07:15:14.362 [warning] Description: 'Authenticity is not established by certificate path validation'
Reason: 'Option {verify, verify_peer} and cacertfile/cacerts is missing'
07:15:14.928 [info] Migrations already up
Starting init (commit: 9b0a951)...
Setting up swapspace version 1, size = 536866816 bytes
UUID=d636a499-882d-4cfd-aabe-f02454a95bf1
Preparing to run: `/app/bin/migrate` as nobody
2022/08/18 07:15:13 listening on [fdaa:0:33f5:a7b:cbb7:6fc3:8343:2]:22 (DNS: [fdaa::3]:53)
07:15:14.362 [warning] Description: 'Authenticity is not established by certificate path validation'
Reason: 'Option {verify, verify_peer} and cacertfile/cacerts is missing'
07:15:14.362 [warning] Description: 'Authenticity is not established by certificate path validation'
Reason: 'Option {verify, verify_peer} and cacertfile/cacerts is missing'
07:15:14.928 [info] Migrations already up
Main child exited normally with code: 0
Reaped child process with pid: 572 and signal: SIGUSR1, core dumped? false
Starting clean up.
Preparing to run: `/app/bin/migrate` as nobody
2022/08/18 07:15:13 listening on [fdaa:0:33f5:a7b:cbb7:6fc3:8343:2]:22 (DNS: [fdaa::3]:53)
Reaped child process with pid: 570 and signal: SIGUSR1, core dumped? false
07:15:14.362 [warning] Description: 'Authenticity is not established by certificate path validation'
Reason: 'Option {verify, verify_peer} and cacertfile/cacerts is missing'
07:15:14.362 [warning] Description: 'Authenticity is not established by certificate path validation'
Reason: 'Option {verify, verify_peer} and cacertfile/cacerts is missing'
07:15:14.928 [info] Migrations already up
Main child exited normally with code: 0
Reaped child process with pid: 572 and signal: SIGUSR1, core dumped? false
Starting clean up.
Main child exited normally with code: 0
Reaped child process with pid: 572 and signal: SIGUSR1, core dumped? false
Starting clean up.
Starting instance
Configuring virtual machine
Pulling container image
Unpacking image
Preparing kernel init
Configuring firecracker
Starting virtual machine
Starting init (commit: 9b0a951)...
Setting up swapspace version 1, size = 536866816 bytes
UUID=d636a499-882d-4cfd-aabe-f02454a95bf1
Preparing to run: `/app/bin/migrate` as nobody
2022/08/18 07:15:13 listening on [fdaa:0:33f5:a7b:cbb7:6fc3:8343:2]:22 (DNS: [fdaa::3]:53)
Reaped child process with pid: 570 and signal: SIGUSR1, core dumped? false
07:15:14.362 [warning] Description: 'Authenticity is not established by certificate path validation'
Reason: 'Option {verify, verify_peer} and cacertfile/cacerts is missing'
07:15:14.362 [warning] Description: 'Authenticity is not established by certificate path validation'
Reason: 'Option {verify, verify_peer} and cacertfile/cacerts is missing'
07:15:14.928 [info] Migrations already up
Main child exited normally with code: 0
Reaped child process with pid: 572 and signal: SIGUSR1, core dumped? false
Starting clean up.
==> Monitoring deployment
1 desired, 1 placed, 0 healthy, 0 unhealthy [restarts: 1] [health checks: 1 total, 1 critical]
1 desired, 1 placed, 0 healthy, 0 unhealthy [restarts: 2] [health checks: 1 total, 1 critical]
1 desired, 1 placed, 0 healthy, 1 unhealthy [restarts: 2] [health checks: 1 total, 1 critical]
1 desired, 1 placed, 0 healthy, 1 unhealthy [restarts: 2] [health checks: 1 total, 1 critical]
Failed Instances
Failure #1
Instance
ID PROCESS VERSION REGION DESIRED STATUS HEALTH CHECKS RESTARTS CREATED
f87db3ff 14 fra run running 1 total, 1 critical 2 7m13s ago
Recent Events
TIMESTAMP TYPE MESSAGE
2022-08-18T07:15:19Z Received Task received by client
2022-08-18T07:15:19Z Task Setup Building Task Directory
2022-08-18T07:15:22Z Started Task started by client
2022-08-18T07:17:08Z Restart Signaled healthcheck: check "a61773ab9e61f7afdefca4f759fca6f9" unhealthy
2022-08-18T07:17:12Z Terminated Exit Code: 0
2022-08-18T07:17:12Z Restarting Task restarting in 1.038306486s
2022-08-18T07:17:19Z Started Task started by client
2022-08-18T07:19:04Z Restart Signaled healthcheck: check "a61773ab9e61f7afdefca4f759fca6f9" unhealthy
2022-08-18T07:19:09Z Terminated Exit Code: 0
2022-08-18T07:19:09Z Restarting Task restarting in 1.16697939s
2022-08-18T07:19:16Z Started Task started by client
2022-08-18T07:19:22Z [info]Reaped child process with pid: 583, exit code: 0
2022-08-18T07:19:25Z [info]07:19:25.092 [debug] Tzdata polling for update.
2022-08-18T07:19:25Z [info]07:19:25.586 [info] tzdata release in place is from a file last modified Tue, 22 Dec 2020 23:35:21 GMT. Release file on server was last modified Tue, 16 Aug 2022 01:15:47 GMT.
2022-08-18T07:19:25Z [info]07:19:25.586 [debug] Tzdata downloading new data from https://data.iana.org/time-zones/tzdata-latest.tar.gz
2022-08-18T07:19:26Z [info]07:19:26.082 [debug] Tzdata data downloaded. Release version 2022c.
2022-08-18T07:19:26Z [info]07:19:26.935 [info] Tzdata has updated the release from 2020e to 2022c
2022-08-18T07:19:26Z [info]07:19:26.935 [debug] Tzdata deleting ETS table for version 2020e
2022-08-18T07:19:26Z [info]07:19:26.938 [debug] Tzdata deleting ETS table file for version 2020e
2022-08-18T07:19:27Z [info]07:19:27.104 [debug] [libcluster:fly6pn] polling dns for 'encheres-immo-beta.internal'
2022-08-18T07:19:27Z [info]07:19:27.117 [warning] [libcluster:fly6pn] unable to connect to :"encheres-immo-beta@fdaa:0:33f5:a7b:b9b8:70b9:e805:2"
2022-08-18T07:19:32Z [info]07:19:32.118 [debug] [libcluster:fly6pn] polling dns for 'encheres-immo-beta.internal'
2022-08-18T07:19:32Z [info]07:19:32.131 [warning] [libcluster:fly6pn] unable to connect to :"encheres-immo-beta@fdaa:0:33f5:a7b:b9b8:70b9:e805:2"
2022-08-18T07:19:37Z [info]07:19:37.132 [debug] [libcluster:fly6pn] polling dns for 'encheres-immo-beta.internal'
2022-08-18T07:19:37Z [info]07:19:37.144 [warning] [libcluster:fly6pn] unable to connect to :"encheres-immo-beta@fdaa:0:33f5:a7b:b9b8:70b9:e805:2"
2022-08-18T07:19:42Z [info]07:19:42.145 [debug] [libcluster:fly6pn] polling dns for 'encheres-immo-beta.internal'
2022-08-18T07:19:42Z [info]07:19:42.159 [warning] [libcluster:fly6pn] unable to connect to :"encheres-immo-beta@fdaa:0:33f5:a7b:b9b8:70b9:e805:2"
2022-08-18T07:19:47Z [info]07:19:47.160 [debug] [libcluster:fly6pn] polling dns for 'encheres-immo-beta.internal'
2022-08-18T07:19:47Z [info]07:19:47.171 [warning] [libcluster:fly6pn] unable to connect to :"encheres-immo-beta@fdaa:0:33f5:a7b:b9b8:70b9:e805:2"
2022-08-18T07:19:52Z [info]07:19:52.171 [debug] [libcluster:fly6pn] polling dns for 'encheres-immo-beta.internal'
2022-08-18T07:19:52Z [info]07:19:52.183 [warning] [libcluster:fly6pn] unable to connect to :"encheres-immo-beta@fdaa:0:33f5:a7b:b9b8:70b9:e805:2"
2022-08-18T07:19:57Z [info]07:19:57.184 [debug] [libcluster:fly6pn] polling dns for 'encheres-immo-beta.internal'
2022-08-18T07:19:57Z [info]07:19:57.199 [warning] [libcluster:fly6pn] unable to connect to :"encheres-immo-beta@fdaa:0:33f5:a7b:b9b8:70b9:e805:2"
2022-08-18T07:20:02Z [info]07:20:02.200 [debug] [libcluster:fly6pn] polling dns for 'encheres-immo-beta.internal'
2022-08-18T07:20:02Z [info]07:20:02.215 [warning] [libcluster:fly6pn] unable to connect to :"encheres-immo-beta@fdaa:0:33f5:a7b:b9b8:70b9:e805:2"
2022-08-18T07:20:07Z [info]07:20:07.215 [debug] [libcluster:fly6pn] polling dns for 'encheres-immo-beta.internal'
2022-08-18T07:20:07Z [info]07:20:07.226 [warning] [libcluster:fly6pn] unable to connect to :"encheres-immo-beta@fdaa:0:33f5:a7b:b9b8:70b9:e805:2"
2022-08-18T07:20:12Z [info]07:20:12.227 [debug] [libcluster:fly6pn] polling dns for 'encheres-immo-beta.internal'
2022-08-18T07:20:12Z [info]07:20:12.241 [warning] [libcluster:fly6pn] unable to connect to :"encheres-immo-beta@fdaa:0:33f5:a7b:b9b8:70b9:e805:2"
2022-08-18T07:20:17Z [info]07:20:17.242 [debug] [libcluster:fly6pn] polling dns for 'encheres-immo-beta.internal'
2022-08-18T07:20:17Z [info]07:20:17.254 [warning] [libcluster:fly6pn] unable to connect to :"encheres-immo-beta@fdaa:0:33f5:a7b:b9b8:70b9:e805:2"
--> v14 failed - Failed due to unhealthy allocations - rolling back to job version 13 and deploying as v15
--> Troubleshooting guide at https://fly.io/docs/getting-started/troubleshooting/
Error abort
Here is the output of flyctl status:
App
Name = encheres-immo-beta
Owner = encheres-immo
Version = 15
Status = running
Hostname = encheres-immo-beta.fly.dev
Platform = nomad
Deployment Status
ID = 9f9c9bc7-84ff-30c2-55e9-2447ff9d0d78
Version = v15
Status = failed
Description = Failed due to unhealthy allocations - not rolling back to stable job version 15 as current job has same specification
Instances = 1 desired, 1 placed, 0 healthy, 1 unhealthy
Instances
ID PROCESS VERSION REGION DESIRED STATUS HEALTH CHECKS RESTARTS CREATED
cc150bd9 app 15 ⇡ fra stop complete 1 total, 1 critical 2 40m23s ago
f87db3ff app 14 fra stop complete 1 total, 1 critical 2 45m22s ago
6c50052c app 13 fra stop complete 1 total, 1 critical 2 12h45m ago
70b9e805 app 9 fra run running 1 total, 1 passing 0 2022-07-28T13:15:36Z
And here is the output of flyctl checks list:
Health Checks for encheres-immo-beta
NAME | STATUS | ALLOCATION | REGION | TYPE | LAST UPDATED | OUTPUT
-----------------------------------*---------*------------*--------*------*----------------------*-------------------------------------------
a61773ab9e61f7afdefca4f759fca6f9 | passing | 70b9e805 | fra | TCP | 2022-07-28T13:16:09Z | TCP connect 172.19.4.98:4000: Success[✓]
| | | | | |
| | | | | |
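In case it helps, the health check is a plain TCP check on port 4000, and the endpoint binding follows the stock Phoenix release setup, roughly like this in config/runtime.exs (a sketch; the OTP app and endpoint module names are guesses derived from the app name, and the actual values may differ):

if config_env() == :prod do
  port = String.to_integer(System.get_env("PORT") || "4000")

  config :encheres_immo, EncheresImmoWeb.Endpoint,
    # bind on the IPv6 any-address so Fly's proxy and the TCP health
    # check can reach port 4000 inside the VM
    http: [ip: {0, 0, 0, 0, 0, 0, 0, 0}, port: port],
    server: true
end

As far as I can tell, fly.toml's internal_port is 4000 as well, the same as in the environments that deploy fine, so I don't think the port itself is the issue.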
Thanks for any help or leads!