Could not contact remote node reason: :nodedown. Aborting...

I’ve started getting a strange error today when trying to connect to the IEx remote shell in a fly.io vm:

▲ fly ssh console
Connecting to top1.nearest.of.myapp.internal... complete
/ # ./bin/my_app remote
Erlang/OTP 24 [erts-12.3] [source] [64-bit] [smp:2:2] [ds:2:2:10] [async-threads:1] [jit:no-native-stack]

Could not contact remote node my_app@28862751, reason: :nodedown. Aborting...  

Note that I was able to connect to IEx without any issues until yesterday, and I haven’t deployed since. After a bit of googling, I landed on:

Following the guide, I added the env.sh.eex file, but I still get the same error, although with a different remote node name:

▲ fly ssh console
Connecting to top1.nearest.of.myapp.internal... complete
/ # ./bin/my_app remote
Erlang/OTP 24 [erts-12.3] [source] [64-bit] [smp:2:2] [ds:2:2:10] [async-threads:1] [jit:no-native-stack]

Could not contact remote node my_app@fdaa:0:544a:a7b:1a:0:f5d5:2, reason: :nodedown. Aborting...  

Update: The issue seems to have resolved itself. In fact, it’s still working after reverting the env.sh.eex change mentioned above.

Not sure if this was an internal networking issue or something else.

I’m experiencing this issue with my application.

➜  app git:(fly) fly ssh console
Connecting to top1.nearest.of.savvycal-staging.internal... complete
# app/bin/mighty remote
Erlang/OTP 24 [erts-12.1.5] [source] [64-bit] [smp:1:1] [ds:1:1:10] [async-threads:1] [jit]

Could not contact remote node savvycal-staging@fdaa:0:61d4:a7b:9d36:7c1f:f184:2, reason: :nodedown. Aborting...

In the logs:

2022-06-13T16:21:57Z app[7c1ff184] iad [info]Reaped child process with pid: 728, exit code: 0
2022-06-13T16:21:57Z app[7c1ff184] iad [info]Reaped child process with pid: 730, exit code: 0
2022-06-13T16:23:23Z app[7c1ff184] iad [info]Reaped child process with pid: 792, exit code: 0
2022-06-13T16:23:23Z app[7c1ff184] iad [info]Reaped child process with pid: 794, exit code: 0
2022-06-13T16:23:23Z app[7c1ff184] iad [info]Reaped child process with pid: 813 and signal: SIGUSR1, core dumped? false
2022-06-13T16:23:23Z app[7c1ff184] iad [info]Reaped child process with pid: 815 and signal: SIGUSR1, core dumped? false

My env.sh.eex file looks like this:

#!/bin/sh

# See https://fly.io/docs/getting-started/elixir/#naming-your-elixir-node
ip=$(grep fly-local-6pn /etc/hosts | cut -f 1)
export RELEASE_DISTRIBUTION=name
export RELEASE_NODE=$FLY_APP_NAME@$ip
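
For anyone following along, here is a rough sketch of what those lines compute, run against a throwaway file instead of the real /etc/hosts. The sample file contents and its tab-separated layout are assumptions made for illustration (`cut -f 1` splits on tabs by default); the address and app name are taken from the error output above.

```shell
#!/bin/sh
# Hypothetical stand-in for /etc/hosts on the VM (tab-separated, assumed layout).
hosts_file=$(mktemp)
printf '127.0.0.1\tlocalhost\nfdaa:0:61d4:a7b:9d36:7c1f:f184:2\tfly-local-6pn\n' > "$hosts_file"

# Same pipeline as in env.sh.eex, pointed at the sample file:
# keep the fly-local-6pn line, then take its first tab-separated field.
ip=$(grep fly-local-6pn "$hosts_file" | cut -f 1)

# Stand-in for the app name that Fly.io sets in the environment.
FLY_APP_NAME="savvycal-staging"
RELEASE_NODE="$FLY_APP_NAME@$ip"

echo "$RELEASE_NODE"
rm -f "$hosts_file"
```

The result is a node name whose host part is an IPv6 address, which matches the name shown in the :nodedown error.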

I also have a cookie configured in mix.exs:

  defp releases do
    [
      mighty: [
        include_executables_for: [:unix],
        # See https://fly.io/docs/app-guides/elixir-static-cookie/
        cookie: "65f7WN8jYgn95iS1JTxgV7pyUvO-uQi96CVEG8FgSdAo-dJEoaaePg==",
        applications: [
          # ...
        ]
      ]
    ]
  end

I’ve confirmed that the release picks it up:

# cat app/releases/COOKIE
65f7WN8jYgn95iS1JTxgV7pyUvO-uQi96CVEG8FgSdAo-dJEoaaePg==

I would appreciate any pointers! Happy to provide additional information as needed.

EDIT: This is now solved!

My Dockerfile was missing this part, which flyctl normally appends when you first run fly launch:

# Appended by flyctl
ENV ECTO_IPV6 true
ENV ERL_AFLAGS "-proto_dist inet6_tcp"

Adding these lines (specifically the ERL_AFLAGS one) solved the error described here. Thanks @Mark for the excellent troubleshooting :sparkles:
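
The node name here uses an IPv6 address, so Erlang distribution has to be told to use IPv6 as well, which is what the `-proto_dist inet6_tcp` flag does. A quick way to check whether a running VM has the flag is sketched below. The helper name `has_inet6_dist` is made up for this example; it only inspects the `ERL_AFLAGS` environment variable and does not look at anything else in the release.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given flags string asks for
# IPv6 distribution via "-proto_dist inet6_tcp".
has_inet6_dist() {
  case "$1" in
    *"-proto_dist inet6_tcp"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Check the current environment; an unset ERL_AFLAGS is treated as empty.
if has_inet6_dist "${ERL_AFLAGS:-}"; then
  echo "ERL_AFLAGS enables IPv6 distribution"
else
  echo "ERL_AFLAGS is missing -proto_dist inet6_tcp"
fi
```

Run from `fly ssh console`, this reports whether the shell's environment carries the flag that `bin/my_app remote` would pick up.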


Thanks @derrickreimer for helping to identify the problem. I updated the docs to hopefully help future readers. Does this look good? Would it have helped solve your problem when you first encountered it?


This definitely would have solved my problem! Nice improvement.


Thanks for the feedback!

As I mentioned here before, the solution above did not work for me, so I am bumping this thread in the hope of resolving the issue.

Dockerfile

FROM elixir:1.13.4-alpine AS build

RUN apk add --no-cache build-base npm git python3

# prepare build dir
WORKDIR /app

# install hex + rebar
RUN mix local.hex --force && \
    mix local.rebar --force

# set build ENV
ENV MIX_ENV="prod"

# install mix dependencies
COPY mix.exs mix.lock ./
COPY apps/dert_gg/mix.exs apps/dert_gg/mix.exs
COPY apps/dert_gg_web/mix.exs apps/dert_gg_web/mix.exs

RUN mix deps.get --only $MIX_ENV
RUN mkdir config

# copy compile-time config files before we compile dependencies
# to ensure any relevant config change will trigger the dependencies
# to be re-compiled.
COPY config/config.exs config/${MIX_ENV}.exs config/
RUN mix deps.compile

COPY apps/dert_gg/priv apps/dert_gg/priv
COPY apps/dert_gg_web/priv apps/dert_gg_web/priv

COPY apps/dert_gg/lib apps/dert_gg/lib
COPY apps/dert_gg_web/lib apps/dert_gg_web/lib

COPY apps/dert_gg_web/assets apps/dert_gg_web/assets

# Deploy & Digest assets
RUN npm --prefix ./apps/dert_gg_web/assets ci --progress=false --no-audit --loglevel=error && \
    npm run --prefix ./apps/dert_gg_web/assets deploy
RUN mix phx.digest

# Compile the release
RUN mix compile

# Changes to config/runtime.exs don't require recompiling the code
COPY config/runtime.exs config/

COPY rel rel

RUN mix release

FROM alpine:3 AS app

RUN apk add --no-cache openssl ncurses-libs libgcc libstdc++

WORKDIR /app

RUN chown nobody:nobody /app

USER nobody:nobody

COPY --from=build --chown=nobody:nobody /app/_build/prod/rel/dert_gg_web ./

ENV HOME=/app

ENV ECTO_IPV6 true
ENV ERL_AFLAGS "-proto_dist inet6_tcp"

CMD ["bin/server"]

env.sh.eex

#!/bin/sh

# Sets and enables heart (recommended only in daemon mode)
# case $RELEASE_COMMAND in
#   daemon*)
#     HEART_COMMAND="$RELEASE_ROOT/bin/$RELEASE_NAME $RELEASE_COMMAND"
#     export HEART_COMMAND
#     export ELIXIR_ERL_OPTIONS="-heart"
#     ;;
#   *)
#     ;;
# esac

# Set the release to work across nodes.
# RELEASE_DISTRIBUTION must be "sname" (local), "name" (distributed) or "none".
ip=$(grep fly-local-6pn /etc/hosts | cut -f 1)

export RELEASE_DISTRIBUTION=name
export RELEASE_NODE=$FLY_APP_NAME@ip

What I do

╰─$ fly ssh console -s
Update available 0.0.337 -> v0.0.353.
Run "fly version update" to upgrade.
? Select instance: fra.wild-frog-7798.internal
Connecting to [fdaa:0:6e5f:a7b:66:844a:b6af:2]... complete
/ # ./app/bin/dert_gg_web remote
Erlang/OTP 24 [erts-12.3.2.2] [source] [64-bit] [smp:1:1] [ds:1:1:10] [async-threads:1] [jit:no-native-stack]

Could not contact remote node wild-frog-7798@ip, reason: :nodedown. Aborting...

I think that should be changed (note the new $ sign):

export RELEASE_NODE=$FLY_APP_NAME@$ip
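
To see the difference, here is a small sketch you can run in any POSIX shell. The two values are placeholders copied from the console output above; on a real VM they come from Fly.io and from /etc/hosts.

```shell
#!/bin/sh
# Placeholder values for illustration only.
FLY_APP_NAME="wild-frog-7798"
ip="fdaa:0:6e5f:a7b:66:844a:b6af:2"

# Without the "$", the shell keeps "ip" as literal text.
broken="$FLY_APP_NAME@ip"

# With the "$", the shell substitutes the value of the variable.
fixed="$FLY_APP_NAME@$ip"

echo "broken: $broken"
echo "fixed:  $fixed"
```

The first form produces the `wild-frog-7798@ip` node name seen in the error message.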

Oh, I’m blind. :man_facepalming: Thank you!