Could not contact remote node reason: :nodedown. Aborting...

I’ve started getting a strange error today when trying to connect to the IEx remote shell in a Fly.io VM:

▲ fly ssh console
Connecting to top1.nearest.of.myapp.internal... complete
/ # ./bin/my_app remote
Erlang/OTP 24 [erts-12.3] [source] [64-bit] [smp:2:2] [ds:2:2:10] [async-threads:1] [jit:no-native-stack]

Could not contact remote node my_app@28862751, reason: :nodedown. Aborting...  
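
A few things worth checking from the VM shell at this point (a sketch; paths and output are illustrative, assuming the default release layout):

# With no rel/env.sh.eex, the release falls back to a short node name
# derived from the VM's hostname -- which is where my_app@28862751 comes from.
hostname                        # e.g. 28862751
grep fly-local-6pn /etc/hosts   # the VM's private (6PN) IPv6 address
epmd -names                     # names/ports registered with the local epmd, if it's on PATH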

Note that I was able to connect to IEx without any issues until yesterday, and I haven’t deployed since. After a bit of googling, I landed on a guide about naming Elixir nodes on Fly.

Following the guide, I added the env.sh.eex file, but I still get the same error, albeit with a different remote node name:

▲ fly ssh console
Connecting to top1.nearest.of.myapp.internal... complete
/ # ./bin/my_app remote
Erlang/OTP 24 [erts-12.3] [source] [64-bit] [smp:2:2] [ds:2:2:10] [async-threads:1] [jit:no-native-stack]

Could not contact remote node my_app@fdaa:0:544a:a7b:1a:0:f5d5:2, reason: :nodedown. Aborting...  
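
Since the node name now carries the VM’s private IPv6 address, the naming side looks right, which narrows things down to the cookie or the distribution transport itself. Two checks that may help (a sketch; as I understand it, eval starts a throwaway node with the same environment, so the flag check reflects the running config):

cat releases/COOKIE   # the cookie the server node booted with
./bin/my_app eval 'IO.inspect(:init.get_argument(:proto_dist))'
# {:ok, [['inet6_tcp']]} means IPv6 distribution is enabled;
# :error means the default IPv4 transport (inet_tcp) is in use,
# which cannot reach a node listening on an IPv6-only 6PN address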

Update: the issue seems to have resolved itself. In fact, it’s still working even after reverting the env.sh.eex change mentioned above.

Not sure if this was an internal networking issue or something else.

I’m experiencing the same issue with my application.

➜  app git:(fly) fly ssh console
Connecting to top1.nearest.of.savvycal-staging.internal... complete
# app/bin/mighty remote
Erlang/OTP 24 [erts-12.1.5] [source] [64-bit] [smp:1:1] [ds:1:1:10] [async-threads:1] [jit]

Could not contact remote node savvycal-staging@fdaa:0:61d4:a7b:9d36:7c1f:f184:2, reason: :nodedown. Aborting...

In the logs:

2022-06-13T16:21:57Z app[7c1ff184] iad [info]Reaped child process with pid: 728, exit code: 0
2022-06-13T16:21:57Z app[7c1ff184] iad [info]Reaped child process with pid: 730, exit code: 0
2022-06-13T16:23:23Z app[7c1ff184] iad [info]Reaped child process with pid: 792, exit code: 0
2022-06-13T16:23:23Z app[7c1ff184] iad [info]Reaped child process with pid: 794, exit code: 0
2022-06-13T16:23:23Z app[7c1ff184] iad [info]Reaped child process with pid: 813 and signal: SIGUSR1, core dumped? false
2022-06-13T16:23:23Z app[7c1ff184] iad [info]Reaped child process with pid: 815 and signal: SIGUSR1, core dumped? false

My env.sh.eex file looks like this:

#!/bin/sh

# See https://fly.io/docs/getting-started/elixir/#naming-your-elixir-node
ip=$(grep fly-local-6pn /etc/hosts | cut -f 1)
export RELEASE_DISTRIBUTION=name
export RELEASE_NODE=$FLY_APP_NAME@$ip
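
A quick sanity check for that snippet from inside the VM (output shown is illustrative):

ip=$(grep fly-local-6pn /etc/hosts | cut -f 1)
echo "$FLY_APP_NAME@$ip"
# prints e.g. savvycal-staging@fdaa:0:61d4:a7b:9d36:7c1f:f184:2 --
# exactly the node name shown in the error above, so naming looks correct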

And I have a cookie configured in mix.exs:

  defp releases do
    [
      mighty: [
        include_executables_for: [:unix],
        # See https://fly.io/docs/app-guides/elixir-static-cookie/
        cookie: "65f7WN8jYgn95iS1JTxgV7pyUvO-uQi96CVEG8FgSdAo-dJEoaaePg==",
        applications: [
          # ...
        ]
      ]
    ]
  end

I’ve been able to confirm it is picked up in the release:

# cat app/releases/COOKIE
65f7WN8jYgn95iS1JTxgV7pyUvO-uQi96CVEG8FgSdAo-dJEoaaePg==

I would appreciate any pointers! Happy to provide additional information as needed.

EDIT: This is now solved!

I was missing the part of my Dockerfile that flyctl normally appends when you first run fly launch:

# Appended by flyctl
ENV ECTO_IPV6 true
ENV ERL_AFLAGS "-proto_dist inet6_tcp"

Adding these lines (specifically the ERL_AFLAGS one) solved the error described here. Thanks @brainlid for the excellent troubleshooting :sparkles:
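
For anyone wondering why this works: Fly’s private 6PN network is IPv6-only, while Erlang distribution defaults to the IPv4 inet_tcp transport, so the hidden node that remote spins up can’t reach a server node listening on an IPv6 address. If you’d rather keep the flag out of the Dockerfile, the same setting can go in rel/env.sh.eex (an equivalent sketch):

# Equivalent to the ERL_AFLAGS line flyctl appends to the Dockerfile
export ERL_AFLAGS="-proto_dist inet6_tcp"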


Thanks @derrickreimer for helping to identify the problem. I updated the docs to hopefully help future readers. Does this look good? Would it have helped solve your problem when you first encountered it?


This definitely would have solved my problem! Nice improvement.


Thanks for the feedback!