ssh console and checking app says it's down but it's not

ryanwinchester · December 29, 2021, 7:34pm

When I run

fly ssh console

It connects and when I run

./app/bin/myapp pid

It says

--rpc-eval : RPC failed with reason :nodedown

But it’s not down, because I can access it on the website. What the heck is going on?

user23 · December 30, 2021, 1:13pm

I’m having the same issue

Also, when I run (my app name replaced with myapp)

/app/bin/myapp remote

I get

Erlang/OTP 24 [erts-12.1.2] [source] [64-bit] [smp:2:2] [ds:2:2:10] [async-threads:1] [jit]

Could not contact remote node myapp@fdaa:0:376f:a7b:85:3774:b441:2, reason: :nodedown. Aborting...

ryanwinchester · December 30, 2021, 1:25pm

Yes, all the commands that expect a running node will fail with :nodedown…

Yet the application is working… Where is it, really?

jsierles · December 30, 2021, 1:39pm

I’m no Elixir expert, but for this to work you might need to set up clustering, even with just one node.

FrequentFlyer · December 30, 2021, 1:41pm

Plus, I see there’s this bit about cookie:

matthewsteiner · December 30, 2021, 6:58pm

I’ve run into a couple minor issues with differences between guides. I was seeing this same error, and saw in this guide:

https://hexdocs.pm/phoenix/fly.html

to add this line to rel/env.sh.eex:

export ELIXIR_ERL_OPTIONS="-proto_dist inet6_tcp"

This fixed clustering for me!

Criss · January 1, 2022, 6:58pm

Same issue here

Did this (libcluster): Deploy an Elixir Phoenix Application
and this: Setting a Static Cookie for Elixir

and the problem remains. Does anyone know of a workaround for now?

Artur · January 12, 2022, 4:42pm

Hi there,
Same problem here.
I tried:

# app/bin/my_app start_iex
Erlang/OTP 24 [erts-12.0.3] [source] [64-bit] [smp:1:1] [ds:1:1:10] [async-threads:1] [jit]

to start a new process, but the console crashed and closed the ssh connection.

I ran printenv and none of the variables required to run iex are set in the environment. The app/bin/my_app script sets them before running a command, but I could not inspect whether the values are right. I was trying to figure out if the cookie was missing.

So, no luck here, either connecting to a running node or spinning up a new one.
Any help will be appreciated.
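
One way to inspect those variables, as a minimal sketch: the release script reads `releases/<version>/env.sh` before running a command, so sourcing that file in a subshell and printing the `RELEASE_*` variables shows what the command would see. The function name `show_release_env` is made up for illustration, and the layout (the version stored as the second field of `releases/start_erl.data`) is assumed to match a standard `mix release` build.

```shell
#!/bin/sh
# Hypothetical helper: print the RELEASE_* variables that env.sh exports.
# Usage: show_release_env /app
show_release_env() {
  release_root="$1"
  # start_erl.data holds "<erts version> <release version>" on one line.
  vsn=$(cut -d' ' -f2 "$release_root/releases/start_erl.data")
  env_file="$release_root/releases/$vsn/env.sh"
  [ -f "$env_file" ] || { echo "no env.sh at $env_file" >&2; return 1; }
  # Source in a subshell so the current shell's environment is untouched.
  ( . "$env_file" && env | grep '^RELEASE_' | sort )
}
```

Running `show_release_env /app` inside the SSH console would then list values such as `RELEASE_NODE`, assuming the release root is `/app`. Only exported variables appear.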

kurt · January 12, 2022, 10:31pm

Did your VM crash by chance? That’s the only reason I can think of for the console exiting and closing the SSH connection. If you run fly logs you might see an out-of-memory error.
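
A minimal sketch of that check: pipe the log output through a filter for out-of-memory messages. The function name `find_oom` is hypothetical, and the patterns are assumptions based on how OOM kills are typically worded in kernel and init logs.

```shell
#!/bin/sh
# Hypothetical filter: keep only log lines that look like OOM kills.
# Reads log text on stdin, for example:  fly logs | find_oom
find_oom() {
  # Case-insensitive match on "out of memory", "OOM killed", "oom-kill", etc.
  grep -i -E 'out of memory|oom[ -]?kill'
}
```

Since `fly logs` keeps streaming, stop it with Ctrl-C once enough output has gone by.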

Artur · January 13, 2022, 3:07pm

Yes, the error was out of memory.

Artur · January 13, 2022, 3:15pm

Updates:

libcluster was producing this warning:
```
[warning] [libcluster:fly6pn] unable to connect to :"{app}@{ip}"
```
Setting the RELEASE_COOKIE in secrets makes this warning disappear.
Of course this is not a solution since each release will create a new cookie.

Created a fresh new app (thinking I missed some step or config) and still unable to connect to the running node.

app/bin/my_app remote
Erlang/OTP 24 [erts-12.0.3] [source] [64-bit] [smp:1:1] [ds:1:1:10] [async-threads:1] [jit]

Could not contact remote node {app}@{ip}, reason: :nodedown. Aborting...

hiroki_arai · January 14, 2022, 12:49pm

I have the same issue.

When I do

app/bin/my_app remote

I get

Could not contact remote node {app_name}@{id}, reason: :nodedown. Aborting...

And the log says

2022-01-14T12:46:23.533 app[11898062] nrt [info]Reaped child process with pid: 749, exit code: 0
2022-01-14T12:46:23.535 app[11898062] nrt [info]Reaped child process with pid: 751, exit code: 0
2022-01-14T12:46:23.537 app[11898062] nrt [info]Reaped child process with pid: 770 and signal: SIGUSR1, core dumped? false
2022-01-14T12:46:24.541 app[11898062] nrt [info]Reaped child process with pid: 772 and signal: SIGUSR1, core dumped? false

Mark · January 14, 2022, 1:31pm

This is why you need to set an explicit cookie.

https://fly.io/docs/app-guides/elixir-static-cookie/
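
For reference, a minimal sketch of producing a random value that could serve as a static cookie. The function name `gen_cookie` is made up here, and the linked guide remains the authority on where the value should be configured.

```shell
#!/bin/sh
# Hypothetical helper: print a random URL-safe string to use as a cookie.
gen_cookie() {
  # 40 random bytes, base64-encoded, then mapped to the URL-safe alphabet
  # with padding and newlines removed.
  head -c 40 /dev/urandom | base64 | tr '+/' '-_' | tr -d '=\n'
  echo
}
```

If the cookie is supplied through the environment, as in the earlier post that put RELEASE_COOKIE in secrets, something like `fly secrets set RELEASE_COOKIE="$(gen_cookie)"` would set it once rather than per release. That assumes the release picks up `RELEASE_COOKIE` from the environment.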

hiroki_arai · January 15, 2022, 3:06pm

Hi, @Mark !

Thank you for creating this amazing document! I read the Setting a Static Cookie for Elixir guide, changed my project as it describes, and deployed successfully. I also confirmed that the cookie is set in app/releases/COOKIE.

I am sorry, this might be a stupid question, but how do I connect to iex with the cookie properly?

After running

fly ssh console

I run

app/bin/my_app start_iex --cookie MYCOOKIE

After 30 seconds, iex started, but it was very clunky: I could hardly type any text, and it shut down automatically within 10 seconds!

It logged

2022-01-15T14:55:20.396 app[36a34182] nrt [info]Reaped child process with pid: 721, exit code: 0
2022-01-15T14:55:20.397 app[36a34182] nrt [info]Reaped child process with pid: 722, exit code: 0
2022-01-15T14:55:25.093 app[36a34182] nrt [info][  737.730823] Out of memory: Killed process 509 (beam.smp) total-vm:1766436kB, anon-rss:76804kB, file-rss:0kB, shmem-rss:66188kB, UID:65534 pgtables:468kB oom_score_adj:0
2022-01-15T14:55:25.423 app[36a34182] nrt [info]Main child exited with signal (with signal 'SIGKILL', core dumped? false)
2022-01-15T14:55:25.424 app[36a34182] nrt [info]Stared child process with pid: 570, exit code: 0
2022-01-15T14:55:25.425 app[36a34182] nrt [info]Starting clean up.
2022-01-15T14:55:25.437 app[36a34182] nrt [info]Process appears to have been OOM killed!
2022-01-15T14:55:32.639 runner[36a34182] nrt [info]Starting instance
2022-01-15T14:55:32.687 runner[36a34182] nrt [info]Configuring virtual machine
2022-01-15T14:55:32.689 runner[36a34182] nrt [info]Pulling container image
2022-01-15T14:55:35.334 runner[36a34182] nrt [info]Unpacking image
2022-01-15T14:55:35.345 runner[36a34182] nrt [info]Preparing kernel init
2022-01-15T14:55:35.792 runner[36a34182] nrt [info]Configuring firecracker
2022-01-15T14:55:36.369 runner[36a34182] nrt [info]Starting virtual machine
2022-01-15T14:55:36.647 app[36a34182] nrt [info]Starting init (commit: 7943db6)...
2022-01-15T14:55:36.673 app[36a34182] nrt [info]Preparing to run: `/app/bin/server` as nobody
2022-01-15T14:55:36.707 app[36a34182] nrt [info]2022/01/15 14:55:36 listening on [fdaa:0: ... :2]:22 (DNS: [fdaa::3]:53)
2022-01-15T14:55:37.687 app[36a34182] nrt [info]Reaped child process with pid: 548, exit code: 0
2022-01-15T14:55:39.692 app[36a34182] nrt [info]Reaped child process with pid: 569 and signal: SIGUSR1, core dumped? false
2022-01-15T14:55:39.825 app[36a34182] nrt [info]14:55:39.824 [info] Running MyAppWeb.Endpoint with cowboy 2.9.0 at :::4000 (http)
2022-01-15T14:55:39.829 app[36a34182] nrt [info]14:55:39.826 [info] Access MyAppWeb.Endpoint at http://my_app.fly.dev

I also ran

app/bin/my_app remote --cookie MYCOOKIE

but I got

Could not contact remote node {app}@{ip}, reason: :nodedown. Aborting...

I like your podcast btw!

Thanks!

Mark · January 15, 2022, 3:59pm

@hiroki_arai you don’t need to specify the cookie.

Have you checked out these instructions?

The message about Could not contact remote node {app}@{ip} makes me wonder if the node naming hasn’t been set up?

You’re close! You’ll get there!

hiroki_arai · January 15, 2022, 4:13pm

Ahh, that’s totally my fault! I really appreciate your quick response!

hiroki_arai · January 16, 2022, 8:59am

On my machine, finally

app/bin/my_app remote

is working as it should!

The key is to make env.sh.eex look like this:

#!/bin/sh

ip=$(grep fly-local-6pn /etc/hosts | cut -f 1)
export RELEASE_DISTRIBUTION=name
export RELEASE_NODE=$FLY_APP_NAME@$ip
#export ELIXIR_ERL_OPTIONS="-proto_dist inet6_tcp" <- commented out

Is this a dangerous decision?
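
To make concrete what this env.sh.eex computes, here is a minimal sketch of the same lookup run against a sample hosts file instead of the real /etc/hosts. The function name `fly_node_name` is illustrative only, and it assumes, like the original, that the hosts entry is tab-separated.

```shell
#!/bin/sh
# Hypothetical helper mirroring the env.sh.eex lookup above.
# $1: app name, $2: hosts file (defaults to /etc/hosts)
fly_node_name() {
  app="$1"
  hosts="${2:-/etc/hosts}"
  # The fly-local-6pn line carries the private IPv6 address in its first
  # tab-separated field.
  ip=$(grep fly-local-6pn "$hosts" | cut -f 1)
  echo "$app@$ip"
}
```

The printed value has the same `app@ip` shape as the node name in the "Could not contact remote node" message, which makes it easy to compare the two.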

ryanwinchester · January 16, 2022, 2:50pm

This worked for me as well.

So, the fly docs do not have that line:

Getting Started · Fly Docs

However, the Phoenix docs do:

Deploying on Fly.io — Phoenix v1.7.10

So all of us with this issue were probably following the Phoenix docs.

hiroki_arai · January 16, 2022, 5:35pm

Yes, we should follow the Fly docs for now.

Mark · January 18, 2022, 3:14am

That seems odd… hmm. Glad you got it working though!
