Can't get basic Phoenix app clustering to work from tutorial

I’m following the guide here and it’s outdated on a number of counts. I finally managed to get an application up and running, which spins up 2 machines by default (not what the tutorial indicates). I had to upgrade out of the free plan just to get through the tutorial.

I went through and added libcluster as instructed. I added the cookie setting to the fly.toml file and added the libcluster configuration exactly as indicated in the guide. My logs are chock full of lines like this:

2023-09-25T15:49:42Z app[328744d5b41478] bos [info]15:49:42.868 [warning] [libcluster:fly6pn] unable to connect to :"lfgheroes@fdaa:3:1dfd:a7b:ea:23f1:6291:2"

The cluster does not automatically form as the guide indicates. I can remote shell into one of the nodes and manually do Node.connect(...) to the other node’s Node.self() output and that updates the Node.list on both nodes… so it looks like it’s working.

But the libcluster debug log just continues to spam that connection failure message.
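
The manual check described above can be sketched roughly as follows, run from the remote IEx shell on one machine. The node name is a placeholder, not a real value; it stands for whatever Node.self() prints on the other machine.

```elixir
# Run inside the remote IEx shell on machine A.
# `other` is a placeholder: paste the exact atom that Node.self() prints on machine B.
other = :"<name>@<ip>"

# The name this node registered under (compare it with the name in the libcluster warning).
IO.inspect(Node.self(), label: "this node")

# Node.connect/1 returns true on success, false on failure,
# and :ignored if the local node is not running in distributed mode.
IO.inspect(Node.connect(other), label: "connect")

# After a successful connect, the other node should appear in this list on both machines.
IO.inspect(Node.list(), label: "connected nodes")
```

If the part before the @ in Node.self() differs from the name in the warning, libcluster is dialing a name that no node is registered under.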

So, to recap:

  • two nodes up and running, they seem to have connectivity to each other
  • nodes do not automatically form a cluster as per the tutorial; I had to do it manually with Node.connect
  • libcluster debug spam continues to warn about a connectivity failure

They may have changed the format for node names recently:

https://community.fly.io/t/env-sh-eex-and-what-to-do-with-it/14349

That post shows…

<APP_NAME>-<IMAGE_REF>@<IP>

as the format for RELEASE_NODE, whereas the name in your log snippet follows the older <APP_NAME>@<IP> structure.
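
If that is what happened here, the mismatch would explain the log spam. As far as I know, libcluster's DNSPoll strategy builds each target name as node_basename@ip, so node_basename has to equal whatever precedes the @ in RELEASE_NODE. Here is a minimal sketch of the relevant config, assuming the guide's DNSPoll setup under the fly6pn topology from your log. I have not seen your actual config, so the file name and values are assumptions.

```elixir
# config/runtime.exs (sketch; assumes the DNSPoll strategy and a Fly.io deployment)
import Config

if config_env() == :prod do
  app_name =
    System.get_env("FLY_APP_NAME") ||
      raise "FLY_APP_NAME not available"

  config :libcluster,
    topologies: [
      fly6pn: [
        strategy: Cluster.Strategy.DNSPoll,
        config: [
          polling_interval: 5_000,
          # Resolves the private IPs of all machines in the app.
          query: "#{app_name}.internal",
          # DNSPoll dials "#{node_basename}@#{ip}", so this must match
          # the part of RELEASE_NODE before the "@".
          node_basename: app_name
        ]
      ]
    ]
end
```

With node_basename set to the plain app name, this only lines up with a RELEASE_NODE of the older <APP_NAME>@<IP> form. If the release is using the newer name, either node_basename or the RELEASE_NODE line in rel/env.sh.eex would need to change so the two agree.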

https://community.fly.io/t/elixir-phoenix-libcluster-unable-to-connect-anymore/5845

A reply by chrismccord (of Fly) in the July 2023 thread said that the change was to ensure “unique[ness] across deployments”; maybe wires got crossed there?