RabbitMQ clustering doesn't work

The Fly toml configuration:

# fly.toml app configuration file generated for rabbitmq-dv on 2024-10-18T23:35:59+08:00
#
# See https://fly.io/docs/reference/configuration/ for information about how to use this file.
#

app = 'rabbitmq-dv'
primary_region = 'cdg'

[build]
  image = 'registry.fly.io/rabbitmq-dv:latest'

[[services]]
  protocol = 'tcp'
  internal_port = 5672
  ports = []
  auto_stop_machines = "off"
  auto_start_machines = true
  min_machines_running = 0

  [[services.tcp_checks]]
    interval = '15s'
    timeout = '1m0s'
    grace_period = '1s'

[[services]]
  protocol = 'tcp'
  internal_port = 15672
  auto_stop_machines = "off"
  auto_start_machines = true
  min_machines_running = 0

  [[services.ports]]
    port = 15672
    handlers = ['tls', 'http']

  [[services.tcp_checks]]
    interval = '15s'
    timeout = '1m0s'
    grace_period = '1s'

[[services]]
  protocol = 'tcp'
  internal_port = 4369
  auto_stop_machines = "off"
  auto_start_machines = true
  min_machines_running = 0

  [[services.ports]]
    port = 4369
    handlers = ['tls']

  [[services.tcp_checks]]
    interval = '15s'
    timeout = '1m0s'
    grace_period = '1s'

[[services]]
  protocol = 'tcp'
  internal_port = 25672
  auto_stop_machines = "off"
  auto_start_machines = true
  min_machines_running = 0

  [[services.ports]]
    port = 25672
    handlers = ['tls']

  [[services.tcp_checks]]
    interval = '15s'
    timeout = '1m0s'
    grace_period = '1s'


[[vm]]
  memory = '512mb'
  cpu_kind = 'shared'
  cpus = 1

The Dockerfile:

FROM rabbitmq:3.12.14-management
# See: https://www.rabbitmq.com/docs/clustering#community-docker-image-and-kubernetes
ENV RABBITMQ_ERLANG_COOKIE="123456"
COPY ./prod.conf /etc/rabbitmq/rabbitmq.conf
RUN rabbitmq-plugins enable rabbitmq_management

The RabbitMQ configuration:

listeners.tcp.default = 5672
default_user = admin
default_pass = admin
log.console = true
log.console.level = debug
management.tcp.ip = ::
cluster_formation.peer_discovery_backend = dns
cluster_formation.dns.hostname = rabbitmq-dv.internal

When I deploy to one machine, it works well. When I deploy to 3 machines, they cannot connect to each other to form a cluster.

High availability (HA) is essential if we want to run RabbitMQ on Fly.io in production.

I would be very grateful if someone could provide some guidance.

Did you set management.tcp.ip = :: in your RabbitMQ configuration?

Yes. Sorry, I forgot to paste my RabbitMQ configuration.

listeners.tcp.default = 5672
default_user = admin
default_pass = admin
log.console = true
log.console.level = debug
management.tcp.ip = ::
cluster_formation.peer_discovery_backend = dns
cluster_formation.dns.hostname = rabbitmq-dv.internal

Hmm, sorry, I can’t provide any assistance beyond that, since I don’t use RabbitMQ.
You may want to mount a persistent volume on the RABBITMQ_MNESIA_DIR, though. That probably won’t fix the cluster issue, but it would help when one of them does go down.
Good luck!

I admittedly don’t use RabbitMQ, either, but…

I think I would try static configuration for now, even if that’s not ideal in the long run.

The node names in your screenshot look very odd for Erlang (e.g., rabbit@fly-local-6pn). I don’t know if the DNS infrastructure is really intended for reverse lookups…

The Elixir documentation gives an example of a working cluster—although that’s not the only way to do things. (Probably you want to use <machine_id>.vm.<appname>.internal instead of numeric literals, for example.)
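As a rough illustration of the static approach, the sketch below prints rabbitmq.conf lines for RabbitMQ's classic_config peer discovery, using <machine_id>.vm.<appname>.internal hostnames. The function name print_classic_config and the machine IDs in the usage note are made up for the example, and it assumes every node is actually started under the same long name that is listed here.

```shell
#!/bin/sh
# Sketch: emit rabbitmq.conf lines for static (classic_config) peer discovery.
# Usage: print_classic_config <appname> <machine_id>...
# print_classic_config is a hypothetical helper, not part of RabbitMQ or flyctl.
print_classic_config() {
  app="$1"
  shift
  echo "cluster_formation.peer_discovery_backend = classic_config"
  i=1
  for id in "$@"; do
    # One entry per node; the name must match the node's own long name.
    echo "cluster_formation.classic_config.nodes.${i} = rabbit@${id}.vm.${app}.internal"
    i=$((i + 1))
  done
}
```

Running print_classic_config rabbitmq-dv aaaa1111 bbbb2222 (placeholder machine IDs) and appending the output to rabbitmq.conf in place of the two cluster_formation.dns lines would give a fixed node list. The list has to be regenerated whenever machines are replaced, which is why this is only a stopgap.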

(It’s funny how clustering is categorized under “The Basics” in the Elixir part of the site, incidentally.)

I do have some more configuration flags to force IPv6. I’m not sure which of these are necessary, since I only got RabbitMQ running by trial and error, but I ended up with the following additional lines in my Dockerfile:

ENV RABBITMQ_CTL_ERL_ARGS="-proto_dist inet6_tcp --longnames"
ENV RABBITMQ_USE_LONGNAME=true

in erl_inetrc:

{inet6,true}.

and in the RabbitMQ config (I do have the listeners.tcp.default commented out for some reason I don’t remember):

listeners.tcp.1 = :::5672

and in my entrypoint.sh, since I needed to set the hostname:

# Set the RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS environment variable
export RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="-kernel inetrc '/etc/rabbitmq/erl_inetrc' -proto_dist inet6_tcp -name rabbit@${HOSTNAME}.vm.<appname>.internal -setcookie ${ERL_COOKIE}"
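
For anyone assembling a complete entrypoint.sh around that line, here is a minimal sketch of how it might be laid out. It is not the exact script from this setup: the helper names node_name, erl_args and main are invented for the example, it assumes the app name is available in FLY_APP_NAME and the cookie in ERL_COOKIE, and it assumes the base image's docker-entrypoint.sh is on the PATH.

```shell
#!/bin/sh
# Sketch of an entrypoint.sh; helper names are hypothetical.

# Build the long node name: rabbit@<machine_id>.vm.<appname>.internal
node_name() {
  # $1 = machine ID (assumed to be the machine's hostname), $2 = app name
  echo "rabbit@${1}.vm.${2}.internal"
}

# Assemble the extra Erlang arguments used to force IPv6 distribution.
erl_args() {
  # $1 = node name, $2 = Erlang cookie
  echo "-kernel inetrc '/etc/rabbitmq/erl_inetrc' -proto_dist inet6_tcp -name ${1} -setcookie ${2}"
}

main() {
  # Fail early if the cookie or app name is missing (both are assumptions).
  : "${ERL_COOKIE:?ERL_COOKIE must be set}"
  : "${FLY_APP_NAME:?FLY_APP_NAME must be set}"
  name="$(node_name "${HOSTNAME:-$(hostname)}" "${FLY_APP_NAME}")"
  RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="$(erl_args "${name}" "${ERL_COOKIE}")"
  export RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS
  # Hand over to the image's stock entrypoint (assumed location on PATH).
  exec docker-entrypoint.sh rabbitmq-server
}

# In a real entrypoint.sh the last line would call: main
```

Keeping the name and argument construction in small functions makes it easy to echo the result and compare it with what rabbitmqctl reports on a running node.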