RabbitMQ on Fly.io

Hey All!

Trying to run a docker container with RabbitMQ (rabbitmq:3-management-alpine) but I’m getting a ton of console messages saying:

2022-06-05T18:59:27.869 app[e8ac3129] ewr [info] Reaped child process with pid: 1034, exit code: 0
2022-06-05T18:59:27.869 app[e8ac3129] ewr [info] Reaped child process with pid: 1036, exit code: 0
2022-06-05T18:59:28.871 app[e8ac3129] ewr [info] Reaped child process with pid: 1055 and signal: SIGUSR1, core dumped? false

Here’s my dockerfile:

FROM rabbitmq:3-management-alpine
COPY ./rabbitmq_delayed_message_exchange-3.10.2.ez /opt/rabbitmq/plugins/
COPY ./prod.conf /etc/rabbitmq/rabbitmq.conf
RUN rabbitmq-plugins enable rabbitmq_delayed_message_exchange

prod.conf

# rabbitmq config file
listeners.tcp.default = 5672
default_user = user
default_pass = xx
log.file.level = warning
log.console.level = warning

and my fly.toml

# fly.toml file generated for example-rabbitmq on 2022-06-05T18:50:00+02:00

app = "example-rabbitmq"


[build]
  dockerfile = "Dockerfile"

# rabbitmq main
[[services]]
  http_checks = []
  internal_port = 5672
  protocol = "tcp"
  script_checks = []

  [[services.tcp_checks]]
    grace_period = "1s"
    interval = "15s"
    restart_limit = 0
    timeout = "2s"


# rabbitmq admin
[[services]]
  http_checks = []
  internal_port = 15672
  protocol = "tcp"
  script_checks = []

  [[services.ports]]
    handlers = ["http", "tls"]
    port = "15672"

  [[services.tcp_checks]]
    grace_period = "1s"
    interval = "15s"
    restart_limit = 0
    timeout = "2s"

Everything seems to be working fine (I can connect and view the admin site), but that error message is freaking me out… should I ignore it?

If everything is working, this is fine. Process reaping sounds scary but isn’t really. When a child process exits, as many do in the normal course of operation, it still has a PID and an entry in the kernel process table until its parent collects its exit status. Processes that no longer do anything but are still technically there are called zombie processes. Reaping is just what it’s called when those processes get fully removed and cleaned out. If the process reaper were reclaiming processes it shouldn’t, or child processes were dying prematurely and breaking things, that would be a problem. Since it is all working, it sounds like it’s just going about things normally, so I wouldn’t worry.
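
To make the zombie/reap cycle described above concrete, here is a minimal Python sketch (Unix only). It is an illustration of the general mechanism, not what Fly's init process actually runs; the function name `spawn_and_reap` is made up for this example.

```python
import os
import time


def spawn_and_reap():
    """Fork a child that exits immediately, then reap it from the parent."""
    pid = os.fork()
    if pid == 0:
        # Child: exit right away with status 0, skipping parent cleanup handlers.
        os._exit(0)

    # Parent: give the child a moment to exit. Until the parent waits on it,
    # the exited child remains in the process table as a zombie.
    time.sleep(0.1)

    # Reaping: waitpid collects the exit status and frees the table entry.
    reaped_pid, status = os.waitpid(pid, 0)
    return pid, reaped_pid, os.WEXITSTATUS(status)


if __name__ == "__main__":
    pid, reaped_pid, code = spawn_and_reap()
    print(f"Reaped child process with pid: {reaped_pid}, exit code: {code}")
```

An init process does the same thing for every child that exits inside the container, which is why a broker that spawns short-lived helper processes produces a steady stream of these log lines.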

Just wanted to add a note in case others run into issues using RabbitMQ on Fly (read: with Docker). RabbitMQ’s data storage path is based on the hostname, and Fly changes the hostname on each deployment, so you should pin the data directory by setting RABBITMQ_MNESIA_DIR as an environment variable:

[env]
  RABBITMQ_DEFAULT_USER = "myuser"
  RABBITMQ_MNESIA_DIR = "/var/lib/rabbitmq/mnesia/data"
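
A rough Python model of why pinning the directory helps. It assumes the usual defaults, where the node is named `rabbit@<short hostname>` and the data directory is `<mnesia base>/<node name>`; the function `default_mnesia_dir` and the two hostnames in the usage lines are hypothetical, and the real resolution is done by RabbitMQ's own startup scripts.

```python
import os
import socket


def default_mnesia_dir(env=None, hostname=None):
    """Approximate where RabbitMQ keeps node data (simplified model)."""
    env = os.environ if env is None else env
    # An explicit RABBITMQ_MNESIA_DIR wins over the hostname-derived default.
    if "RABBITMQ_MNESIA_DIR" in env:
        return env["RABBITMQ_MNESIA_DIR"]
    hostname = hostname or socket.gethostname()
    # The default node name uses the short hostname.
    node = f"rabbit@{hostname.split('.')[0]}"
    base = env.get("RABBITMQ_MNESIA_BASE", "/var/lib/rabbitmq/mnesia")
    return f"{base}/{node}"


if __name__ == "__main__":
    # Two hypothetical hostnames, as if from two successive deployments.
    print(default_mnesia_dir(env={}, hostname="e8ac3129"))
    print(default_mnesia_dir(env={}, hostname="0f3b9a21"))
    pinned = {"RABBITMQ_MNESIA_DIR": "/var/lib/rabbitmq/mnesia/data"}
    print(default_mnesia_dir(env=pinned, hostname="0f3b9a21"))
```

Without the variable, each new hostname points the broker at a fresh, empty directory; with it, every deployment resolves to the same path.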

@jbergstroem @franzwarning Were you folks successful in your RabbitMQ deploy/management? I’m about to roll out a Rabbit service on fly and wondered if you folks ran into any gotchas or what your setup looks like now.

@DAlperin I followed something quite similar. My Dockerfile:

FROM rabbitmq:3.9-management

COPY prod.conf /etc/rabbitmq/rabbitmq.conf

RUN apt-get -o Acquire::Check-Date=false update && apt-get install -y curl

RUN curl -L https://github.com/rabbitmq/rabbitmq-delayed-message-exchange/releases/download/3.9.0/rabbitmq_delayed_message_exchange-3.9.0.ez > $RABBITMQ_HOME/plugins/rabbitmq_delayed_message_exchange-3.9.0.ez

RUN chown rabbitmq:rabbitmq $RABBITMQ_HOME/plugins/rabbitmq_delayed_message_exchange-3.9.0.ez

RUN rabbitmq-plugins enable rabbitmq_delayed_message_exchange

and my fly.toml:

app = "rabbitmq-app"
kill_signal = "SIGINT"
kill_timeout = 5
processes = []

[env]
  RABBITMQ_MNESIA_DIR = "/var/lib/rabbitmq/mnesia/data"

[experimental]
  allowed_public_ports = []
  auto_rollback = true

[[services]]
  http_checks = []
  internal_port = 5672
  processes = ["app"]
  protocol = "tcp"
  script_checks = []
  
  [[services.tcp_checks]]
    grace_period = "1s"
    interval = "15s"
    restart_limit = 0
    timeout = "2s"

# rabbitmq admin
[[services]]
  http_checks = []
  internal_port = 15672
  protocol = "tcp"
  script_checks = []

  [[services.ports]]
    handlers = ["http", "tls"]
    port = "15672"

  [[services.tcp_checks]]
    grace_period = "1s"

It all seems fine when I deploy the app. I am also able to access the web-based RabbitMQ admin using app-hostname:5672, signing in with the user credentials I created. However, I can’t seem to connect to the queue via the standard connection URI - amqp://user:pass@host:port/.
I get this error when I try connecting from my code base:

ERROR/MainProcess] consumer: Cannot connect to amqp://user:**@fly-app-hostname.fly.dev:5672//: timed out.

Testing with pika throws:

pika.exceptions.IncompatibleProtocolError: StreamLostError: ("Stream connection lost: ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None)",)

I wonder if I might be missing something else here.
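
One way to narrow this down without pika is to probe the port directly. The sketch below, using only the Python standard library, sends the 8-byte AMQP 0-9-1 protocol header and reports whether the connection times out, is reset, or gets a frame back. The function name `probe_amqp` and its result labels are invented for this example, and the host you pass in is whatever your app is reachable at; it does not diagnose the cause, it only shows at which stage the connection fails.

```python
import socket

# Protocol header an AMQP 0-9-1 client sends first: "AMQP" 0 0 9 1.
AMQP_0_9_1_HEADER = b"AMQP\x00\x00\x09\x01"


def probe_amqp(host, port=5672, timeout=5.0):
    """Send the AMQP 0-9-1 header and classify what comes back."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(AMQP_0_9_1_HEADER)
            reply = sock.recv(8)
    except socket.timeout:
        return "timeout"            # nothing reachable on that host/port
    except ConnectionError:
        return "reset-or-refused"   # something closed the connection on us
    except OSError as exc:
        return f"error: {exc}"      # DNS failure, unreachable network, etc.
    if not reply:
        return "closed"             # peer closed without sending anything
    if reply.startswith(b"AMQP"):
        return "version-mismatch"   # broker answered with its own header
    if reply[0] == 1:
        return "amqp-ok"            # frame type 1: a method frame came back
    return "not-amqp"               # something answered, but not in AMQP


if __name__ == "__main__":
    # Replace with your own app hostname.
    print(probe_amqp("fly-app-hostname.fly.dev", 5672))
```

A `timeout` result matches the first error above, while `reset-or-refused`, `closed`, or `not-amqp` would be consistent with the pika error, where the connection is lost during the protocol handshake.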