So I've been able to push my local code to my GitHub repository and use continuous deployment to keep the app up to date. Since last night, though, I keep getting this error whenever I try to deploy, whether via a PR or manually with fly deploy:
Running game-site release_command: /app/bin/migrate
Starting machine
error starting release_command machine: failed to start VM 185e159a223128: aborted: machine destroyed, cannot add any more events
-------
✖ Failed: failed to start VM 185e159a223128: aborted: machine destroyed, cannot add any more events
-------
Error: release command failed - aborting deployment. failed to start VM 185e159a223128: aborted: machine destroyed, cannot add any more events (Request ID: 01JSYR189S4Y3P6JJ8XZBAWDD3-sjc) (Trace ID: 03fad0e974f2d2c5bcf5777510b7130a)
Here are the live logs:
2025-04-28T17:51:06.680 runner[4d89664a452578] sjc [info] Successfully prepared image registry.fly.io/game-site:deployment-01JSYRBE1KYZCZ05MARPCN9ZH4 (2.499402566s)
2025-04-28T17:51:13.000 runner[4d89664a452578] sjc [info] Configuring firecracker
2025-04-28T17:51:16.319 app[4d89664a452578] sjc [info] 2025-04-28T17:51:16.319403773 [01JSYRC7XT51KM5R9MF4PSS4FQ:main] Running Firecracker v1.7.0
2025-04-28T17:51:17.213 app[4d89664a452578] sjc [info] INFO Starting init (commit: d15e62a13)...
2025-04-28T17:51:17.362 app[4d89664a452578] sjc [info] INFO Preparing to run: `/app/bin/migrate` as nobody
2025-04-28T17:51:17.367 app[4d89664a452578] sjc [info] ERROR Error: failed to spawn command: /app/bin/migrate: No such file or directory (os error 2)
2025-04-28T17:51:17.368 app[4d89664a452578] sjc [info] does `/app/bin/migrate` exist and is it executable?
2025-04-28T17:51:17.369 app[4d89664a452578] sjc [info] [ 0.981954] reboot: Restarting system
2025-04-28T17:51:17.458 app[4d89664a452578] sjc [warn] Virtual machine exited abruptly
2025-04-28T17:51:17.504 runner[4d89664a452578] sjc [info] machine restart policy set to 'no', not restarting
Happy to give more information as needed, but this will be the 15th time that I have pushed to the GitHub repo, and every other time it worked fine. Any help would be great.
Would it have anything to do with permissions on the files? I've tried to see if there is any difference between the files from when I first deployed and now, and I can't find any real differences; even if I revert those files to previous commits, it still doesn't work. Thanks for the quick feedback, though.
It could! That’s what the “is it executable?” part of the message was referring to, in fact: the x bit in the file permissions. (For the user named nobody, in this instance.)
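As a quick illustration of that x bit, with a throwaway file rather than anything from your actual repo:

```shell
#!/bin/sh
# Demonstrate how the execute (x) bit decides whether a script can be run.
tmp=$(mktemp)
printf '#!/bin/sh\necho ok\n' > "$tmp"

chmod 644 "$tmp"   # rw-r--r-- : no x bit anywhere; exec fails for every user
if [ -x "$tmp" ]; then echo executable; else echo not-executable; fi

chmod 755 "$tmp"   # rwxr-xr-x : x bit set for owner, group, and others
if [ -x "$tmp" ]; then echo executable; else echo not-executable; fi

rm -f "$tmp"
```

Inside the machine, ls -l /app/bin/migrate shows the same bits; -rwxr-xr-x is what you want to see there.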
At times like this, I usually try changing the CMD to sleep inf temporarily and then fly ssh console to poke around in the filesystem and see what is actually there:
$ fly deploy # after changing CMD.
$ fly m start # ensure at least one running.
$ fly ssh console
# ls -l /app/bin/migrate
# od -c /app/bin/migrate # will show offending CRs explicitly as `\r`.
(It can also help to disable auto-stop during this debugging phase, to keep the machine from shutting down in the middle of your SSH session.)
0000000   #   !   /   b   i   n   /   s   h  \n   s   e   t       -   e
0000020   u  \n  \n   c   d       -   P       -   -       "   $   (   d
0000040   i   r   n   a   m   e       -   -       "   $   0   "   )   "
0000060  \n   e   x   e   c       .   /   g   a   m   e   _   s   i   t
0000100   e       e   v   a   l       G   a   m   e   S   i   t   e   .
0000120   R   e   l   e   a   s   e   .   m   i   g   r   a   t   e  \n
0000140
od -c /app/bin/migrate
0000000   #   !   /   b   i   n   /   s   h  \n   s   e   t       -   e
0000020   u  \n  \n   c   d       -   P       -   -       "   $   (   d
0000040   i   r   n   a   m   e       -   -       "   $   0   "   )   "
0000060  \n   e   x   e   c       .   /   g   a   m   e   _   s   i   t
0000100   e       e   v   a   l       G   a   m   e   S   i   t   e   .
0000120   R   e   l   e   a   s   e   .   m   i   g   r   a   t   e  \n
0000140
Thanks… I was just about to ask about the working directory…
Those two look ok, actually. Can you run /app/bin/migrate as root, from the SSH session?
This is normally the auto_stop_machines setting in fly.toml, although the 7-day free trial has an undocumented 5-minute time limit (from what I hear), and I don’t think that one can be disabled.
Huh… The good news is that you should be able to change your release_command, etc., to incorporate that runuser prefix, and then stay as USER root in your Dockerfile. (That’s roughly how I have my own Elixir project arranged, due to LiteFS.) It really should work without that, though…
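For reference, that arrangement would look roughly like this on the fly.toml side (assuming the same paths as your current setup; on the Dockerfile side you’d drop the USER nobody line, or set USER root explicitly):

```
# fly.toml — run the release command as root, dropping to nobody just for
# the migration itself
[deploy]
  release_command = 'runuser -u nobody -- /app/bin/migrate'
```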
It might help to post your full Dockerfile, even if it’s the one that fly launch auto-generated. That’s where the user-fiddling details would be defined.
Also, for tagging purposes, is this Elixir? (Or maybe Erlang?)
# Find eligible builder and runner images on Docker Hub. We use Ubuntu/Debian
# instead of Alpine to avoid DNS resolution issues in production.
#
# https://hub.docker.com/r/hexpm/elixir/tags?page=1&name=ubuntu
# https://hub.docker.com/_/ubuntu?tab=tags
#
# This file is based on these images:
#
# - https://hub.docker.com/r/hexpm/elixir/tags - for the build image
# - https://hub.docker.com/_/debian?tab=tags&page=1&name=bullseye-20250203-slim - for the release image
# - https://pkgs.org/ - resource for finding needed packages
# - Ex: hexpm/elixir:1.14.5-erlang-26.2.5.8-debian-bullseye-20250203-slim
#
ARG ELIXIR_VERSION=1.14.5
ARG OTP_VERSION=26.2.5.8
ARG DEBIAN_VERSION=bullseye-20250203-slim
ARG BUILDER_IMAGE="hexpm/elixir:${ELIXIR_VERSION}-erlang-${OTP_VERSION}-debian-${DEBIAN_VERSION}"
ARG RUNNER_IMAGE="debian:${DEBIAN_VERSION}"
FROM ${BUILDER_IMAGE} as builder
# install build dependencies
RUN apt-get update -y && apt-get install -y build-essential git \
&& apt-get clean && rm -f /var/lib/apt/lists/*_*
# prepare build dir
WORKDIR /app
# install hex + rebar
RUN mix local.hex --force && \
mix local.rebar --force
# set build ENV
ENV MIX_ENV="prod"
# install mix dependencies
COPY mix.exs mix.lock ./
RUN mix deps.get --only $MIX_ENV
RUN mkdir config
# copy compile-time config files before we compile dependencies
# to ensure any relevant config change will trigger the dependencies
# to be re-compiled.
COPY config/config.exs config/${MIX_ENV}.exs config/
RUN mix deps.compile
COPY priv priv
COPY lib lib
COPY assets assets
# compile assets
RUN mix assets.deploy
# Compile the release
RUN mix compile
# Changes to config/runtime.exs don't require recompiling the code
COPY config/runtime.exs config/
COPY rel rel
RUN mix release
# start a new build stage so that the final image will only contain
# the compiled release and other runtime necessities
FROM ${RUNNER_IMAGE}
RUN apt-get update -y && \
apt-get install -y libstdc++6 openssl libncurses5 locales ca-certificates \
&& apt-get clean && rm -f /var/lib/apt/lists/*_*
# Set the locale
RUN sed -i '/en_US.UTF-8/s/^# //g' /etc/locale.gen && locale-gen
ENV LANG en_US.UTF-8
ENV LANGUAGE en_US:en
ENV LC_ALL en_US.UTF-8
WORKDIR "/app"
RUN chown nobody /app
# set runner ENV
ENV MIX_ENV="prod"
# Only copy the final release from the build stage
COPY --from=builder --chown=nobody:root /app/_build/${MIX_ENV}/rel/game_site ./
# #Added this line
# RUN chmod 755 /app/bin/*
# #Added this line
# RUN [ -f /app/bin/migrate ] && chmod +x /app/bin/migrate
USER nobody
# If using an environment that doesn't automatically reap zombie processes, it is
# advised to add an init process such as tini via `apt-get install`
# above and adding an entrypoint. See https://github.com/krallin/tini for details
# ENTRYPOINT ["/tini", "--"]
CMD ["/app/bin/server"]
fly.toml
# fly.toml app configuration file generated for game-site on 2025-04-11T18:56:22-07:00
#
# See https://fly.io/docs/reference/configuration/ for information about how to use this file.
#
app = 'game-site'
primary_region = 'sjc'
kill_signal = 'SIGTERM'
[build]
[deploy]
release_command = '/app/bin/migrate'
[env]
PHX_HOST = 'game-site.fly.dev'
PORT = '8080'
[http_service]
internal_port = 8080
force_https = true
auto_stop_machines = 'off'
auto_start_machines = true
min_machines_running = 0
processes = ['app']
[http_service.concurrency]
type = 'connections'
hard_limit = 1000
soft_limit = 1000
[[vm]]
memory = '1gb'
cpu_kind = 'shared'
cpus = 1
2025-04-28T19:47:57.129 runner[2865100f702798] sjc [info] Pulling container image registry.fly.io/game-site:deployment-01JSYZ1DJ5AVFBMWYEY1B6B4RD
2025-04-28T19:48:01.830 runner[2865100f702798] sjc [info] Successfully prepared image registry.fly.io/game-site:deployment-01JSYZ1DJ5AVFBMWYEY1B6B4RD (4.700915395s)
2025-04-28T19:48:03.526 runner[2865100f702798] sjc [info] Configuring firecracker
2025-04-28T19:48:05.982 app[2865100f702798] sjc [info] 2025-04-28T19:48:05.982223047 [01JSYZ28G3RK0XW2C0C7T3BWWX:main] Running Firecracker v1.7.0
2025-04-28T19:48:07.043 app[2865100f702798] sjc [info] INFO Starting init (commit: d15e62a13)...
2025-04-28T19:48:07.184 app[2865100f702798] sjc [info] INFO Preparing to run: `runuser -u nobody -- /app/bin/migrate` as root
2025-04-28T19:48:07.187 app[2865100f702798] sjc [info] INFO [fly api proxy] listening at /.fly/api
2025-04-28T19:48:07.233 runner[2865100f702798] sjc [info] Machine started in 1.352s
2025-04-28T19:48:07.264 app[2865100f702798] sjc [info] runuser: failed to execute /app/bin/migrate: No such file or directory
2025-04-28T19:48:07.501 app[2865100f702798] sjc [info] 2025/04/28 19:48:07 INFO SSH listening listen_address=[fdaa:12:95db:a7b:181:7e5c:e38d:2]:22
2025-04-28T19:48:08.193 app[2865100f702798] sjc [info] INFO Main child exited normally with code: 1
2025-04-28T19:48:08.211 app[2865100f702798] sjc [info] INFO Starting clean up.
2025-04-28T19:48:09.781 app[2865100f702798] sjc [info] WARN could not unmount /rootfs: EINVAL: Invalid argument
2025-04-28T19:48:09.782 app[2865100f702798] sjc [info] [ 3.708968] reboot: Restarting system
2025-04-28T19:48:10.691 runner[2865100f702798] sjc [info] machine restart policy set to 'no', not restarting
and
image: registry.fly.io/game-site:deployment-01JSYZ1DJ5AVFBMWYEY1B6B4RD
image size: 49 MB
Watch your deployment at https://fly.io/apps/game-site/monitoring
Running game-site release_command: runuser -u nobody -- /app/bin/migrate
Starting machine
-------
✖ release_command failed
-------
Error release_command failed running on machine 2865100f702798 with exit code 1.
Checking logs: fetching the last 100 lines below:
2025-04-28T19:48:05Z 2025-04-28T19:48:05.982223047 [01JSYZ28G3RK0XW2C0C7T3BWWX:main] Running Firecracker v1.7.0
2025-04-28T19:48:07Z INFO Starting init (commit: d15e62a13)...
2025-04-28T19:48:07Z INFO Preparing to run: `runuser -u nobody -- /app/bin/migrate` as root
2025-04-28T19:48:07Z INFO [fly api proxy] listening at /.fly/api
2025-04-28T19:48:07Z Machine started in 1.352s
2025-04-28T19:48:07Z runuser: failed to execute /app/bin/migrate: No such file or directory
2025-04-28T19:48:07Z 2025/04/28 19:48:07 INFO SSH listening listen_address=[fdaa:12:95db:a7b:181:7e5c:e38d:2]:22
2025-04-28T19:48:08Z INFO Main child exited normally with code: 1
-------
Error: release command failed - aborting deployment. machine 2865100f702798 exited with non-zero status of 1
Did I miss a line that I should have changed in another file?
So I removed the line, and the rest of the deployment went through, but now the site is down…
--> Build Summary: ()
--> Building image done
image: registry.fly.io/game-site:deployment-01JSYZEGMPPQJF9XR77ZB3C1H3
image size: 49 MB
Watch your deployment at https://fly.io/apps/game-site/monitoring
-------
Updating existing machines in 'game-site' with rolling strategy
-------
✔ [1/2] Cleared lease for 1781997b5e2e08
✔ [2/2] Cleared lease for 080e45dc109068
-------
Checking DNS configuration for game-site.fly.dev
Visit your newly deployed app at https://game-site.fly.dev/
Whereas before, at least the site was still up.
All the machines are down now, too, and they run into issues when trying to start back up.
Yikes, … You should be able to revert† to an older build via fly deploy --image registry.fly.io/<long-string>. (You can get the list of recent ones via fly releases --image.)
Aside: I haven’t been able to reproduce any of these release_command anomalies on my own Machines, despite trying several variations, so I don’t think this is a global glitch…
†Later edit: It appears that you’re back up and running now, but I should insert a caveat for general reference that any incompatible changes to fly.toml, secrets, etc., since the time of that older image have to be manually undone. fly deploy --image only affects the Dockerfile part; there unfortunately isn’t a single command that rolls back all aspects (of an app).
However, if your site is down, but your machines are stable, then at least you have a clear debugging path; you can shell into your container using flyctl, and then see if your listener has died, or run networking tools to see what IP address your listener has attached to.
Thanks for the help! I would have let you know yesterday, but I hit the maximum reply limit for my first day.
It was in fact that, at some point, my migrate script and a few other files in the rel folder got saved with Windows line endings (the ^M carriage returns):
-#!/bin/sh
-
-# configure node for distributed erlang with IPV6 support
-export ERL_AFLAGS="-proto_dist inet6_tcp"
-export ECTO_IPV6="true"
-export DNS_CLUSTER_QUERY="${FLY_APP_NAME}.internal"
-export RELEASE_DISTRIBUTION="name"
-export RELEASE_NODE="${FLY_APP_NAME}-${FLY_IMAGE_REF##*-}@${FLY_PRIVATE_IP}"
-
-# Uncomment to send crash dumps to stderr
-# This can be useful for debugging, but may log sensitive information
-# export ERL_CRASH_DUMP=/dev/stderr
-# export ERL_CRASH_DUMP_BYTES=4096
+#!/bin/sh^M
+^M
+# configure node for distributed erlang with IPV6 support^M
+export ERL_AFLAGS="-proto_dist inet6_tcp"^M
+export ECTO_IPV6="true"^M
+export DNS_CLUSTER_QUERY="${FLY_APP_NAME}.internal"^M
+export RELEASE_DISTRIBUTION="name"^M
+export RELEASE_NODE="${FLY_APP_NAME}-${FLY_IMAGE_REF##*-}@${FLY_PRIVATE_IP}"^M
+^M
+# Uncomment to send crash dumps to stderr^M
+# This can be useful for debugging, but may log sensitive information^M
+# export ERL_CRASH_DUMP=/dev/stderr^M
+# export ERL_CRASH_DUMP_BYTES=4096^M
I went through each commit until I found a working one, then checked its diff against my current (local) repo.
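For anyone who lands here with the same symptom, here is a sketch of a portable way to detect and strip those carriage returns (the file here is a throwaway example; dos2unix does the same job where it’s installed):

```shell
#!/bin/sh
# Make a demo script with Windows (CRLF) line endings.
f=$(mktemp)
printf '#!/bin/sh\r\necho hello\r\n' > "$f"

# Detect: count lines that end in a carriage return.
grep -c "$(printf '\r')$" "$f"    # prints 2 -> CRLF endings present

# Fix in place: strip the trailing CR from every line.
sed -i 's/\r$//' "$f"

grep -c "$(printf '\r')$" "$f"    # prints 0 -> clean LF endings
rm -f "$f"
```

Adding a line like "*.sh text eol=lf" to a .gitattributes file also helps keep Git on Windows from reintroducing CRLF endings on checkout.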