release_command is run 2–3 times then the deploy fails, "No deployment available to monitor"

Hello, I might have the same problem as this comment. I'm starting a new thread because the OP there had a different issue.

When I include a release_command in fly.toml, the command seems to be run twice (three times in GitHub Actions). After it exits successfully, the deployment hangs at the “Monitoring deployment” stage and is eventually killed.

Here’s the output:

--> Building image done
==> Pushing image to fly
The push refers to repository [registry.fly.io/myapp-pr-24-backend]
5acc1c21964d: Pushed
34b8be05b72f: Pushed
a1673408f5c4: Pushed
51ae9ee994e3: Pushed
04e84c1644a6: Pushed
0af5a454b76f: Pushed
974d10b0374d: Layer already exists
aa94f2805e8b: Layer already exists
556c468853f4: Layer already exists
2c9f341968bc: Layer already exists
ad6562704f37: Layer already exists
deployment-1654610280: digest: sha256:494d23c3da200544a73fc5552abed0f654351bd38bd76c84e3e22ff9a7f23f04 size: 2635
--> Pushing image done
image: registry.fly.io/myapp-pr-24-backend:deployment-1654610280
image size: 846 MB
==> Creating release
--> release v1 created

--> You can detach the terminal anytime without stopping the deployment
==> Release command detected: alembic upgrade head

--> This release will not be available until the release command succeeds.
         Starting instance
         Configuring virtual machine
         Pulling container image
         Unpacking image
         Preparing kernel init
         Configuring firecracker
         Starting init (commit: e3eb6d2)...
         Setting up swapspace version 1, size = 512 MiB (536866816 bytes)
         no label, UUID=480d498c-4979-41e1-9acd-d6e5d85cb37f
         Preparing to run: `alembic upgrade head` as root
         2022/06/07 14:07:02 listening on [fdaa:0:68f3:a7b:a160:a19d:5cab:2]:22 (DNS: [fdaa::3]:53)
         2022-06-07 14:07.02.930936 [info     ] Logging initialized
         INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.
         INFO  [alembic.runtime.migration] Will assume transactional DDL.
         Starting clean up.
         Starting instance
         Configuring virtual machine
         Pulling container image
         Unpacking image
         Preparing kernel init
         Configuring firecracker
         Starting virtual machine
         Starting init (commit: e3eb6d2)...
         Setting up swapspace version 1, size = 512 MiB (536866816 bytes)
         no label, UUID=480d498c-4979-41e1-9acd-d6e5d85cb37f
         Preparing to run: `alembic upgrade head` as root
         2022/06/07 14:07:02 listening on [fdaa:0:68f3:a7b:a160:a19d:5cab:2]:22 (DNS: [fdaa::3]:53)
         2022-06-07 14:07.02.930936 [info     ] Logging initialized
         INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.
         INFO  [alembic.runtime.migration] Will assume transactional DDL.
         Main child exited normally with code: 0
         Starting clean up.
==> Monitoring deployment
Error 1 error occurred:
        * No deployment available to monitor
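
For anyone comparing their own logs: a minimal sketch for checking whether the release command really started more than once. It assumes the log wording shown above ("Preparing to run: ..." and the "listening on" line); other flyctl versions may print something different, and `summarize_runs` is a hypothetical helper name. In the output above both blocks carry the same timestamp, which may mean the log was replayed rather than the command re-run, but that is only a guess.

```python
import re

# Line printed each time the VM starts the release command
# (wording copied from the log above; treat it as an assumption).
RUN_MARKER = re.compile(r"Preparing to run: `([^`]+)`")

# Timestamped "listening on" line printed once per VM boot in the log above.
LISTEN_LINE = re.compile(
    r"^[ \t]*(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) listening on", re.MULTILINE
)


def summarize_runs(log_text: str, command: str) -> dict:
    """Count start markers for `command` and distinct boot timestamps."""
    starts = sum(1 for cmd in RUN_MARKER.findall(log_text) if cmd == command)
    boot_times = set(LISTEN_LINE.findall(log_text))
    return {"starts": starts, "distinct_boot_times": len(boot_times)}
```

If `starts` is larger than `distinct_boot_times`, the repeated blocks share a boot timestamp (one-second resolution), which hints at duplicated log output rather than separate runs. It does not prove it.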

Here’s my fly.toml:

# fly.toml file generated for myapp on 2022-05-25T21:12:53+02:00

app = "myapp"

kill_signal = "SIGINT"
kill_timeout = 5
processes = []

[env]
  APP_ENV = "production"

[experimental]
  allowed_public_ports = []
  auto_rollback = true

[[services]]
  http_checks = []
  internal_port = 8000
  processes = ["app"]
  protocol = "tcp"
  script_checks = []

  [services.concurrency]
    hard_limit = 25
    soft_limit = 20
    type = "connections"

  [[services.ports]]
    force_https = true
    handlers = ["http"]
    port = 80

  [[services.ports]]
    handlers = ["tls", "http"]
    port = 443

  [[services.tcp_checks]]
    grace_period = "1s"
    interval = "15s"
    restart_limit = 0
    timeout = "2s"

[deploy]
  release_command = "alembic upgrade head"

Here’s the Dockerfile, just in case that’s useful:

FROM python:3.10.4-slim

ARG APP_ENV

ENV APP_ENV=${APP_ENV} \
  PYTHONFAULTHANDLER=1 \
  PYTHONUNBUFFERED=1 \
  PYTHONHASHSEED=random \
  PIP_NO_CACHE_DIR=off \
  PIP_DISABLE_PIP_VERSION_CHECK=on \
  PIP_DEFAULT_TIMEOUT=100 \
  POETRY_VERSION=1.0.0

# In order to be able to install Postgres client libraries
RUN apt-get update && apt-get install -y libpq-dev gcc git

RUN pip install --no-cache-dir --upgrade pip && \
  pip install --no-cache-dir "poetry==$POETRY_VERSION"

# Only copy Poetry files to get more cache hits at this layer
WORKDIR /code
COPY poetry.lock pyproject.toml /code/
# Use a single "=" in test: "==" is not portable across /bin/sh implementations
RUN poetry config virtualenvs.create false && \
  poetry install $(test "$APP_ENV" = production && echo "--no-dev") --no-interaction --no-ansi

COPY . /code

CMD [ "./scripts/docker/start-backend.sh"]

Hey there, I know it’s been a few days. Did you get this resolved?
It does look like your deploy is working; you should be able to run fly status to see your new VMs coming up.
This happens when flyctl waits to detect a Nomad deployment and sometimes fails to find one.
We are working on replacing our current architecture, and these problems will go away sometime this year.

Hey @zee, do you mean that the release_command is run successfully and then the app is deployed immediately afterwards? I.e. that the “No deployment available to monitor” error doesn’t actually affect the code going live?

We have been forced to remove the release_command from our fly.toml and run database migrations manually, which might be why you see that the app exists and is running.

Is there any other workaround while you get the underlying issue fixed?

“No deployment available to monitor” means that the CLI can’t find a new deployment. If the release command succeeds, the deployment still proceeds on the server; the CLI just doesn’t know about it.
To check this, run the deployment with the release command, then run fly status to confirm that the VMs were updated.
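
A rough sketch of that distinction, for scripts that capture the deploy output: it separates "the CLI lost track of the deployment" from a real failure. The two marker strings are copied from the log earlier in this thread; `DeployOutcome` and `classify_deploy` are hypothetical names, and this is a heuristic rather than anything flyctl provides.

```python
from enum import Enum


class DeployOutcome(Enum):
    OK = "ok"
    MONITOR_ONLY = "monitor_only"  # release command succeeded, CLI found nothing to monitor
    FAILED = "failed"


# Strings as they appear in the log earlier in this thread (assumed stable).
MONITOR_ERROR = "No deployment available to monitor"
RELEASE_OK = "Main child exited normally with code: 0"


def classify_deploy(exit_code: int, output: str) -> DeployOutcome:
    """Classify a finished deploy from its exit code and captured output."""
    if exit_code == 0:
        return DeployOutcome.OK
    # Only downgrade the failure if the release command itself reported success.
    if MONITOR_ERROR in output and RELEASE_OK in output:
        return DeployOutcome.MONITOR_ONLY
    return DeployOutcome.FAILED
```

A `MONITOR_ONLY` result still needs a look at fly status before trusting that the new VMs are up.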

Furthermore, you can use fly deploy --strategy immediate, which replaces all VMs instantly but won’t create a “deploy” record to monitor. In other words, an immediate deploy behaves the same way: the VMs are updated, but there is nothing for the CLI to monitor.

Hey @zee! I’m also having this issue. What should I do in a CI/CD context for bluegreen deploys? The deployment appears healthy, so I want the build to pass, but the command eventually fails, so the build fails.
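
One possible stopgap for CI, sketched under assumptions and not an official recommendation: wrap the deploy, and if the only error is the monitoring one, run fly status and print it instead of failing right away. `deploy_for_ci` and `run_cli` are hypothetical names; the only CLI calls used are fly deploy --strategy and fly status, both mentioned above. Note that a zero exit code from fly status only means the command ran, not that the VMs are healthy, so the printed status still needs checking.

```python
import subprocess
from typing import Callable, Sequence, Tuple

# A runner takes an argv list and returns (exit_code, combined_output).
Runner = Callable[[Sequence[str]], Tuple[int, str]]

MONITOR_ERROR = "No deployment available to monitor"  # string from the log above


def run_cli(argv: Sequence[str]) -> Tuple[int, str]:
    """Run a command and capture stdout and stderr together."""
    proc = subprocess.run(
        argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    return proc.returncode, proc.stdout


def deploy_for_ci(strategy: str = "bluegreen", run: Runner = run_cli) -> int:
    """Deploy; on the monitoring-only error, fall back to printing `fly status`."""
    code, output = run(["fly", "deploy", "--strategy", strategy])
    print(output)
    if code == 0:
        return 0
    if MONITOR_ERROR not in output:
        return code  # a different failure: keep the build red
    # The CLI lost track of the deployment; show the current state for review.
    status_code, status_output = run(["fly", "status"])
    print(status_output)
    return status_code
```

In a pipeline this would be called as `sys.exit(deploy_for_ci())`. The `run` parameter exists so the logic can be exercised without touching a real app.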