Previously successful deploy hangs at "Monitoring Deployment"

A deploy that previously worked is now hanging at “Monitoring Deployment”.

These are the recent changes to my environment:

  1. I was prompted to upgrade to a new flyctl, which I have done. OS is Windows 11.
    a. Possibly related: During a deploy, there was a warning about agent version differences. It indicated it was shutting down the old version and starting the new one. I did also reboot, just in case.

  2. I wanted to have a prod and a qa instance. So, following the example from the Fly Docs page "Monorepo and Multi-Environment Deployments", I created a new app. When I try to deploy it by referencing its toml file, it also hangs. When I diff the two toml files, the only differences are the app name and the generation timestamp.
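
For anyone who wants to repeat the comparison of the two toml files, here is a minimal sketch in Python. The file contents below are illustrative placeholders (only the two app names come from this post), and `config_diff` is a hypothetical helper, not part of flyctl. It ignores comment lines, so the "generated on" header does not show up as a difference.

```python
import difflib


def config_diff(a_text: str, b_text: str) -> list[str]:
    """Return the added/removed lines between two configs, ignoring comments and blanks."""

    def meaningful(text: str) -> list[str]:
        return [
            line.rstrip()
            for line in text.splitlines()
            if line.strip() and not line.lstrip().startswith("#")
        ]

    diff = difflib.unified_diff(meaningful(a_text), meaningful(b_text), lineterm="", n=0)
    # Keep only changed lines; drop the "---"/"+++" file headers and "@@" hunk markers.
    return [
        line
        for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


# Placeholder contents: only the app names are taken from the post.
PROD = """\
# fly.toml file generated for basecamp-trial (timestamp omitted)
app = "basecamp-trial"
kill_signal = "SIGINT"
"""

QA = """\
# fly.toml file generated for basecamp-trial-qa (timestamp omitted)
app = "basecamp-trial-qa"
kill_signal = "SIGINT"
"""

for line in config_diff(PROD, QA):
    print(line)
```

If the only lines printed are the two `app = ...` lines, the configs differ in name only, which matches what I saw.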

When I run flyctl ls apps it shows:

  NAME                         | STATUS  | ORG      | DEPLOYED
-------------------------------*---------*----------*-----------------
  basecamp-trial               | running | personal | 10 minutes ago
  basecamp-trial-qa            | error   | personal | 6 minutes ago
  fly-builder-spring-lake-5681 | pending | personal |

When deploying the original (basecamp-trial), it eventually times out with this error (a few preceding lines are included for context):

--> You can detach the terminal anytime without stopping the deployment
==> Monitoring deployment
Error 1 error occurred:
        * No deployment available to monitor

If relevant, I noticed that when I went to deploy basecamp-trial-qa using its toml file, it seemed to use some layers from basecamp-trial.

basecamp-trial seems to have deployed, but the deployment is unavailable to monitor.

For the -qa app, the deployment errored. Here are its logs:

2022-07-27T20:17:22Z runner[6c68f47c] ord [info]Starting instance
2022-07-27T20:17:23Z runner[6c68f47c] ord [info]Configuring virtual machine
2022-07-27T20:17:23Z runner[6c68f47c] ord [info]Pulling container image
2022-07-27T20:17:23Z runner[6c68f47c] ord [info]Unpacking image
2022-07-27T20:17:23Z runner[6c68f47c] ord [info]Preparing kernel init
2022-07-27T20:17:23Z runner[6c68f47c] ord [info]Configuring firecracker
2022-07-27T20:17:23Z runner[6c68f47c] ord [info]Starting virtual machine
2022-07-27T20:17:23Z app[6c68f47c] ord [info]Starting init (commit: c86b3dc)...
2022-07-27T20:17:23Z app[6c68f47c] ord [info]Preparing to run: `gunicorn --bind :8080 --workers 2 web:app` as root
2022-07-27T20:17:23Z app[6c68f47c] ord [info]2022/07/27 20:17:23 listening on [fdaa:0:7a11:a7b:9adb:6c68:f47c:2]:22 (DNS: [fdaa::3]:53)
2022-07-27T20:17:24Z app[6c68f47c] ord [info][2022-07-27 20:17:24 +0000] [515] [INFO] Starting gunicorn 20.1.0
2022-07-27T20:17:24Z app[6c68f47c] ord [info][2022-07-27 20:17:24 +0000] [515] [INFO] Listening at: http://0.0.0.0:8080 (515)
2022-07-27T20:17:24Z app[6c68f47c] ord [info][2022-07-27 20:17:24 +0000] [515] [INFO] Using worker: sync
2022-07-27T20:17:24Z app[6c68f47c] ord [info][2022-07-27 20:17:24 +0000] [522] [INFO] Booting worker with pid: 522
2022-07-27T20:17:24Z app[6c68f47c] ord [info][2022-07-27 20:17:24 +0000] [523] [INFO] Booting worker with pid: 523
2022-07-27T20:18:02Z runner[775745be] ord [info]Shutting down virtual machine
2022-07-27T20:18:02Z app[775745be] ord [info]Sending signal SIGINT to main child process w/ PID 515
2022-07-27T20:18:02Z app[775745be] ord [info][2022-07-27 20:18:02 +0000] [515] [INFO] Handling signal: int
2022-07-27T20:18:02Z app[775745be] ord [info][2022-07-27 20:18:02 +0000] [523] [INFO] Worker exiting (pid: 523)
2022-07-27T20:18:02Z app[775745be] ord [info][2022-07-27 20:18:02 +0000] [522] [INFO] Worker exiting (pid: 522)
2022-07-27T20:18:03Z app[775745be] ord [info][2022-07-27 20:18:03 +0000] [515] [INFO] Shutting down: Master
2022-07-27T20:18:03Z app[775745be] ord [info]Main child exited normally with code: 0
2022-07-27T20:18:03Z app[775745be] ord [info]Starting clean up.
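
One detail in the logs above: the startup lines carry the ID `6c68f47c`, while the shutdown lines carry `775745be`. A rough sketch for grouping log lines by that bracketed ID is below. The regular expression is an assumption based only on the line format shown in this post, and `group_by_instance` is a hypothetical helper. The sample uses four lines copied from the logs above.

```python
import re
from collections import defaultdict

# Assumed line shape: "<timestamp> <source>[<id>] <region> [<level>]<message>"
LINE = re.compile(
    r"^(?P<ts>\S+) (?P<src>\w+)\[(?P<id>[0-9a-f]+)\] (?P<region>\w+) \[(?P<level>\w+)\](?P<msg>.*)$"
)


def group_by_instance(log_text: str) -> dict[str, list[str]]:
    """Map each bracketed ID to its messages, in the order they appear."""
    groups: dict[str, list[str]] = defaultdict(list)
    for line in log_text.splitlines():
        match = LINE.match(line.strip())
        if match:
            groups[match["id"]].append(match["msg"])
    return dict(groups)


SAMPLE = """\
2022-07-27T20:17:22Z runner[6c68f47c] ord [info]Starting instance
2022-07-27T20:17:24Z app[6c68f47c] ord [info][2022-07-27 20:17:24 +0000] [515] [INFO] Starting gunicorn 20.1.0
2022-07-27T20:18:02Z runner[775745be] ord [info]Shutting down virtual machine
2022-07-27T20:18:03Z app[775745be] ord [info]Main child exited normally with code: 0
"""

groups = group_by_instance(SAMPLE)
for instance_id, messages in groups.items():
    print(instance_id, len(messages), "lines; first:", messages[0])
```

This only makes the two IDs easier to see side by side. It does not explain why they differ.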

Possible causes that I can think of, in descending order of probability:

  1. I did something wrong creating the second app that blew everything up.

  2. There is something about the new flyctl that is causing the issue.

  3. There is a platform error.

Please halp 🙂

You may have already seen this, but based on the timing of these errors, it sounds like this could be related to an incident that we’re looking into. We’ll continue to report on it via our status page, which you can subscribe to here:

Thanks, @eli. Confirmed that it works, now that the platform issue is resolved.
