Deployment hanging at "cleanup" stage

Hi all,

My github actions workflow hangs at the flyctl deploy command. The app is being deployed using a pre-built image. It seems to successfully reach the pre-release database migration task but the command then hangs indefinitely at Starting clean up .

When I eventually manually kill the workflow the logs for this step are as follows:

Run flyctl deploy -c fly/fly.staging.toml -i $DOCKER_REGISTRY/$IMAGE_NAME:616c62d

==> Verifying app config
--> Verified app config
==> Building image
Searching for image '<docker-image (redacted)>' locally...
Searching for image '<docker-image (redacted)>' remotely...
image found: img_3xdk4xx33014go0e
==> Creating release
--> release v153 created
--> You can detach the terminal anytime without stopping the deployment
==> Release command detected: /bin/sh /opt/app/
--> This release will not be available until the release command succeeds.
Error: The operation was canceled.

The app is never deployed, despite the logs indicating it reached the cleanup stage.

Logs from the app itself:

2022-10-28T08:16:58.946 app[d448e2fe] lhr [info] 08:16:58.946 request_id=FyItqbCcVUiZiQgAaiLx [info] Sent 200 in 29ms
2022-10-28T08:30:59.604 runner[cac6d0e4] lhr [info] Starting instance
2022-10-28T08:31:04.652 runner[cac6d0e4] lhr [info] Configuring virtual machine
2022-10-28T08:31:04.659 runner[cac6d0e4] lhr [info] Pulling container image
2022-10-28T08:31:05.482 runner[cac6d0e4] lhr [info] Unpacking image
2022-10-28T08:31:05.493 runner[cac6d0e4] lhr [info] Preparing kernel init
2022-10-28T08:31:06.018 runner[cac6d0e4] lhr [info] Configuring firecracker
2022-10-28T08:31:06.019 runner[cac6d0e4] lhr [info] Starting virtual machine
2022-10-28T08:31:06.317 app[cac6d0e4] lhr [info] Starting init (commit: 249766e)...
2022-10-28T08:31:06.360 app[cac6d0e4] lhr [info] Setting up swapspace version 1, size = 536866816 bytes
2022-10-28T08:31:06.361 app[cac6d0e4] lhr [info] UUID=24650e12-7230-41ba-9e2a-695e051f8f39
2022-10-28T08:31:06.366 app[cac6d0e4] lhr [info] Preparing to run: `/bin/sh /opt/app/` as nobody
2022-10-28T08:31:06.374 app[cac6d0e4] lhr [info] 2022/10/28 08:31:06 listening on [fdaa:0:3610:a7b:bbfa:cac6:d0e4:2]:22 (DNS: [fdaa::3]:53)
2022-10-28T08:31:06.374 app[cac6d0e4] lhr [info] Running DB migration release task 💈
2022-10-28T08:31:08.702 app[cac6d0e4] lhr [info] 08:31:08.698 [info] migrating repos for <my app>
2022-10-28T08:31:09.137 app[cac6d0e4] lhr [info] 08:31:09.136 [info] Migrations already up
2022-10-28T08:31:09.372 app[cac6d0e4] lhr [info] Starting clean up.
2022-10-28T08:39:58.901 app[d448e2fe] lhr [info] 08:39:58.900 request_id=FyIu6v4J3CLEm3UAajXx [info] POST /graphql 

Any ideas what’s happening here and how to fix it?

1 Like

Getting the same. Also deploying from a docker image. Hangs after “Starting clean up”, which is immediately after running my release command successfully. Was deploying fine yesterday. This is a rails app.

In my case it eventually started working again after I tried the deploy a day later. But it was broken for at least 24 hours.

Thanks, I confirm that it also just started working again for me some time later (not sure how long, no more than a day).

UPDATE: well I jinx’ed myself. I happened to be running a deployment when I replied the lines above and it happened again.

I have the same problem, posted a question but got no responses. I’m starting to wonder about the reliability of fly wrt to quality of service and support. Currently, the problem is the deployment just stuck forever.