I’m having a hard time in prod right now since the GH Action fly integration fails quite frequently, making my deploy pipeline pretty flaky and hard to trust.
It always fail with the same reason:
Running bucks-api-prod release_command: sh scripts/release.sh
> Created release_command machine 6e82dd3da76658
> Waiting for 6e82dd3da76658 to have state: started
✖ Failed: error waiting for release_command machine 6e82dd3da76658 to start: timed out waiting for machine to reach started state: failed to wait for VM 6e82dd3da76658 in started state: Get "https://api.machines.dev/v1/apps/bucks-api-prod/machines/6e82dd3da76658/wait?instance_id=01HDA37Q5PH1XY3QSSA81SZ3JZ&state=started&timeout=60": net/http: request canceled
Error: release command failed - aborting deployment. error waiting for release_command machine 6e82dd3da76658 to start: timed out waiting for machine to reach started state: failed to wait for VM 6e82dd3da76658 in started state: Get "https://api.machines.dev/v1/apps/bucks-api-prod/machines/6e82dd3da76658/wait?instance_id=01HDA37Q5PH1XY3QSSA81SZ3JZ&state=started&timeout=60": net/http: request canceled
You can increase the timeout with the --wait-timeout flag
At first, I thought the issue could be related to the timeout as the error suggests. I tried bumping it with:
flyctl deploy --wait-timeout 120 --app bucks-api-prod --remote-only
However, this didn’t fix the problem. Note I can regularly deploy locally using fly deploy --app bucks-api-prod
. From the logs, it seems Fly doesn’t always detect my release machine finished the job. But when inspecting the instance logs, everything looks fine.
Metadata
Fly.toml config
This is how the prod machine is configured:
app = "bucks-api-prod"
primary_region = "gru"
kill_signal = "SIGTERM"
kill_timeout = 5
[deploy]
release_command = "sh scripts/release.sh"
[http_service]
internal_port = 8080
force_https = true
auto_stop_machines = true
auto_start_machines = true
min_machines_running = 2
processes = ["app"]
[[http_service.checks]]
grace_period = "10s"
interval = "60s"
timeout = "5s"
method = "GET"
path = "/health"
[build.args]
PORT = "8080"
[env]
PORT = "8080"
TZ = "utc"
Github Action logs
2023-10-21T21:43:47.1765913Z eflyctl deploy --wait-timeout 120 --app bucks-api-prod --remote-only
2023-10-21T21:43:47.2780960Z ==> Verifying app config
2023-10-21T21:43:47.4173933Z --> Verified app config
2023-10-21T21:43:47.4174589Z Validating /home/runner/work/moni/moni/fly.toml
2023-10-21T21:43:47.4175210Z Platform: machines
2023-10-21T21:43:47.4294355Z e[32m✓e[0m Configuration is valid
2023-10-21T21:43:47.4959752Z ==> Building image
2023-10-21T21:43:48.5268267Z Waiting for remote builder fly-builder-young-frost-9099...
2023-10-21T21:43:54.4651756Z Remote builder fly-builder-young-frost-9099 ready
2023-10-21T21:43:54.8048812Z ==> Building image with Docker
2023-10-21T21:43:54.8049841Z --> docker host: 20.10.12 linux x86_64
2023-10-21T21:43:55.0416204Z #1 [internal] load build definition from Dockerfile
2023-10-21T21:43:55.0416855Z #1 transferring dockerfile: 1.22kB 0.0s done
2023-10-21T21:43:55.0417319Z #1 DONE 0.0s
2023-10-21T21:43:55.0417503Z
2023-10-21T21:43:55.0417656Z #2 [internal] load .dockerignore
2023-10-21T21:43:55.0418084Z #2 transferring context: 92B 0.0s done
2023-10-21T21:43:55.0418501Z #2 DONE 0.0s
2023-10-21T21:43:55.0418671Z
2023-10-21T21:43:55.0419193Z #3 [internal] load metadata for docker.io/library/node:16.19-alpine
2023-10-21T21:43:55.1917666Z #3 DONE 0.2s
2023-10-21T21:43:55.1918011Z
2023-10-21T21:43:55.1919090Z #4 [internal] load metadata for docker.io/library/node:18.16-alpine
2023-10-21T21:43:55.1920182Z #4 DONE 0.1s
2023-10-21T21:43:55.1920608Z
2023-10-21T21:43:55.1921605Z #5 [builder 1/6] FROM docker.io/library/node:18.16-alpine@sha256:d5b2a7869a4016b1847986ea52098fa404421e44281bb7615a9e3615e07f37fb
2023-10-21T21:43:55.1922656Z #5 DONE 0.0s
2023-10-21T21:43:55.1922845Z
2023-10-21T21:43:55.1923661Z #6 [stage-1 1/8] FROM docker.io/library/node:16.19-alpine@sha256:90f5adf9a338b2d88b2d35488e6e5ce9b1fb6be211b57c8c09218791ac4097d4
2023-10-21T21:43:55.1924720Z #6 DONE 0.0s
2023-10-21T21:43:55.1924905Z
2023-10-21T21:43:55.1925106Z #7 [internal] load build context
2023-10-21T21:43:55.4734518Z #7 transferring context: 1.94MB 0.3s done
2023-10-21T21:43:55.4735038Z #7 DONE 0.3s
2023-10-21T21:43:55.4735230Z
2023-10-21T21:43:55.4735463Z #8 [builder 3/6] COPY package.json yarn.lock ./
2023-10-21T21:43:55.4735942Z #8 CACHED
2023-10-21T21:43:55.4736119Z
2023-10-21T21:43:55.4736265Z #9 [builder 2/6] WORKDIR /app
2023-10-21T21:43:55.4736637Z #9 CACHED
2023-10-21T21:43:55.4736796Z
2023-10-21T21:43:55.4737239Z #10 [builder 4/6] RUN yarn install --frozen-lockfile
2023-10-21T21:43:55.4738120Z #10 CACHED
2023-10-21T21:43:55.4738287Z
2023-10-21T21:43:55.4738428Z #11 [builder 5/6] COPY . .
2023-10-21T21:43:55.6241258Z #11 DONE 0.1s
2023-10-21T21:43:55.6241532Z
2023-10-21T21:43:55.6241728Z #12 [builder 6/6] RUN yarn run build:ci
# Omitted for brevity
2023-10-21T21:43:57.0414177Z #12 DONE 1.5s
2023-10-21T21:43:59.2743325Z
2023-10-21T21:43:59.2746692Z #13 [stage-1 2/8] WORKDIR /app
2023-10-21T21:43:59.2747139Z #13 CACHED
2023-10-21T21:43:59.2747316Z
2023-10-21T21:43:59.2747747Z #14 [stage-1 3/8] COPY --from=builder /app/package.json /app/yarn.lock ./
2023-10-21T21:43:59.2748363Z #14 CACHED
2023-10-21T21:43:59.4254314Z
2023-10-21T21:43:59.4255065Z #15 [stage-1 4/8] COPY --from=builder /app/dist ./dist
2023-10-21T21:43:59.4255643Z #15 DONE 0.0s
2023-10-21T21:44:37.7655695Z
2023-10-21T21:44:37.7656122Z #16 exporting to image
2023-10-21T21:44:37.7656551Z #16 exporting layers
2023-10-21T21:44:49.4008917Z e[38;5;252m--> Building image donee[0m
2023-10-21T21:44:49.4009467Z #16 exporting layers 11.6s done
2023-10-21T21:44:49.4054054Z ==> Pushing image to fly
2023-10-21T21:44:49.4054838Z #16 writing image sha256:6d029c318aa76413225d2f69f8ca8df0fcaaaaeba085e16916abd9b85ba790b6 done
2023-10-21T21:44:49.4056149Z #16 naming to registry.fly.io/bucks-api-prod:deployment-01HDA33P3AZVH964ZM643WAJ57 done
2023-10-21T21:44:49.4056893Z #16 DONE 11.6s
2023-10-21T21:44:49.4141287Z The push refers to repository [registry.fly.io/bucks-api-prod]
2023-10-21T21:44:49.4250868Z 82429d0a0c74: Preparing
2023-10-21T21:44:49.4251305Z 5202fc4e015e: Preparing
# Omitted for brevity
2023-10-21T21:44:53.3578748Z ded43fbf30f3: Pushed
2023-10-21T21:44:53.3875338Z 67246b9bbb93: Layer already exists
2023-10-21T21:44:53.4184558Z a1ce792246f9: Layer already exists
2023-10-21T21:44:53.4459850Z af1fa49a98d8: Layer already exists
2023-10-21T21:44:53.4684595Z 7cd52847ad77: Layer already exists
2023-10-21T21:44:53.7165027Z 9ce06da3bb2f: Pushed
2023-10-21T21:44:53.9753787Z ef40145083cf: Pushed
2023-10-21T21:46:03.9830271Z 82429d0a0c74: Pushed
2023-10-21T21:46:05.5486213Z deployment-01HDA33P3AZVH964ZM643WAJ57: digest: sha256:a60f25eec00d5143bfc8017541f02a2103eaeaf17c78a435458a1155450b925e size: 2620
2023-10-21T21:46:05.5491902Z --> Pushing image done
2023-10-21T21:46:05.6303074Z image: registry.fly.io/bucks-api-prod:deployment-01HDA33P3AZVH964ZM643WAJ57
2023-10-21T21:46:05.6303940Z image size: 956 MB
2023-10-21T21:46:05.6304161Z
2023-10-21T21:46:05.6304663Z Watch your deployment at https://fly.io/apps/bucks-api-prod/monitoring
2023-10-21T21:46:05.6305171Z
2023-10-21T21:46:06.4368938Z Running bucks-api-prod release_command: sh scripts/release.sh
2023-10-21T21:46:07.1934017Z > Created release_command machine 6e82dd3da76658
2023-10-21T21:46:07.1934659Z > Waiting for 6e82dd3da76658 to have state: started
2023-10-21T21:48:07.2029149Z ✖ Failed: error waiting for release_command machine 6e82dd3da76658 to start: timed out waiting for machine to reach started state: failed to wait for VM 6e82dd3da76658 in started state: Get "https://api.machines.dev/v1/apps/bucks-api-prod/machines/6e82dd3da76658/wait?instance_id=01HDA37Q5PH1XY3QSSA81SZ3JZ&state=started&timeout=60": net/http: request canceled
2023-10-21T21:48:07.3965573Z Error: release command failed - aborting deployment. error waiting for release_command machine 6e82dd3da76658 to start: timed out waiting for machine to reach started state: failed to wait for VM 6e82dd3da76658 in started state: Get "https://api.machines.dev/v1/apps/bucks-api-prod/machines/6e82dd3da76658/wait?instance_id=01HDA37Q5PH1XY3QSSA81SZ3JZ&state=started&timeout=60": net/http: request canceled
2023-10-21T21:48:07.3970943Z You can increase the timeout with the --wait-timeout flag
Local deployment logs
fly deploy --app bucks-api-prod
==> Verifying app config
Validating /Users/marceloprado/src/moni/fly.toml
Platform: machines
✓ Configuration is valid
--> Verified app config
==> Building image
Remote builder fly-builder-young-frost-9099 ready
==> Building image with Docker
--> docker host: 20.10.12 linux x86_64
[+] Building 109.9s (20/20) FINISHED
=> [internal] load build definition from Dockerfile 1.2s
=> => transferring dockerfile: 1.26kB 1.2s
=> [internal] load .dockerignore 1.2s
=> => transferring context: 92B 1.2s
=> [internal] load metadata for docker.io/library/node:16.19-alpine 0.2s
=> [internal] load metadata for docker.io/library/node:18.16-alpine 0.3s
=> [builder 1/6] FROM docker.io/library/node:18.16-alpine@sha256:d5b2a7869a4016b1847986ea52098fa404421e44281bb7615a9e3615e07f37fb 0.0s
=> [internal] load build context 8.5s
=> => transferring context: 3.36MB 8.5s
=> [stage-1 1/8] FROM docker.io/library/node:16.19-alpine@sha256:90f5adf9a338b2d88b2d35488e6e5ce9b1fb6be211b57c8c09218791ac4097d4 0.0s
=> CACHED [builder 2/6] WORKDIR /app 0.0s
=> [builder 3/6] COPY package.json yarn.lock ./ 0.0s
=> [builder 4/6] RUN yarn install --frozen-lockfile 47.5s
=> [builder 5/6] COPY . . 0.1s
=> [builder 6/6] RUN yarn run build:ci 1.0s
=> CACHED [stage-1 2/8] WORKDIR /app 0.0s
=> CACHED [stage-1 3/8] COPY --from=builder /app/package.json /app/yarn.lock ./ 0.0s
=> [stage-1 4/8] COPY --from=builder /app/dist ./dist 0.0s
=> [stage-1 5/8] COPY --from=builder /app/scripts/release.sh ./scripts/release.sh 0.0s
=> [stage-1 6/8] COPY --from=builder /app/scripts/run-seed.js ./scripts/run-seed.js 0.0s
=> [stage-1 7/8] RUN mv ./dist/prisma ./prisma 0.3s
=> [stage-1 8/8] RUN yarn install --production --frozen-lockfile 37.1s
=> exporting to image 12.2s
=> => exporting layers 12.2s
=> => writing image sha256:6c8539df974aeee0f966855f70b5ac2f763efbb18feff31665e822f702ce5030 0.0s
=> => naming to registry.fly.io/bucks-api-prod:deployment-01HDA532JQS4SJVKQVPVH83WSZ 0.0s
--> Building image done
==> Pushing image to fly
The push refers to repository [registry.fly.io/bucks-api-prod]
fccac35f211d: Pushed
803be405a684: Pushed
2be4db416b36: Pushed
706b6d3b79c9: Pushed
2a41f9c95ed2: Pushed
ef40145083cf: Layer already exists
9ce06da3bb2f: Layer already exists
67246b9bbb93: Layer already exists
a1ce792246f9: Layer already exists
af1fa49a98d8: Layer already exists
7cd52847ad77: Layer already exists
deployment-01HDA532JQS4SJVKQVPVH83WSZ: digest: sha256:3306355c624469b44ec59b889cc729c57cb6fb4ebb1d66ac1bdc7e125acaaee9 size: 2620
--> Pushing image done
image: registry.fly.io/bucks-api-prod:deployment-01HDA532JQS4SJVKQVPVH83WSZ
image size: 957 MB
Watch your deployment at https://fly.io/apps/bucks-api-prod/monitoring
Running bucks-api-prod release_command: sh scripts/release.sh
-------
✔ release_command 3d8d9e1df29718 completed successfully
-------
-------
Updating existing machines in 'bucks-api-prod' with rolling strategy
Finished deploying
-------
✔ [1/2] Machine 148ed314f360d8 [app] update finished: success
✔ [2/2] Machine e2865ed3c77938 [app] update finished: success
-------
Visit your newly deployed app at https://bucks-api-prod.fly.dev/
Release machine logs
The logs below show that my release script correctly finishes. This is the place where I run my migrations and seed the DB:
2023-10-21T21:46:06Z runner[6e82dd3da76658] gru [info]Pulling container image registry.fly.io/bucks-api-prod:deployment-01HDA33P3AZVH964ZM643WAJ57
2023-10-21T21:47:58Z runner[6e82dd3da76658] gru [info]Successfully prepared image registry.fly.io/bucks-api-prod:deployment-01HDA33P3AZVH964ZM643WAJ57 (1m51.081722581s)
2023-10-21T21:48:05Z app[6e82dd3da76658] gru [info][ 0.035436] PCI: Fatal: No config space access function found
2023-10-21T21:48:06Z app[6e82dd3da76658] gru [info] INFO Starting init (commit: 15238e9)...
2023-10-21T21:48:06Z app[6e82dd3da76658] gru [info] INFO Preparing to run: `docker-entrypoint.sh sh scripts/release.sh` as root
2023-10-21T21:48:06Z app[6e82dd3da76658] gru [info] INFO [fly api proxy] listening at /.fly/api
2023-10-21T21:48:06Z app[6e82dd3da76658] gru [info]2023/10/21 21:48:06 listening on [fdaa:2:a5d3:a7b:e3:1ec0:129f:2]:22 (DNS: [fdaa::3]:53)
2023-10-21T21:48:06Z app[6e82dd3da76658] gru [info]🔄 Migrating database...
2023-10-21T21:48:08Z app[6e82dd3da76658] gru [info]Prisma schema loaded from prisma/schema.prisma
2023-10-21T21:48:08Z app[6e82dd3da76658] gru [info]Datasource "db": PostgreSQL database "postgres", schema "public" at "db.illgujtkngnoedumgiuo.supabase.co:5432"
2023-10-21T21:48:09Z app[6e82dd3da76658] gru [info]7 migrations found in prisma/migrations
2023-10-21T21:48:09Z app[6e82dd3da76658] gru [info]Applying migration `20231021125236_add_date_to_transaction_index`
2023-10-21T21:48:09Z app[6e82dd3da76658] gru [info]The following migration have been applied:
2023-10-21T21:48:09Z app[6e82dd3da76658] gru [info]migrations/
2023-10-21T21:48:09Z app[6e82dd3da76658] gru [info] └─ 20231021125236_add_date_to_transaction_index/
2023-10-21T21:48:09Z app[6e82dd3da76658] gru [info] └─ migration.sql
2023-10-21T21:48:09Z app[6e82dd3da76658] gru [info]All migrations have been successfully applied.
2023-10-21T21:48:09Z app[6e82dd3da76658] gru [info]🌱 Seeding database...
2023-10-21T21:48:12Z app[6e82dd3da76658] gru [info]Running seed command `node scripts/run-seed.js` ...
2023-10-21T21:48:12Z app[6e82dd3da76658] gru [info]Running seed script at /app/prisma/seed.js
2023-10-21T21:48:12Z app[6e82dd3da76658] gru [info]Loading .env.prod file...
2023-10-21T21:48:12Z app[6e82dd3da76658] gru [info].env.prod file not found, aborting dotenv loader
2023-10-21T21:48:12Z app[6e82dd3da76658] gru [info]Processing migration file: user_triggers.sql
2023-10-21T21:48:12Z app[6e82dd3da76658] gru [info]Migration completed successfully.
2023-10-21T21:48:12Z app[6e82dd3da76658] gru [info]Processing migration file: supabase_roles.sql
2023-10-21T21:48:12Z app[6e82dd3da76658] gru [info]Migration completed successfully.
2023-10-21T21:48:13Z app[6e82dd3da76658] gru [info]🌱 The seed command has been executed.
2023-10-21T21:48:13Z app[6e82dd3da76658] gru [info]┌─────────────────────────────────────────────────────────┐
2023-10-21T21:48:13Z app[6e82dd3da76658] gru [info]│ Update available 5.4.1 -> 5.4.2 │
2023-10-21T21:48:13Z app[6e82dd3da76658] gru [info]│ Run the following to update │
2023-10-21T21:48:13Z app[6e82dd3da76658] gru [info]│ npm i --save-dev prisma@latest │
2023-10-21T21:48:13Z app[6e82dd3da76658] gru [info]│ npm i @prisma/client@latest │
2023-10-21T21:48:13Z app[6e82dd3da76658] gru [info]└─────────────────────────────────────────────────────────┘
2023-10-21T21:48:14Z app[6e82dd3da76658] gru [info] INFO Main child exited normally with code: 0
2023-10-21T21:48:14Z app[6e82dd3da76658] gru [info] INFO Starting clean up.
2023-10-21T21:48:14Z app[6e82dd3da76658] gru [info] WARN hallpass exited, pid: 315, status: signal: 15 (SIGTERM)
2023-10-21T21:48:14Z app[6e82dd3da76658] gru [info]2023/10/21 21:48:14 listening on [fdaa:2:a5d3:a7b:e3:1ec0:129f:2]:22 (DNS: [fdaa::3]:53)
2023-10-21T21:48:15Z app[6e82dd3da76658] gru [info][ 9.319495] reboot: Restarting system
2023-10-21T21:48:15Z runner[6e82dd3da76658] gru [info]machine restart policy set to 'no', not restarting