My deploys are flaky when using GH Action

I’m having a hard time in prod right now since the GH Action fly integration fails quite frequently, making my deploy pipeline pretty flaky and hard to trust.

It always fail with the same reason:

Running bucks-api-prod release_command: sh scripts/release.sh
> Created release_command machine 6e82dd3da76658
> Waiting for 6e82dd3da76658 to have state: started
✖ Failed: error waiting for release_command machine 6e82dd3da76658 to start: timed out waiting for machine to reach started state: failed to wait for VM 6e82dd3da76658 in started state: Get "https://api.machines.dev/v1/apps/bucks-api-prod/machines/6e82dd3da76658/wait?instance_id=01HDA37Q5PH1XY3QSSA81SZ3JZ&state=started&timeout=60": net/http: request canceled
Error: release command failed - aborting deployment. error waiting for release_command machine 6e82dd3da76658 to start: timed out waiting for machine to reach started state: failed to wait for VM 6e82dd3da76658 in started state: Get "https://api.machines.dev/v1/apps/bucks-api-prod/machines/6e82dd3da76658/wait?instance_id=01HDA37Q5PH1XY3QSSA81SZ3JZ&state=started&timeout=60": net/http: request canceled
You can increase the timeout with the --wait-timeout flag

At first, I thought the issue could be related to the timeout as the error suggests. I tried bumping it with:

flyctl deploy --wait-timeout 120 --app bucks-api-prod --remote-only

However, this didn’t fix the problem. Note I can regularly deploy locally using fly deploy --app bucks-api-prod. From the logs, it seems Fly doesn’t always detect my release machine finished the job. But when inspecting the instance logs, everything looks fine.

Metadata

Fly.toml config

This is how the prod machine is configured:

app = "bucks-api-prod"
primary_region = "gru"
kill_signal = "SIGTERM"
kill_timeout = 5

[deploy]
  release_command = "sh scripts/release.sh"

[http_service]
  internal_port = 8080
  force_https = true
  auto_stop_machines = true
  auto_start_machines = true
  min_machines_running = 2
  processes = ["app"]

[[http_service.checks]]
  grace_period = "10s"
  interval = "60s"
  timeout = "5s"
  method = "GET"
  path = "/health"

[build.args]
  PORT = "8080"

[env]
  PORT = "8080"
  TZ = "utc"

Github Action logs

2023-10-21T21:43:47.1765913Z eflyctl deploy --wait-timeout 120 --app bucks-api-prod --remote-only
2023-10-21T21:43:47.2780960Z ==> Verifying app config
2023-10-21T21:43:47.4173933Z --> Verified app config
2023-10-21T21:43:47.4174589Z Validating /home/runner/work/moni/moni/fly.toml
2023-10-21T21:43:47.4175210Z Platform: machines
2023-10-21T21:43:47.4294355Z e[32m✓e[0m Configuration is valid
2023-10-21T21:43:47.4959752Z ==> Building image
2023-10-21T21:43:48.5268267Z Waiting for remote builder fly-builder-young-frost-9099...
2023-10-21T21:43:54.4651756Z Remote builder fly-builder-young-frost-9099 ready
2023-10-21T21:43:54.8048812Z ==> Building image with Docker
2023-10-21T21:43:54.8049841Z --> docker host: 20.10.12 linux x86_64
2023-10-21T21:43:55.0416204Z #1 [internal] load build definition from Dockerfile
2023-10-21T21:43:55.0416855Z #1 transferring dockerfile: 1.22kB 0.0s done
2023-10-21T21:43:55.0417319Z #1 DONE 0.0s
2023-10-21T21:43:55.0417503Z 
2023-10-21T21:43:55.0417656Z #2 [internal] load .dockerignore
2023-10-21T21:43:55.0418084Z #2 transferring context: 92B 0.0s done
2023-10-21T21:43:55.0418501Z #2 DONE 0.0s
2023-10-21T21:43:55.0418671Z 
2023-10-21T21:43:55.0419193Z #3 [internal] load metadata for docker.io/library/node:16.19-alpine
2023-10-21T21:43:55.1917666Z #3 DONE 0.2s
2023-10-21T21:43:55.1918011Z 
2023-10-21T21:43:55.1919090Z #4 [internal] load metadata for docker.io/library/node:18.16-alpine
2023-10-21T21:43:55.1920182Z #4 DONE 0.1s
2023-10-21T21:43:55.1920608Z 
2023-10-21T21:43:55.1921605Z #5 [builder 1/6] FROM docker.io/library/node:18.16-alpine@sha256:d5b2a7869a4016b1847986ea52098fa404421e44281bb7615a9e3615e07f37fb
2023-10-21T21:43:55.1922656Z #5 DONE 0.0s
2023-10-21T21:43:55.1922845Z 
2023-10-21T21:43:55.1923661Z #6 [stage-1 1/8] FROM docker.io/library/node:16.19-alpine@sha256:90f5adf9a338b2d88b2d35488e6e5ce9b1fb6be211b57c8c09218791ac4097d4
2023-10-21T21:43:55.1924720Z #6 DONE 0.0s
2023-10-21T21:43:55.1924905Z 
2023-10-21T21:43:55.1925106Z #7 [internal] load build context
2023-10-21T21:43:55.4734518Z #7 transferring context: 1.94MB 0.3s done
2023-10-21T21:43:55.4735038Z #7 DONE 0.3s
2023-10-21T21:43:55.4735230Z 
2023-10-21T21:43:55.4735463Z #8 [builder 3/6] COPY package.json yarn.lock ./
2023-10-21T21:43:55.4735942Z #8 CACHED
2023-10-21T21:43:55.4736119Z 
2023-10-21T21:43:55.4736265Z #9 [builder 2/6] WORKDIR /app
2023-10-21T21:43:55.4736637Z #9 CACHED
2023-10-21T21:43:55.4736796Z 
2023-10-21T21:43:55.4737239Z #10 [builder 4/6] RUN yarn install --frozen-lockfile
2023-10-21T21:43:55.4738120Z #10 CACHED
2023-10-21T21:43:55.4738287Z 
2023-10-21T21:43:55.4738428Z #11 [builder 5/6] COPY . .
2023-10-21T21:43:55.6241258Z #11 DONE 0.1s
2023-10-21T21:43:55.6241532Z 
2023-10-21T21:43:55.6241728Z #12 [builder 6/6] RUN yarn run build:ci
# Omitted for brevity
2023-10-21T21:43:57.0414177Z #12 DONE 1.5s
2023-10-21T21:43:59.2743325Z 
2023-10-21T21:43:59.2746692Z #13 [stage-1 2/8] WORKDIR /app
2023-10-21T21:43:59.2747139Z #13 CACHED
2023-10-21T21:43:59.2747316Z 
2023-10-21T21:43:59.2747747Z #14 [stage-1 3/8] COPY --from=builder /app/package.json /app/yarn.lock ./
2023-10-21T21:43:59.2748363Z #14 CACHED
2023-10-21T21:43:59.4254314Z 
2023-10-21T21:43:59.4255065Z #15 [stage-1 4/8] COPY --from=builder /app/dist ./dist
2023-10-21T21:43:59.4255643Z #15 DONE 0.0s
2023-10-21T21:44:37.7655695Z 
2023-10-21T21:44:37.7656122Z #16 exporting to image
2023-10-21T21:44:37.7656551Z #16 exporting layers
2023-10-21T21:44:49.4008917Z e[38;5;252m--> Building image donee[0m
2023-10-21T21:44:49.4009467Z #16 exporting layers 11.6s done
2023-10-21T21:44:49.4054054Z ==> Pushing image to fly
2023-10-21T21:44:49.4054838Z #16 writing image sha256:6d029c318aa76413225d2f69f8ca8df0fcaaaaeba085e16916abd9b85ba790b6 done
2023-10-21T21:44:49.4056149Z #16 naming to registry.fly.io/bucks-api-prod:deployment-01HDA33P3AZVH964ZM643WAJ57 done
2023-10-21T21:44:49.4056893Z #16 DONE 11.6s
2023-10-21T21:44:49.4141287Z The push refers to repository [registry.fly.io/bucks-api-prod]
2023-10-21T21:44:49.4250868Z 82429d0a0c74: Preparing
2023-10-21T21:44:49.4251305Z 5202fc4e015e: Preparing
# Omitted for brevity
2023-10-21T21:44:53.3578748Z ded43fbf30f3: Pushed
2023-10-21T21:44:53.3875338Z 67246b9bbb93: Layer already exists
2023-10-21T21:44:53.4184558Z a1ce792246f9: Layer already exists
2023-10-21T21:44:53.4459850Z af1fa49a98d8: Layer already exists
2023-10-21T21:44:53.4684595Z 7cd52847ad77: Layer already exists
2023-10-21T21:44:53.7165027Z 9ce06da3bb2f: Pushed
2023-10-21T21:44:53.9753787Z ef40145083cf: Pushed
2023-10-21T21:46:03.9830271Z 82429d0a0c74: Pushed
2023-10-21T21:46:05.5486213Z deployment-01HDA33P3AZVH964ZM643WAJ57: digest: sha256:a60f25eec00d5143bfc8017541f02a2103eaeaf17c78a435458a1155450b925e size: 2620
2023-10-21T21:46:05.5491902Z --> Pushing image done
2023-10-21T21:46:05.6303074Z image: registry.fly.io/bucks-api-prod:deployment-01HDA33P3AZVH964ZM643WAJ57
2023-10-21T21:46:05.6303940Z image size: 956 MB
2023-10-21T21:46:05.6304161Z 
2023-10-21T21:46:05.6304663Z Watch your deployment at https://fly.io/apps/bucks-api-prod/monitoring
2023-10-21T21:46:05.6305171Z 
2023-10-21T21:46:06.4368938Z Running bucks-api-prod release_command: sh scripts/release.sh
2023-10-21T21:46:07.1934017Z > Created release_command machine 6e82dd3da76658
2023-10-21T21:46:07.1934659Z > Waiting for 6e82dd3da76658 to have state: started
2023-10-21T21:48:07.2029149Z ✖ Failed: error waiting for release_command machine 6e82dd3da76658 to start: timed out waiting for machine to reach started state: failed to wait for VM 6e82dd3da76658 in started state: Get "https://api.machines.dev/v1/apps/bucks-api-prod/machines/6e82dd3da76658/wait?instance_id=01HDA37Q5PH1XY3QSSA81SZ3JZ&state=started&timeout=60": net/http: request canceled
2023-10-21T21:48:07.3965573Z Error: release command failed - aborting deployment. error waiting for release_command machine 6e82dd3da76658 to start: timed out waiting for machine to reach started state: failed to wait for VM 6e82dd3da76658 in started state: Get "https://api.machines.dev/v1/apps/bucks-api-prod/machines/6e82dd3da76658/wait?instance_id=01HDA37Q5PH1XY3QSSA81SZ3JZ&state=started&timeout=60": net/http: request canceled
2023-10-21T21:48:07.3970943Z You can increase the timeout with the --wait-timeout flag

Local deployment logs

 fly deploy --app bucks-api-prod                
==> Verifying app config
Validating /Users/marceloprado/src/moni/fly.toml
Platform: machines
✓ Configuration is valid
--> Verified app config
==> Building image
Remote builder fly-builder-young-frost-9099 ready
==> Building image with Docker
--> docker host: 20.10.12 linux x86_64
[+] Building 109.9s (20/20) FINISHED                                                                                                                                                                                               
 => [internal] load build definition from Dockerfile                                                                                                                                                                          1.2s
 => => transferring dockerfile: 1.26kB                                                                                                                                                                                        1.2s
 => [internal] load .dockerignore                                                                                                                                                                                             1.2s
 => => transferring context: 92B                                                                                                                                                                                              1.2s
 => [internal] load metadata for docker.io/library/node:16.19-alpine                                                                                                                                                          0.2s
 => [internal] load metadata for docker.io/library/node:18.16-alpine                                                                                                                                                          0.3s
 => [builder 1/6] FROM docker.io/library/node:18.16-alpine@sha256:d5b2a7869a4016b1847986ea52098fa404421e44281bb7615a9e3615e07f37fb                                                                                            0.0s
 => [internal] load build context                                                                                                                                                                                             8.5s
 => => transferring context: 3.36MB                                                                                                                                                                                           8.5s
 => [stage-1 1/8] FROM docker.io/library/node:16.19-alpine@sha256:90f5adf9a338b2d88b2d35488e6e5ce9b1fb6be211b57c8c09218791ac4097d4                                                                                            0.0s
 => CACHED [builder 2/6] WORKDIR /app                                                                                                                                                                                         0.0s
 => [builder 3/6] COPY package.json yarn.lock ./                                                                                                                                                                              0.0s
 => [builder 4/6] RUN yarn install --frozen-lockfile                                                                                                                                                                         47.5s
 => [builder 5/6] COPY . .                                                                                                                                                                                                    0.1s
 => [builder 6/6] RUN yarn run build:ci                                                                                                                                                                                       1.0s 
 => CACHED [stage-1 2/8] WORKDIR /app                                                                                                                                                                                         0.0s 
 => CACHED [stage-1 3/8] COPY --from=builder /app/package.json /app/yarn.lock ./                                                                                                                                              0.0s 
 => [stage-1 4/8] COPY --from=builder /app/dist ./dist                                                                                                                                                                        0.0s 
 => [stage-1 5/8] COPY --from=builder /app/scripts/release.sh ./scripts/release.sh                                                                                                                                            0.0s 
 => [stage-1 6/8] COPY --from=builder /app/scripts/run-seed.js ./scripts/run-seed.js                                                                                                                                          0.0s 
 => [stage-1 7/8] RUN mv ./dist/prisma ./prisma                                                                                                                                                                               0.3s 
 => [stage-1 8/8] RUN yarn install --production --frozen-lockfile                                                                                                                                                            37.1s
 => exporting to image                                                                                                                                                                                                       12.2s 
 => => exporting layers                                                                                                                                                                                                      12.2s 
 => => writing image sha256:6c8539df974aeee0f966855f70b5ac2f763efbb18feff31665e822f702ce5030                                                                                                                                  0.0s 
 => => naming to registry.fly.io/bucks-api-prod:deployment-01HDA532JQS4SJVKQVPVH83WSZ                                                                                                                                         0.0s 
--> Building image done                                                                                                                                                                                                            
==> Pushing image to fly                                                                                                                                                                                                           
The push refers to repository [registry.fly.io/bucks-api-prod]
fccac35f211d: Pushed 
803be405a684: Pushed 
2be4db416b36: Pushed 
706b6d3b79c9: Pushed 
2a41f9c95ed2: Pushed 
ef40145083cf: Layer already exists 
9ce06da3bb2f: Layer already exists 
67246b9bbb93: Layer already exists 
a1ce792246f9: Layer already exists 
af1fa49a98d8: Layer already exists 
7cd52847ad77: Layer already exists 
deployment-01HDA532JQS4SJVKQVPVH83WSZ: digest: sha256:3306355c624469b44ec59b889cc729c57cb6fb4ebb1d66ac1bdc7e125acaaee9 size: 2620
--> Pushing image done
image: registry.fly.io/bucks-api-prod:deployment-01HDA532JQS4SJVKQVPVH83WSZ
image size: 957 MB

Watch your deployment at https://fly.io/apps/bucks-api-prod/monitoring

Running bucks-api-prod release_command: sh scripts/release.sh

-------
 ✔ release_command 3d8d9e1df29718 completed successfully
-------
-------
Updating existing machines in 'bucks-api-prod' with rolling strategy
  Finished deploying
-------
 ✔ [1/2] Machine 148ed314f360d8 [app] update finished: success
 ✔ [2/2] Machine e2865ed3c77938 [app] update finished: success
-------

Visit your newly deployed app at https://bucks-api-prod.fly.dev/

Release machine logs

The logs below show that my release script correctly finishes. This is the place where I run my migrations and seed the DB:

2023-10-21T21:46:06Z runner[6e82dd3da76658] gru [info]Pulling container image registry.fly.io/bucks-api-prod:deployment-01HDA33P3AZVH964ZM643WAJ57
2023-10-21T21:47:58Z runner[6e82dd3da76658] gru [info]Successfully prepared image registry.fly.io/bucks-api-prod:deployment-01HDA33P3AZVH964ZM643WAJ57 (1m51.081722581s)
2023-10-21T21:48:05Z app[6e82dd3da76658] gru [info][    0.035436] PCI: Fatal: No config space access function found
2023-10-21T21:48:06Z app[6e82dd3da76658] gru [info] INFO Starting init (commit: 15238e9)...
2023-10-21T21:48:06Z app[6e82dd3da76658] gru [info] INFO Preparing to run: `docker-entrypoint.sh sh scripts/release.sh` as root
2023-10-21T21:48:06Z app[6e82dd3da76658] gru [info] INFO [fly api proxy] listening at /.fly/api
2023-10-21T21:48:06Z app[6e82dd3da76658] gru [info]2023/10/21 21:48:06 listening on [fdaa:2:a5d3:a7b:e3:1ec0:129f:2]:22 (DNS: [fdaa::3]:53)
2023-10-21T21:48:06Z app[6e82dd3da76658] gru [info]🔄 Migrating database...
2023-10-21T21:48:08Z app[6e82dd3da76658] gru [info]Prisma schema loaded from prisma/schema.prisma
2023-10-21T21:48:08Z app[6e82dd3da76658] gru [info]Datasource "db": PostgreSQL database "postgres", schema "public" at "db.illgujtkngnoedumgiuo.supabase.co:5432"
2023-10-21T21:48:09Z app[6e82dd3da76658] gru [info]7 migrations found in prisma/migrations
2023-10-21T21:48:09Z app[6e82dd3da76658] gru [info]Applying migration `20231021125236_add_date_to_transaction_index`
2023-10-21T21:48:09Z app[6e82dd3da76658] gru [info]The following migration have been applied:
2023-10-21T21:48:09Z app[6e82dd3da76658] gru [info]migrations/
2023-10-21T21:48:09Z app[6e82dd3da76658] gru [info]  └─ 20231021125236_add_date_to_transaction_index/
2023-10-21T21:48:09Z app[6e82dd3da76658] gru [info]    └─ migration.sql
2023-10-21T21:48:09Z app[6e82dd3da76658] gru [info]All migrations have been successfully applied.
2023-10-21T21:48:09Z app[6e82dd3da76658] gru [info]🌱 Seeding database...
2023-10-21T21:48:12Z app[6e82dd3da76658] gru [info]Running seed command `node scripts/run-seed.js` ...
2023-10-21T21:48:12Z app[6e82dd3da76658] gru [info]Running seed script at /app/prisma/seed.js
2023-10-21T21:48:12Z app[6e82dd3da76658] gru [info]Loading .env.prod file...
2023-10-21T21:48:12Z app[6e82dd3da76658] gru [info].env.prod file not found, aborting dotenv loader
2023-10-21T21:48:12Z app[6e82dd3da76658] gru [info]Processing migration file: user_triggers.sql
2023-10-21T21:48:12Z app[6e82dd3da76658] gru [info]Migration completed successfully.
2023-10-21T21:48:12Z app[6e82dd3da76658] gru [info]Processing migration file: supabase_roles.sql
2023-10-21T21:48:12Z app[6e82dd3da76658] gru [info]Migration completed successfully.
2023-10-21T21:48:13Z app[6e82dd3da76658] gru [info]🌱  The seed command has been executed.
2023-10-21T21:48:13Z app[6e82dd3da76658] gru [info]┌─────────────────────────────────────────────────────────┐
2023-10-21T21:48:13Z app[6e82dd3da76658] gru [info]│  Update available 5.4.1 -> 5.4.2                        │
2023-10-21T21:48:13Z app[6e82dd3da76658] gru [info]│  Run the following to update                            │
2023-10-21T21:48:13Z app[6e82dd3da76658] gru [info]│    npm i --save-dev prisma@latest                       │
2023-10-21T21:48:13Z app[6e82dd3da76658] gru [info]│    npm i @prisma/client@latest                          │
2023-10-21T21:48:13Z app[6e82dd3da76658] gru [info]└─────────────────────────────────────────────────────────┘
2023-10-21T21:48:14Z app[6e82dd3da76658] gru [info] INFO Main child exited normally with code: 0
2023-10-21T21:48:14Z app[6e82dd3da76658] gru [info] INFO Starting clean up.
2023-10-21T21:48:14Z app[6e82dd3da76658] gru [info] WARN hallpass exited, pid: 315, status: signal: 15 (SIGTERM)
2023-10-21T21:48:14Z app[6e82dd3da76658] gru [info]2023/10/21 21:48:14 listening on [fdaa:2:a5d3:a7b:e3:1ec0:129f:2]:22 (DNS: [fdaa::3]:53)
2023-10-21T21:48:15Z app[6e82dd3da76658] gru [info][    9.319495] reboot: Restarting system
2023-10-21T21:48:15Z runner[6e82dd3da76658] gru [info]machine restart policy set to 'no', not restarting

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.