During Release "Error server returned a non-200 status code: 504"

Hello, I have a Fly app that has been failing consistently during deployment. For the first couple of days it deployed without issues, but for the past few days the deployment has not completed:

...
image size: 1.8 GB
==> Creating release
Error server returned a non-200 status code: 504

When this happened before, I had to delete the app from Fly entirely and re-create it. However, the 504 has come back with the new app.

That means our API is timing out, possibly because 1.8 GB is pretty large for a Docker image.

If you run fly status, you may see the new release proceeding in the background.
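One way to check whether a release went through in the background is to compare the Version field of fly status before and after the deploy attempt. This is only a rough sketch: it assumes the output contains a line of the form "Version = <number>", and the helper names parse_status_version and release_advanced are made up for illustration. The exact output layout may differ between flyctl versions.

```python
import re


def parse_status_version(status_text: str):
    """Return the integer from a 'Version = N' line, or None if absent.

    Assumes the status output contains a line such as '  Version  = 17'.
    """
    match = re.search(r"^\s*Version\s*=\s*(\d+)\s*$", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def release_advanced(before: str, after: str) -> bool:
    """True if the version in `after` is higher than the version in `before`."""
    v_before = parse_status_version(before)
    v_after = parse_status_version(after)
    if v_before is None or v_after is None:
        return False
    return v_after > v_before
```

If the version is unchanged after the failed deploy, the release most likely never started.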

Also, if you run LOG_LEVEL=debug fly deploy, you’ll be able to see the exact API call that’s timing out.
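If the debug output is long, a small script can pick out the failing call. The sketch below assumes response lines look like "DEBUG <-- <status> <url> (<duration>)" and that the output has already been saved as text; failed_responses is a hypothetical helper, not part of flyctl.

```python
import re

# Assumed shape of a debug response line: "DEBUG <-- 504 https://... (1m0.17s)"
RESPONSE_LINE = re.compile(r"DEBUG <-- (\d{3}) (\S+) \(([^)]+)\)")


def failed_responses(debug_output: str):
    """Return (status, url, duration) for every response that is not 200."""
    failures = []
    for match in RESPONSE_LINE.finditer(debug_output):
        status = int(match.group(1))
        if status != 200:
            failures.append((status, match.group(2), match.group(3)))
    return failures
```

The request body logged just before a failing response (the "DEBUG --> POST" block) shows which mutation timed out.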

Thanks. The size is mostly node_modules junk, so I am working on removing it. But this is happening even with a nearly empty node:18-alpine image:

Dockerfile

FROM node:18-alpine

ENV NEXT_TELEMETRY_DISABLED=1

ENV NODE_ENV=production

Output with debug enabled

DEBUG <-- 200 https://api.fly.io/graphql (172.71ms)

{
  "data": {
    "finishBuild": {
      "id": "760642",
      "status": "completed",
      "wallclockTimeMs": 10434
    }
  }
}
image: registry.fly.io/my-app:deployment-XXX
image size: 173 MB
==> Creating release
DEBUG --> POST https://api.fly.io/graphql

{
  "query": "mutation($input: DeployImageInput!) { deployImage(input: $input) { release { id version reason description deploymentStrategy user { id email name } evaluationId createdAt } releaseCommand { id command evaluationId } } }",
  "variables": {
    "input": {
      "appId": "aviation-app",
      "image": "registry.fly.io/my-app:deployment-XXX",
      "services": null,
      "definition": {
        "deploy": {
          "strategy": "bluegreen"
        },
        "processes": {
          "myprocess": "yarn start"
        }
      },
      "strategy": null
    }
  }
}

DEBUG {}
DEBUG <-- 504 https://api.fly.io/graphql (1m0.17s)

DEBUG <html><body><h1>504 Gateway Time-out</h1>
The server didn't respond in time.
</body></html>

Error server returned a non-200 status code: 504

fly status

fly status
App
  Name     = [redacted]-app
  Owner    = [redacted]
  Version  = 17
  Status   = running
  Hostname = [redacted]-app.fly.dev
  Platform = nomad

Instances
ID      	PROCESS         	VERSION	REGION	DESIRED	STATUS 	HEALTH CHECKS	RESTARTS	CREATED
xxx	myprocess	17     	iad   	run    	running	             	0       	2023-01-19T04:07:08Z	

The only status shown here is for the existing, running application; it shows no activity from the deploys attempted in the last 2 days.

For what it’s worth, I was able to reduce my app's image size to under 500 MB, and it still results in 504s. But if I run flyctl launch to create a new application, it deploys immediately with no problems.