Error "An unknown error occured." during deploy

Hello, we have a problem updating a release. We've tried from multiple places and made no changes to the app. The problem has persisted for 2 hours.

$ fly deploy --local-only --image ghcr.io/realm/image:v1.3 --app xxxx
Update available 0.0.323 -> v0.0.325.
Run "fly version update" to upgrade.
==> Verifying app config
--> Verified app config
==> Building image
Searching for image 'ghcr.io/realm/image:v1.3' locally...
image found: sha256:835b3ec3055f51cb1d0f6513e75a05ebc2d46941c1f83e940c206a5e880e2ea7
==> Pushing image to fly
The push refers to repository [registry.fly.io/foo]
83d85471d9f8: Layer already exists
a77732cf2fa8: Layer already exists
195ce6778985: Layer already exists
c886798b4ada: Layer already exists
c99d6be99282: Layer already exists
4d58bc4b36dd: Layer already exists
60168087c4f3: Layer already exists
a85ff761bd92: Layer already exists
03c8ccfcaf98: Layer already exists
7d28b2731900: Layer already exists
a53ff049d96e: Layer already exists
3905f740ea0f: Layer already exists
e8d7296ecc6d: Layer already exists
210ae6b24e91: Layer already exists
fe869b35292e: Layer already exists
deployment-1651282278: digest: sha256:036b8ce730f45283f206e79b2de3a7053b2a3e0a4c7a776d0a973954f76cc29e size: 3463
--> Pushing image done
==> Creating release
Error An unknown error occured.

Could you try seeing if the debug output reveals any more about why it’s failing? May not but it’s worth a try:

LOG_LEVEL=debug fly deploy --local-only --image ghcr.io/realm/image:v1.3 --app xxxx

If that does not reveal anything … does your app have at least one vm running? Most live apps will of course, but if it’s been scaled down, for example, it may not. Double check with fly status, I think; there should be some vms listed in that table.

Hey, Greg. So, back to the subject. I’ve confirmed via fly status that we have the app running in 8 regions.
And here’s the LOG_LEVEL=debug output (with env and sensitive data like dirs and the app name stripped). We’re pretty much blocked from releasing.

my-appDEBUG Loaded flyctl config from 
DEBUG determined hostname: "greyborg.local"
DEBUG determined working directory:  
DEBUG determined user home directory: 
DEBUG determined config directory: 
DEBUG ensured config directory exists.
DEBUG ensured config directory perms.
DEBUG cache loaded.
DEBUG config initialized.
DEBUG initialized task manager.
DEBUG started querying for new release
DEBUG client initialized.
DEBUG app config loaded from 
==> Verifying app config
--> Verified app config
==> Building image
DEBUG trying local docker daemon
DEBUG Trying 'Local Image Reference' strategy
Searching for image 'ghcr.io/myorg/app:v1.3' locally...
DEBUG Search terms:[ghcr.io/myorg/app:v1.3 ghcr.io/myorg/app:v1.3:v1.3 myorg/app:v1.3 myorg/app ghcr.io/myorg/app:v1.3 ghcr.io/myorg/app]
image found: sha256:835b3ec3055f51cb1d0f6513e75a05ebc2d46941c1f83e940c206a5e880e2ea7
==> Pushing image to fly
DEBUG querying for release resulted to v0.0.325
The push refers to repository [registry.fly.io/my-app]
83d85471d9f8: Layer already exists
a77732cf2fa8: Layer already exists
195ce6778985: Layer already exists
c886798b4ada: Layer already exists
c99d6be99282: Layer already exists
4d58bc4b36dd: Layer already exists
60168087c4f3: Layer already exists
a85ff761bd92: Layer already exists
03c8ccfcaf98: Layer already exists
7d28b2731900: Layer already exists
a53ff049d96e: Layer already exists
3905f740ea0f: Layer already exists
e8d7296ecc6d: Layer already exists
210ae6b24e91: Layer already exists
fe869b35292e: Layer already exists
deployment-1651542186: digest: sha256:036b8ce730f45283f206e79b2de3a7053b2a3e0a4c7a776d0a973954f76cc29e size: 3463
--> Pushing image done
DEBUG result image:&{ID:sha256:835b3ec3055f51cb1d0f6513e75a05ebc2d46941c1f83e940c206a5e880e2ea7 Tag:registry.fly.io/my-app:deployment-1651542186 Size:385659963} error:<nil>
==> Creating release
DEBUG --> POST https://api.fly.io/graphql {{"query":"mutation($input: DeployImageInput!) { deployImage(input: $input) { release { id version reason description deploymentStrategy user { id email name } evaluationId createdAt } releaseCommand { id command evaluationId } } }","variables":{"input":{"appId":"my-app","image":"registry.fly.io/my-app:deployment-1651542186","services":null,"definition":{"env":{ENVHERE}},"experimental":{"allowed_public_ports":[],"auto_rollback":true,"private_network":true},"kill_signal":"SIGINT","kill_timeout":120,"metrics":{"path":"/metrics","port":9090},"services":[{"concurrency":{"hard_limit":8192,"soft_limit":2048,"type":"connections"},"http_checks":[],"internal_port":8080,"ports":[{"handlers":["http"],"port":80},{"handlers":["tls","http"],"port":443}],"protocol":"tcp","script_checks":[],"tcp_checks":[{"grace_period":"5s","interval":"15s","restart_limit":6,"timeout":"2s"}]}]},"strategy":null}}}
}
DEBUG <-- 500 https://api.fly.io/graphql (669.79ms) {"errors":[{"message":"An unknown error occured.","extensions":{"code":"SERVER_ERROR"}}],"data":{}}
Error An unknown error occured.
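
The only line in the output above that carries the failure is the `DEBUG <-- 500` response from the GraphQL API. For anyone collecting the same output, here is a minimal sketch of a filter that keeps just the failed API responses. The function name `failed_api_responses` is made up for illustration, and it assumes response lines keep the `DEBUG <-- <status> <url>` shape shown in this thread:

```shell
# Print only failed API responses (HTTP 4xx/5xx) from flyctl debug output on stdin.
# Assumption: response lines start with "DEBUG <-- <status> ", as in the log above.
failed_api_responses() {
  grep -E '^DEBUG <-- [45][0-9]{2} '
}

# Usage (not run here; image and app name are the placeholders from this thread):
#   LOG_LEVEL=debug fly deploy --local-only --image ghcr.io/realm/image:v1.3 --app xxxx 2>&1 \
#     | tee deploy-debug.log | failed_api_responses
```

Keeping the full log in a file via `tee` means the request body that preceded the 500 is still available if support asks for it.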

@kurt: We haven’t been able to deploy for 3 days now, could we please get some support? :confused:

Fly will be able to help more than me (like if it’s a host/region/API etc issue at their end), but in the meantime this is my go-to list of things that seem to fix various errors. Can’t hurt to try them while you wait:

  1. Run fly version update to get the latest version of the CLI and try another deploy

  2. Run fly doctor to see if it can diagnose any errors

  3. Try restarting the agent: fly agent stop; fly agent start and try another deploy

  4. Enable CLI websockets (instead of the default UDP): fly wireguard websockets enable and try another deploy

  5. Try removing the latest wireguard peer:

fly wireguard list
fly wireguard reset
fly wireguard remove
... and remove the peer that the list command showed initially

  6. If using a remote builder, try destroying that remote builder (in case there’s a random fault, a full disk, etc. with that one):

fly apps list (to get its random name), then fly destroy builder-name-here, and try another deploy

If none of those help, I’m out of ideas. But hopefully one of them may.
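
The first three items in the list above are safe to run back to back, so they can be sketched as a small script. This is only an illustration under a few assumptions: the helper names `run_step` and `basic_fly_checks` and the `FLY` override variable are made up here, and the later items (websockets, wireguard peers, destroying a builder) are left manual because they change configuration or delete things.

```shell
# Name of the flyctl binary; override FLY to point at a different binary or a stub.
FLY="${FLY:-fly}"

# Run one step, report a failure on stderr, and keep going so later steps still run.
# (For example, "fly version update" exits with an error when no update is available.)
run_step() {
  echo "==> $*"
  "$@" || echo "step failed (continuing): $*" >&2
}

# Items 1-3 from the list above, in order.
basic_fly_checks() {
  run_step "$FLY" version update
  run_step "$FLY" doctor
  run_step "$FLY" agent stop
  run_step "$FLY" agent start
}

# Usage (not run here): basic_fly_checks, then try another deploy.
```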

Hello, back to the subject:

  1. I’m on the latest version.
Error no available update
  2. Doctor shows no errors.
Testing authentication token... PASSED
Testing flyctl agent... PASSED
Testing local Docker instance... PASSED
Pinging WireGuard gateway (give us a sec)... PASSED

All the other options, including removing the builder, had no effect (same error).
I’ve established that we can deploy the staging version (another app) without problems.
Please help with some more advice; we’re still unable to release.

A couple more things I can think of:

  1. What’s the vm size? Would the container be able to deploy and run on those?
  2. Does your Dockerfile look alright (…if it has changed from the previous deployed version)?

VM:

VM Size: dedicated-cpu-2x
VM Memory: 4 GB
Count: 8
Max Per Region: Not set

The same image can be released to another app on Fly (the staging env).
Also, when I try to redeploy the same image that is running there at the moment, it fails with the same error.

More details: fly scale memory X works and creates a new release. fly scale count X errors out with unknown error (same as trying to deploy a new image).

So it looks like we finally managed to track this down to the app not having any backup regions. Apparently that’s bad, will block releases and provide no useful errors.

Just in case someone else runs into this.

Backup regions shouldn’t cause that. We’ll investigate more, it’s not obvious from the errors what went wrong here.

Did you all try using the [processes] block by chance? Or possibly deploy with no [mounts]?

It looks like what happened is that the region constraints in the job were wrong. These shouldn’t be in there at all when you have volumes mounted.

Setting the backup region “fixed” it indirectly. But we’re not sure what caused the problem in the first place.
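
For anyone who lands here with the same symptoms and wants to check their region setup, here is a rough sketch. It is not a confirmed fix, since setting a backup region only worked around the problem indirectly. It assumes a flyctl version from around the time of this thread that has the `fly regions list` and `fly regions backup` subcommands; the function names and the `FLY` override variable are made up for illustration.

```shell
# Name of the flyctl binary; override FLY to point at a different binary or a stub.
FLY="${FLY:-fly}"

# Show the region pool and backup regions for an app: show_regions APP
show_regions() {
  "$FLY" regions list --app "$1"
}

# Set the backup region pool for an app: set_backup_regions APP REGION...
set_backup_regions() {
  app="$1"
  shift
  "$FLY" regions backup "$@" --app "$app"
}

# Usage (not run here; app name and region codes are placeholders):
#   show_regions xxxx
#   set_backup_regions xxxx ams fra
```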

Not OP, but: Better error messages than “unhealthy allocations” or “unknown error” would help. (:

Debugging the black boxes that are the fly deployers is daunting otherwise.

@kurt We introduced mounts only with the last successful rollout; previously this was a deployment without mounts. Some time ago, though, I added volumes to the deployment with no mounts declared in the fly.toml. Judging by status, this led to it being spread across the wrong regions. While fixing that, we decided to remove all the volumes and all the regions that we didn’t need, including backup regions. That was right after the last successful rollout and before the errors above happened. We do not recall any errors from the CLI during those operations.