We’re getting an “An unknown error occurred.” message during deployment, after the image has been pushed to Fly. It was working earlier today, but all of a sudden we’re getting this error. Is there an issue we should know about? Thanks.
Seeing the same with a Dockerfile-based deployment:
--> Pushing image done
Image: registry.fly.io/xxx:deployment-xxx
Image size: xxx MB
==> Creating release
Error An unknown error occured.
Our current IPv4 pool has been exhausted. Adding some more now.
This should now be resolved.
Thanks!
Hi, this issue is happening again.
@amithm7 On it, thanks for reporting.
@amithm7 Not seeing errors right now; it might have just been a transient error or something specific to the build. Could you try LOG_LEVEL=debug flyctl deploy so we can see what’s happening?
Last few lines after pushing image:
==> Creating release
DEBUG --> POST https://api.fly.io/graphql {{"query":"mutation($input: DeployImageInput!) { deployImage(input: $input) { release { id version reason description deploymentStrategy user { id email name } createdAt } releaseCommand { id command } } }","variables":{"input":{"appId":"<retracted>","image":"registry.fly.io/<retracted>","services":null,"definition":{"env":{"DENO_ENV":"production"},"experimental":{"allowed_public_ports":[],"auto_rollback":true},"kill_signal":"SIGINT","kill_timeout":5,"processes":{"doh":"run --allow-net --allow-env --allow-read http.ts"},"services":[{"concurrency":{"hard_limit":25,"soft_limit":20,"type":"connections"},"http_checks":[],"internal_port":8080,"ports":[{"handlers":["tls","http"],"port":443}],"processes":["doh"],"protocol":"tcp","script_checks":[],"tcp_checks":[{"grace_period":"1s","interval":"15s","restart_limit":6,"timeout":"2s"}]}]},"strategy":null}}}
}
DEBUG <-- 500 https://api.fly.io/graphql (1.42s) {"errors":[{"message":"An unknown error occured.","extensions":{"code":"SERVER_ERROR"}}],"data":{}}
Error An unknown error occured.
Thanks, @amithm7. I see an error (unrelated to IP allocations, though). The number of instances currently running seems to be inconsistent in some way. Can you run flyctl scale show to see how many VMs you have going now, and flyctl autoscale show to see what the autoscaling settings are?
❯ flyctl scale show
VM Resources for <app-id>
VM Size: shared-cpu-1x
VM Memory: 256 MB
Count: 2
Max Per Region: Not set
❯ flyctl autoscale show
Scale Mode: Balanced
Min Count: 2
Max Count: 2
Is the max count preventing deployment?
Probably. As part of the deployment we’ll probably need to add more VMs and then remove the old ones. Can you try setting min=0 max=10 or something, doing the deployment, and then going back to 2/2?
Yes, setting flyctl autoscale balanced min=0 max=3 worked. Thanks!
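For anyone who wants to script this workaround, here is a minimal sketch in Python. It only builds the three commands in order (widen the autoscale range, deploy, restore the original range) and does not run them. The helper names autoscale_command and deploy_with_headroom are made up for illustration; the flyctl invocations are the ones quoted in this thread.

```python
def autoscale_command(min_count: int, max_count: int) -> list[str]:
    """Build the `flyctl autoscale balanced` invocation quoted in this thread."""
    return ["flyctl", "autoscale", "balanced", f"min={min_count}", f"max={max_count}"]


def deploy_with_headroom(min_count: int, max_count: int, deploy_max: int) -> list[list[str]]:
    """Return the commands to run, in order. Nothing is executed here.

    min_count / max_count are the normal autoscale settings to restore afterwards;
    deploy_max is the temporary, larger max used while deploying (hypothetical name).
    """
    if deploy_max <= max_count:
        raise ValueError("deploy_max must be larger than the normal max to add headroom")
    return [
        autoscale_command(0, deploy_max),         # min=0 mirrors what worked in this thread
        ["flyctl", "deploy"],                     # deploy while extra VMs are allowed
        autoscale_command(min_count, max_count),  # go back to the original settings
    ]


if __name__ == "__main__":
    for cmd in deploy_with_headroom(min_count=2, max_count=2, deploy_max=3):
        print(" ".join(cmd))
```

Each returned list could be passed to subprocess.run(cmd, check=True) if you want the script to execute the steps rather than print them.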
Awesome. I’ll see if we can improve that error message and/or add a check for this.
Is setting min to 0 the only way out (should it even be set to 0)? Or is it enough if max and min are not the same value? And should max be twice min for smooth deployment across regions?
I don’t actually think min needs to be 0, that was more a debugging step.
As a general rule of thumb, setting max to twice the number of VMs you expect or want to have seems reasonable during a deployment. If you have 3 VMs running and you deploy, for instance, 3 new ones need to be started and the 3 old ones stopped, so there could theoretically be 6 running at some point.
In practice I think the scheduler will work to make the deploy happen even if it’s more restricted than that, and can do rolling deploys, but I don’t think the rules and edge cases are frozen or documented at this point.
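As a rough illustration of that rule of thumb, here is a small Python sketch. It encodes only the "max is twice the desired VM count" heuristic and a simple check for whether max leaves room for at least one extra VM. It is not a model of the scheduler, whose actual rules are undocumented as noted above, and both function names are hypothetical.

```python
def suggested_max(desired_vms: int) -> int:
    """Rule of thumb from this thread: allow up to twice the desired VM count."""
    if desired_vms < 0:
        raise ValueError("desired_vms must be non-negative")
    return 2 * desired_vms


def has_deploy_headroom(running_vms: int, max_count: int) -> bool:
    """True if at least one new VM can start before an old one is stopped.

    This is an assumption about what a deploy needs, not a documented rule.
    """
    return max_count > running_vms


if __name__ == "__main__":
    print(suggested_max(3))           # 3 old + 3 new VMs could overlap
    print(has_deploy_headroom(2, 2))  # the min=2 max=2 setup from this thread
```

With min=2 max=2 and 2 VMs running, has_deploy_headroom returns False, which matches the configuration that failed here; raising max to 3 makes it True.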
Makes sense. This might warrant a gotcha in the flyctl deploy doc (Flyctl) and a note in the scaling doc (Scaling and Autoscaling).