I can't figure out what is wrong with my Elixir deploy.
These 2 issues (first, second) seemed similar, but unfortunately they didn't help. My app does listen on port 4000 (the same port as in the
fly.toml config), and I do not believe it takes long to start listening on it.
The deploy succeeded once, but then the app failed right away and apparently restarted over and over.
Here are the logs I get when running `fly deploy` (I got the same output while the app was continuously restarting, and also from `fly vm status <id>`):

    Preparing kernel init
    Configuring firecracker
    Starting virtual machine
    Starting init (commit: 50ffe20)...
    Preparing to run: `/app/entrypoint.sh /app/bin/my_app eval MyApp.Release.migrate` as root
    2021/10/07 23:47:35 listening on [fdaa:0:357c:a7b:2203:4241:e3ff:2]:22 (DNS: [fdaa::3]:53)
    Reaped child process with pid: 563 and signal: SIGUSR1, core dumped? false
    23:47:39.497 [info] Migrations already up
    Reaped child process with pid: 565 and signal: SIGUSR1, core dumped? false
    Reaped child process with pid: 612 and signal: SIGUSR1, core dumped? false
    23:47:41.677 [info] Migrations already up
    Main child exited normally with code: 0
    Reaped child process with pid: 614 and signal: SIGUSR1, core dumped? false
    Starting clean up.
    ...

The final log is:

    [error] Health check status changed 'warning' => 'critical'
    ***v8 failed - Failed due to unhealthy allocations - not rolling back to stable job version 8 as current job has same specification and deploying as v9

Here is my fly.toml:

    app = "my-app"

    kill_signal = "SIGTERM"
    kill_timeout = 5
    processes = []

    [deploy]
      release_command = "/app/bin/my_app eval MyApp.Release.migrate"

    [env]

    [experimental]
      allowed_public_ports = []
      auto_rollback = true
      private_network=true

    [[services]]
      http_checks = []
      internal_port = 4000
      processes = ["app"]
      protocol = "tcp"
      script_checks = []

      [services.concurrency]
        hard_limit = 25
        soft_limit = 20
        type = "connections"

      [[services.ports]]
        handlers = ["http"]
        port = 80

      [[services.ports]]
        handlers = ["tls", "http"]
        port = 443

      [[services.tcp_checks]]
        grace_period = "30s"
        interval = "15s"
        restart_limit = 6
        timeout = "2s"
        port = "4000"

My app ships big files in the priv folder (about 100 MB in total), which I load in a GenServer’s `handle_continue` callback (meaning the app first starts listening on port 4000 and then loads the data), in case that helps.
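
For reference, the loading pattern looks roughly like the sketch below. It is simplified and not my actual code: the module name `MyApp.DataLoader`, the `:path` option, and the `:load_data` continue tag are placeholders chosen for illustration. The idea is that `init/1` returns immediately and the slow file read is deferred to `handle_continue/2`, so the supervisor should not be blocked while the data loads.

```elixir
defmodule MyApp.DataLoader do
  # Hypothetical module name; the real one is not shown in this post.
  use GenServer

  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  @impl true
  def init(opts) do
    # Hypothetical option: path to one of the large files under priv/.
    path = Keyword.fetch!(opts, :path)

    # Return right away and defer the slow work to handle_continue/2,
    # so the supervisor is not blocked while the file is read.
    {:ok, %{path: path, data: nil}, {:continue, :load_data}}
  end

  @impl true
  def handle_continue(:load_data, state) do
    # Runs after init/1 returns and before any other message is handled.
    data = File.read!(state.path)
    {:noreply, %{state | data: data}}
  end
end
```

Whether the port is already open by the time the data finishes loading would still depend on where this process sits in the supervision tree relative to the endpoint, which is not shown here.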