I can’t move past this point; I’m not seeing anything in the logs after Preparing kernel init so I’m struggling to debug what the issue could be.
Locally I’m able to run the migration script without error, and the Docker container builds and starts as expected. There have been no changes to the Dockerfile since the last successful release. Using fly deploy --verbose doesn’t add any useful information to the build process.
I don’t think so. The container size itself is only 420MB, and the instance is set to 1024MB.
Interestingly, I’m now unable to see any metrics in Fly’s dashboard for the instance/app. Nothing for the last 2 days. But the app is up and running, and in the logs it’s doing all the background processing I expect.
I think we might be running the release commands in small VMs. We should run them in the same size as the app’s VM. I’ll check with the team.
Well I just looked and our certificate to communicate with our metrics cluster has expired. Whoops! Going to add a continuous test for this. Of course, we’ll renew it. Metrics should be back momentarily!
It appears your release command isn’t logging anything, and then exiting with a non-zero exit code. Your best bet is to comment out the release command, then deploy, then open flyctl ssh console and run it by hand to debug.
It appears your release command isn’t logging anything, and then exiting with a non-zero exit code.
That’s odd. I would expect, if there’s a non-zero exit, that at least some form of error output would occur. It’s an Elixir app which executes a migration as the release command.
Your best bet is to comment out the release command, then deploy, then flyctl ssh console and run it by hand to debug.
I’ve copied down the DB structure locally and run the migrations against it (standard process for testing migrations before deploy) and everything’s been working fine. But I’ll see if I can do as you suggest…
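For anyone following the suggestion above, here is a rough sketch of how the manual run could be made less silent. It is only an illustration: run_and_report is a hypothetical helper, and the binary path and module name in the usage comment are assumptions based on a typical Elixir release layout, not details from this thread. The idea is to capture stdout and stderr together and always print the exit status, so a command that fails without logging still leaves a trace.

```shell
#!/bin/sh
# Hypothetical helper: run a command, print everything it wrote
# (stdout and stderr combined), then print its exit status.
run_and_report() {
  # The && / || form records the status without tripping `set -e`.
  output=$("$@" 2>&1) && status=0 || status=$?
  if [ -n "$output" ]; then
    printf '%s\n' "$output"
  fi
  printf 'exit code: %s\n' "$status"
  return "$status"
}

# Rough order of operations, as suggested earlier in the thread:
#   1. Comment out the release command in fly.toml.
#   2. fly deploy
#   3. flyctl ssh console
#   4. Inside the VM, run the migration by hand. The path and module
#      below are assumptions; substitute the app's real release command:
#        run_and_report /app/bin/my_app eval "MyApp.Release.migrate()"
```

If the migration prints nothing but the last line shows a non-zero exit code, that at least confirms the failure is in the command itself rather than in the deploy tooling.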
$ fly deploy --verbose
Deploying XXXXX
...
Monitoring Deployment
1 desired, 1 placed, 0 healthy, 1 unhealthy
v30 failed - Failed due to unhealthy allocations - rolling back to job version 29
1 desired, 1 placed, 1 healthy, 0 unhealthy
--> v31 deployed successfully
***v30 failed - Failed due to unhealthy allocations - rolling back to job version 29 and deploying as v31
Again, I’m not seeing anything in the logs so it’s pretty much impossible to debug:
I commented out the release command in fly.toml, but am now seeing a general deployment error. Nothing useful showing up in logs about this one.
2 desired, 2 placed, 2 healthy, 0 unhealthy
--> v120 deployed successfully
***v119 failed - Failed due to unhealthy allocations - rolling back to job version 118 and deploying as v120
Oh, interesting. The fly deploy --strategy immediate --remote-only attempt created a completely new app fly-builder-restless-surf-2468 in my account. I’ve not seen this happen before. It is also “dead”, along with my downed app.
I’m really annoyed I have no recovery options here. Any one of the following options would mitigate this downtime:
A toggle for “maintenance mode” to display a static HTML file while the toggle is active. I could switch that on and provide a notice to our users.
A “failover” option when an app is “dead” or non-responsive, to either show a static HTML page or proxy to another site. Again, I could provide a notice to our users.
A manual rollback option. Something like flyctl deploy --rollback v28, or being able to click a Rollback to here button in the “Activity” dashboard screen next to a “green” build. Being able to revert to a previously running state without a deploy would really help when deploys aren’t working as expected.
So, the site is down hard, the option of “deploying something to fix” isn’t available, I have no other tools or channels available to me, and I’m starting to get emails about it. My only option is to repoint the DNS somewhere and hope people don’t see the error for too long.
Further update: I’ve been able to deploy a completely new app, under a new account, which is based on the same Dockerfile and Elixir/Phoenix “root” in use across all the apps I’m hosting on Fly.io.