I’ve created a fullstack Remix app which has been deployed for a few weeks now at goods.fly.dev. However, in the past few days I haven’t been able to deploy new changes. I push a change and then the GitHub Action fails (I’ve reached out to GitHub as well to ask about this). I don’t know what could have changed to make the deploy not go through.
My fly logs look like this and indicate an issue with healthchecks:
$ fly logs -a goods
Waiting for logs...
2023-01-28T22:20:36.633 app[7f061b8d] ord [info] GET /healthcheck 200 - - 5.955 ms
2023-01-28T22:20:46.642 app[7f061b8d] ord [info] HEAD / 200 - - 3.860 ms
2023-01-28T22:20:46.644 app[7f061b8d] ord [info] GET /healthcheck 200 - - 7.167 ms
2023-01-28T22:21:06.660 app[7f061b8d] ord [info] HEAD / 200 - - 2.968 ms
2023-01-28T22:21:06.662 app[7f061b8d] ord [info] GET /healthcheck 200 - - 6.081 ms
2023-01-28T22:21:16.668 app[7f061b8d] ord [info] HEAD / 200 - - 2.329 ms
2023-01-28T22:21:16.669 app[7f061b8d] ord [info] GET /healthcheck 200 - - 5.577 ms
2023-01-28T22:21:26.676 app[7f061b8d] ord [info] HEAD / 200 - - 2.813 ms
2023-01-28T22:21:26.677 app[7f061b8d] ord [info] GET /healthcheck 200 - - 5.518 ms
2023-01-28T22:21:36.685 app[7f061b8d] ord [info] HEAD / 200 - - 3.011 ms
2023-01-28T22:21:36.686 app[7f061b8d] ord [info] GET /healthcheck 200 - - 6.036 ms
2023-01-28T22:21:46.692 app[7f061b8d] ord [info] HEAD / 200 - - 2.317 ms
I’m not sure it is an issue with healthchecks. That log shows fast responses with a 200 status code, which is what you want to see: 200 means all is well.
The question is at what point it fails. If you click on the GitHub Action on their page, you can see its log output. It shows a lot of detail and should list the steps it takes (npm install, fly deploy, etc). Presumably one of them is failing and, as a result, so is the whole deploy. Once you see which command is failing, that should indicate how to fix it.
No problem. Ah, that log from GitHub is interesting. Yep, that shows where it is failing, but not why. Strange!
One tip: you probably don’t want to leave jobs running for the maximum (6 hours) and then timing out, especially if you are paying by the minute for an Action. I’d recommend adding a timeout to your job. If a deploy takes, say, 15 minutes, it’s probably safe to assume it’s gone wrong and won’t complete (deploys should take seconds or minutes, not hours).
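A minimal sketch of a job-level timeout in a GitHub Actions workflow. The job name `deploy`, the 15-minute limit, and the step layout are assumptions for illustration and will likely differ from your actual workflow file; `timeout-minutes` itself is the standard workflow key for this.

```yaml
jobs:
  deploy:            # hypothetical job name
    runs-on: ubuntu-latest
    # Cancel the job if it runs longer than 15 minutes
    # (without this, a hung job runs until the 6 hour default limit).
    timeout-minutes: 15
    steps:
      - uses: actions/checkout@v3
      - uses: superfly/flyctl-actions/setup-flyctl@master
      - run: flyctl deploy --remote-only
        env:
          # assumes the Fly token is stored as a repository secret with this name
          FLY_API_TOKEN: ${{ secrets.FLY_API_TOKEN }}
```

The same key can also be set on an individual step if you only want to cap the deploy command rather than the whole job.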
As for the healthcheck, it looks like it is being called by Fly itself, independent of the deploy: the timings show a call roughly every 10 seconds, which suggests an automated process. You can confirm that by looking at your fly.toml file. I’d assume you have a healthcheck defined in there, set to run every 10 seconds. If not, perhaps you have some kind of uptime bot (Cloudflare, Pingdom or some such). So I don’t think that is related to the deployment timing out.
Thanks for the tip, Greg! I added that line to my deploy config. Also, you’re right in that the fly.toml is configured to run the health check every 10s. Here’s my fly.toml. I tried just removing the healthcheck block and pushing that but there was no change. I don’t have a good enough understanding of this stuff to really diagnose the problem.
Ah, yep, that is the healthcheck and that’s why you get that showing in the logs. That’s fine, you want that to be there to check all is well.
The only thing I can suggest is seeing if you can get more debugging info into the GitHub log. Currently it just says “deploying” and nothing else, which isn’t very helpful. Maybe change:
run: flyctl deploy --remote-only
to:
run: LOG_LEVEL=debug flyctl deploy --remote-only
(or the equivalent in your action). I’m not certain that syntax is correct (you may need to specify the variable differently), but the idea is to tell the Fly CLI you want more debug data, which it won’t show by default. At worst it will fail, but that’s fine as it’s not working anyway. It may show why it’s getting stuck and timing out.
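If the inline variable form doesn’t work in your setup, another way to pass it is through the step’s `env:` block. This is only a sketch: the job and step names are made up, and it assumes your workflow runs `flyctl` directly in a `run:` step rather than through the action’s `args:` input.

```yaml
jobs:
  deploy:            # hypothetical job name
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: superfly/flyctl-actions/setup-flyctl@master
      - name: Deploy with debug logging   # hypothetical step name
        run: flyctl deploy --remote-only
        env:
          # ask the Fly CLI for more verbose output
          LOG_LEVEL: debug
          # assumes the Fly token is stored as a repository secret with this name
          FLY_API_TOKEN: ${{ secrets.FLY_API_TOKEN }}
```

Either form should have the same effect; the `env:` version just keeps the command itself unchanged.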
I also destroyed my app and re-deployed, with the same result. I was hoping that would work. At this point I think I need the Fly team to look into this issue. I’ve wasted several hours on this already.
Hey Fly team, if anyone can help me with this, it would be greatly appreciated. GitHub support reached out and told me they can’t do anything to help me:
I also noticed that the step failing was running on a third-party action https://github.com/superfly/flyctl-actions and we are not the maintainers of the action. I’ll recommend you open an issue in the third-party repository.
I’ve been having issues with an existing Remix app, and after trying everything else, I was finally able to deploy again by using the --local-only flag on fly deploy. If you look at Deploying to Fly via GitHub Action failing - #19 by michael, it appears there are some issues with deploying Remix apps right now.
Thanks for the reply, emiljt. Unfortunately, I tried that and got the same result. “Deployment is running” but never completes or shows any logs. I’m glad it worked for you though. I’ll check out that thread again and see if any of the fixes discussed there work.
Hi @grahamhagenah, it looks like your app was affected by a bug on our end that was causing a small handful of apps with volumes on two specific hosts (one in ord, one in iad) to get stuck and fail to deploy. It took a few days to track down the root cause of this issue, but we pushed a fix a few hours ago that should have unblocked your app’s deployment. Sorry for the trouble, I appreciate the report and all the debugging effort you put into this!