Fly deploy fails waiting for health checks suddenly - no configuration changes since last deploy

Thanks @stephentgrammer! Do you mind sharing your app name? If you prefer to keep this information confidential, you can email it to support@fly.io. Thanks!

rb-backend2-qa

Hi @jamal, we did some investigation and were able to replicate your issue. Could you please try having either [http_service] or [[services]] blocks, but not both? This seems to be an edge case where both blocks are merged and the resulting services conflict.
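To check quickly whether a config contains both blocks, something like the following could work. It is only a rough sketch: it scans for the two table headers with regular expressions instead of parsing the TOML, and the helper name `has_conflicting_service_blocks` and the sample config are made up for illustration.

```python
import re

# Match the table headers at the start of a line (commented-out headers are ignored).
HTTP_SERVICE = re.compile(r"^\s*\[http_service\]", re.MULTILINE)
SERVICES = re.compile(r"^\s*\[\[services\]\]", re.MULTILINE)


def has_conflicting_service_blocks(toml_text: str) -> bool:
    """Return True when the text declares both [http_service] and [[services]]."""
    return bool(HTTP_SERVICE.search(toml_text)) and bool(SERVICES.search(toml_text))


# Hypothetical config, not taken from any real app.
sample = """
app = "example-app"

[http_service]
internal_port = 8080

[[services]]
internal_port = 8080
protocol = "tcp"
"""

print(has_conflicting_service_blocks(sample))
```

Running this against the config shown in the dashboard, and not only the fly.toml in the repo, would catch cases where a block is added somewhere between the two.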

Hi @stephentgrammer, could you please try one more time? I’ve seen that many releases succeeded for your app and only recently started to fail. I’m trying to figure out whether the initial problem left your app in a corrupt state that prevents it from starting.

@aschiavo Yes sir! That makes sense. I just retried and the deploy failed at 8:04am PST.

@stephentgrammer From the logs, it looks like the app is dying during boot while trying to establish a connection. Would you please check your app logs at “Mar 15, 2024 @ 15:15:29.919000000”? If you want to follow up in private to provide more information about that connectivity issue, please email support@fly.io. Thanks!

I don’t have [http_service] in my config, only [[services]]. My project is based on the Epic Stack, so you can see what my file looks like at epic-stack/fly.toml at main · epicweb-dev/epic-stack · GitHub. I’ve only made minor tweaks to this file in my project.

Oh sorry, I may be checking the wrong app, I’m referring to whisker-ocr-pr-30.
Or maybe this is a really odd edge case. Do you mind confirming that the config you see in the dashboard matches what you have in your app’s repo?
Thanks! Your help troubleshooting this issue is very much appreciated.

Oh interesting. I do see the [http_service] in the dashboard config, but it is not in my source code. Let me investigate my GitHub workflows to determine how that’s getting injected into the config.

Thanks @aschiavo! I was able to confirm that our problem was completely unrelated to this issue. Our db cluster was somehow corrupted (SSH between machines was impossible, and that or something else prevented our app from connecting to the db). After forking the db to a brand new fly app, everything is working as normal! Thanks again for the help!

@aschiavo So it looks like running flyctl launch is the culprit. It modifies the fly.toml file to add the [http_service] section. I guess the CLI command is what has changed recently.

This is the command being run:

flyctl launch --no-deploy --copy-config --name "whisker-ocr-pr-30" --region "dfw" --org "one-click-rescue"

Is there a flag to tell flyctl launch not to modify the fly.toml? I skimmed through the --help docs but didn’t see anything that stood out.

Here’s a diff of the fly.toml before and after running flyctl launch. I ran the command on my machine to produce this.

https://www.diffchecker.com/fACoM9aY/

Great news!! Thanks for your help troubleshooting.

Hi @jamal! What do you think about switching from launch to apps create? I’m not sure it’s possible to tell launch not to modify the configuration file, as the framework scanner may overwrite parts of it. Please refer to the last sentence in this docs page.

Yup, that did the trick. Now it creates a new app without modifying my fly.toml file, and my app deploys successfully on the first deployment. :raised_hands:
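For anyone scripting the same switch, here is a minimal sketch of how the app-creation step might be put together. It assumes `flyctl apps create` takes the app name as a positional argument and the organization via `--org`. The helper name `apps_create_command` is hypothetical, and the sketch only builds and prints the command rather than running it.

```python
import shlex


def apps_create_command(app_name: str, org: str) -> list:
    """Build the argument list for creating an app without touching fly.toml."""
    # Assumption: the app name is positional and the org is passed with --org.
    return ["flyctl", "apps", "create", app_name, "--org", org]


cmd = apps_create_command("whisker-ocr-pr-30", "one-click-rescue")

# Print a shell-quoted form for inspection. To actually run it,
# subprocess.run(cmd, check=True) could be used where flyctl is installed.
print(shlex.join(cmd))
```

Unlike the launch command above, there is no region argument here. The sketch leaves the region to whatever the existing fly.toml and the later deploy step specify.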

I am using the superfly/fly-pr-review-apps GitHub action to deploy my PRs. This is what was using fly launch to create the new apps. In order to switch this to use fly apps create, I needed to copy this action into my repo with that one line changed. I’m not sure why they chose to use fly launch in the GH action, so I opened an issue in the repo.

I am satisfied with this workaround. Thanks so much for helping me troubleshoot and find a solution for this issue.

This is great news! I’m glad to have helped to resolve the issue.
I’ll follow up with the GH action issue, thanks for reporting.