Interrupt app startup

I set up my app so that it waits for the database to be available before running. For this I used a script that tries to connect to it and, if that fails, retries indefinitely.

In a dumb mistake, I didn’t set the database credentials and host correctly, and now the deployments are waiting for a database connection that will never succeed.

Is there a way to cancel the startup of those versions? The app is now working fine, but the failed attempts of previous deployments are flooding the logs.

Not sure exactly why the logging spam wouldn’t go away after you deploy your app with this issue fixed…

Not that I recommend it, but you can always flyctl ssh console -s ... into your app instances (ref), and sudo things to your heart’s content.

You can also scp things (see), fwiw.

See if those come in handy?


Just a note, you could use Fly’s built-in vault to vend secrets, btw.
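As far as I know, secrets set on a Fly app reach the running app as environment variables, so a startup script can check them before it tries the database at all. Here is a minimal sketch of that check; the variable names DB_HOST, DB_USER and DB_PASSWORD and the helper load_db_settings are hypothetical, not anything Fly defines.

```python
import os


def load_db_settings(env=os.environ):
    """Collect database settings from the environment, failing fast if any is missing."""
    # Hypothetical variable names; use whatever names you stored your secrets under.
    required = ("DB_HOST", "DB_USER", "DB_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        # Stop right away instead of retrying a connection that cannot succeed.
        raise RuntimeError("Missing database settings: " + ", ".join(missing))
    return {name: env[name] for name in required}
```

Failing at this point would turn a missing credential into an immediate, visible error rather than an endless wait.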


Thanks for the help!

For some reason I can’t reach the VMs. Running flyctl ssh console -s only lists the working instance of the app. The running but non-working VMs show up in the fly-metrics.net dashboard, but there is no way to interact with them.

I tried deleting the app and creating it again, and somehow the logs still show up on the “Monitoring” tab.

Yikes. Zombie VMs? This is bad for all sorts of reasons!

Not sure what’s going on here but worth escalating it to Fly engs. Not sure who to tag, but I guess @JP_Phillips might be able to help (if free).


Yes, please let us know if you’re still running into this with one of your apps (and if so, which app); we’re always happy to dig into any unusual behavior you find!

It’s still happening, the app is called ale-recipes. Thank you!

Sure thing! Would you mind sharing the alloc / VM IDs for the deleted instances that are still sending logs? That will help us pinpoint the issue. A snippet of the recent logs could also come in handy here.

The IDs are 6fa685e0, 6ec14c65, 08505984, a8294f5a.

This is the part that repeats in the logs:

2022-09-13T18:36:19.438 app[6fa685e0] nrt [info] Waiting for PostgreSQL to become available...
2022-09-13T18:36:19.522 app[6ec14c65] ord [info] Waiting for PostgreSQL to become available...
2022-09-13T18:36:19.770 app[08505984] dfw [info] Waiting for PostgreSQL to become available...
2022-09-13T18:36:20.563 app[a8294f5a] nrt [info] Waiting for PostgreSQL to become available...

Thank you! We’ll take a look, and keep you updated as this mystery unravels.


So like you thought, these jobs from your release_command were hanging around (as they were waiting for a confirmation that would never come). They were still emitting logs, so that’s why you were seeing them in fly-metrics.

You asked: “Is there a way to cancel the startup of those versions? The app is now working fine, but the failed attempts of previous deployments are flooding the logs.”

There isn’t a way to cancel release_command jobs. That said, we’ve just merged a PR to fix this; once it rolls out, release jobs won’t hang around indefinitely anymore.

We manually killed your old release jobs, and you should see them go away in the logs within an hour :slightly_smiling_face:
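For anyone with a similar wait-for-database script, one way to avoid a job that waits forever is to give the retry loop a deadline. The sketch below is only an illustration, not the script from this thread: wait_for_tcp and its parameters are made-up names, and it only checks that the host and port accept a TCP connection, so wrong credentials would still need a real login attempt to detect.

```python
import socket
import time


def wait_for_tcp(host, port, max_wait_s=60.0, interval_s=2.0):
    """Return True once host:port accepts a TCP connection, or False after max_wait_s."""
    deadline = time.monotonic() + max_wait_s
    while True:
        try:
            # A successful connect means something is listening; close it right away.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            pass  # Refused, unreachable, DNS failure or timeout: retry until the deadline.
        if time.monotonic() + interval_s > deadline:
            return False  # Give up instead of retrying indefinitely.
        print("Waiting for PostgreSQL to become available...", flush=True)
        time.sleep(interval_s)
```

A startup or release script could call this with its own database host and port and exit with a non-zero status when it returns False, so that a misconfigured deployment fails visibly instead of hanging.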


Great! Many thanks. :grimacing:
