App stuck on `pending`

Yaeger · December 22, 2022, 4:21pm

We just did a redeploy and our app (only web process) is on pending for the last 5 minutes. Is everything going okay? We only deleted a secret, so I find it hard to believe that we broke it.

ebc3741c web 156 fra run pending 0 4m29s ago

App is called staxcloud-prod.

Yaeger · December 22, 2022, 4:24pm

“Failed due to unhealthy allocations - not rolling back to stable job version 156 as current job has same specification”

The VM is still on pending though and did not revert to a previous version.

Yaeger · December 22, 2022, 4:26pm

We pushed a dummy change and it worked. Still confused as to why it went down, though.

kurt · December 22, 2022, 4:30pm

It looks like you did a rolling deploy, but your app only had one VM running. Rolling deploys take down existing VMs, then bring up new ones. Sometimes new VMs take a while to schedule so this can cause downtime.

You should run `fly scale count 2` at a minimum for apps you care about. `fly deploy --strategy canary` is also a more reliable deployment process, provided you’re not using volumes or `--max-per-region`.
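
For anyone following along, here is a minimal sketch of those two suggestions as a script. The app name is the one from this thread; the `run` helper and the `DRY_RUN` variable are hypothetical conveniences added here (not part of flyctl) so the commands can be previewed before anything is changed.

```shell
#!/bin/sh
# Sketch only. APP defaults to the app named in this thread; override as needed.
# DRY_RUN=1 (the default in this sketch) prints each command instead of running it.
APP="${APP:-staxcloud-prod}"
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command, and execute it only when DRY_RUN is not 1.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" != "1" ]; then "$@"; fi
}

# Keep at least two VMs so a rolling deploy does not take down the only one.
run fly scale count 2 --app "$APP"

# Deploy with the canary strategy instead of the default rolling one.
# As noted above, this is not an option with volumes or --max-per-region.
run fly deploy --strategy canary --app "$APP"
```

Running it with `DRY_RUN=0` would execute the commands for real, assuming flyctl is installed and logged in.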

Yaeger · December 22, 2022, 5:27pm

Thanks for that explanation @kurt

I’m really confused about deploys today. I am running a deploy now and I see:

--> This release will not be available until the release command succeeds.
         Starting instance
Running release task (pending)... 🌏

OK, that doesn’t look that weird, but my new VMs are already started. They only start after running the release task, right?

In the logs I see the migrations (my release command) have already run successfully, yet it’s still stuck at “Running release task (pending)…”.

Yaeger · December 22, 2022, 5:28pm

Even 5 mins later the web and scheduler are stuck on pending and it says it’s still stuck on the release command

Yaeger · December 22, 2022, 5:29pm

Wow interesting. Now I am getting new VMs and the other “new” VMs are getting shut down:

Are these “old” deploys that are only now getting through the queue?

Yaeger · December 22, 2022, 5:32pm

Now fly deploy aborted with:

Yaeger · December 22, 2022, 5:35pm

Some VMs (of the latest version) are now in pending but scheduler is on running

2 mins later desired switched to stop and they are getting removed again…

Idk if I am doing something wrong or if something weird is going on at Fly, but I do know that I have no idea how to fix this.

Yaeger · December 22, 2022, 5:37pm

So even with those VMs that have desired=stop, I am now seeing:

2022-12-22T17:36:42Z runner[bed6314c] fra [info]Configuring virtual machine
2022-12-22T17:36:42Z runner[bed6314c] fra [info]Pulling container image
2022-12-22T17:36:53Z runner[bed6314c] fra [info]Unpacking image
2022-12-22T17:36:55Z runner[b168964b] fra [info]Configuring virtual machine
2022-12-22T17:36:55Z runner[b168964b] fra [info]Pulling container image
2022-12-22T17:36:55Z runner[b168964b] fra [info]Unpacking image

That sounds like it’s preparing for a new VM

Yaeger · December 22, 2022, 5:40pm

I did make some changes to my Dockerfile, fly.toml, etcetera, so I don’t rule out that I broke something, but it’s kinda hard to debug if that’s the case, and the behavior that I’m seeing in the logs & fly status is hard for me to explain.

Yaeger · December 22, 2022, 6:08pm

It behaves just as weirdly on the main branch, so I find it hard to believe that I broke it with my changes

(I’m working on staxcloud-staging btw, would be great if someone could have a look!)

pulleasy · December 22, 2022, 9:10pm

I’m also hit with that error. Deployed a few times today and the last few deploys are getting stuck on Running release task (pending)... 🌏 and then error out with the above message. No idea why.

tvdfly · December 22, 2022, 10:05pm

Hi folks, we had a host in our fra region that was overloaded and taking several minutes to sometimes much longer to start new instances. We’ve stopped scheduling new allocs on that host and will make sure it’s healthy before allowing new allocs back on it. It looks like the folks on this thread were likely impacted by that issue.

If you continue to see this issue, one mitigation is to try deploying to another region at least temporarily.

Yaeger · December 23, 2022, 9:56am

Thanks!
