App not deploying -- Deployment stuck in "Pending", no VMs being allocated

Our postgres app is stuck at “pending”. I ran multiple deployments to update some secrets and env vars, but the changes did not apply, so I forced the deployment by stopping the stale VM. That left the app with no VMs, and no new VMs were created to replace it.

We now have a postgres cluster with no VMs, so none of our apps are working. While this is currently a staging environment, this is clearly unacceptable. An app should never be stuck in this state.

I tried deleting the postgres app and recreating it, but that hasn’t worked either! I now have no environments! This isn’t okay!

I have backups of my database via wal-g to restore from, but I can’t get any VMs, and without VMs nothing is moving!

I’ve tried re-creating it in another region but that hasn’t worked either! Assistance is URGENTLY required!

fly status --all -a $APP shows no instances. I tried fly scale count 1 -a $APP and that hasn’t worked either. I really need a resolution on this, very desperately.

To add to this, I am also affected by this issue.

Okay, I managed to deploy a new database in iad, am attempting to restore my wal-g backups to it.

I was looking at your app and then it vanished. Looks like you deleted and recreated a new one.

While I was looking at your old app, it seemed to be taking too long to boot up, possibly while recovering?

Right now there’s a 30s grace period when health checks are failing.

You’ll probably want to disable health checks until you’ve been able to restore your database.

After waiting ~30 minutes with no VMs being created in lhr, I finally managed to get something to start in iad. I’m now working out how to restore the pg data using wal-g, though I need to be able to STOP postgres on the server to restore the data.

How do I stop it overwriting postgresql.conf and removing the restore_command?

Edit: I just noticed I need to specify this all via stolonctl and that appears to be working!
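For anyone wondering what "via stolonctl" can look like: stolon's cluster specification has a point-in-time-recovery init mode in which the restore commands live in the cluster spec rather than in postgresql.conf, so the keeper has no reason to overwrite them. The sketch below is not the exact spec used in this thread (that was never posted). It assumes wal-g is on the PATH inside the VM and already configured through its environment variables, and it leaves out the stolonctl cluster-name and store connection options. The helper names are made up for illustration; the script only builds the JSON and prints the command, it does not run anything.

```python
import json
import shlex


def build_pitr_spec(backup_name="LATEST"):
    """Return a stolon cluster spec that restores from a wal-g backup."""
    return {
        "initMode": "pitr",
        "pitrConfig": {
            # stolon replaces %d with the data directory to restore into.
            "dataRestoreCommand": f"wal-g backup-fetch %d {backup_name}",
            "archiveRecoverySettings": {
                # %f and %p are PostgreSQL's placeholders for the WAL file
                # name and the path to copy it to.
                "restoreCommand": 'wal-g wal-fetch "%f" "%p"',
            },
        },
    }


def stolonctl_init_command(spec):
    """Build the argv for `stolonctl init <spec>` without executing it."""
    return ["stolonctl", "init", json.dumps(spec)]


if __name__ == "__main__":
    # Print a shell-quoted command line to review before running it by hand.
    print(shlex.join(stolonctl_init_command(build_pitr_spec())))
```

Note that stolonctl init re-initializes the cluster, so this is only appropriate on a fresh database that holds nothing you want to keep.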

Right, my issue appears to be resolved.

A note for anyone else who finds themselves with a similar problem and is using wal-g for backups:

The big caveat is to ensure you have a copy of the superuser password and also set it for flypgadmin after the restore, else you will not be able to use your restored database.
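As a rough illustration of the password step, the statement involved is a standard PostgreSQL ALTER ROLE. The sketch below only generates the SQL text so it can be pasted into a psql session on the restored database. The helper name and the example password are placeholders, and the quoting assumes PostgreSQL's default standard_conforming_strings setting; how the statement was actually run in this case isn't shown in the thread.

```python
def alter_password_sql(role, password):
    """Return an ALTER ROLE statement with the role and password quoted."""
    # Double any embedded quote characters, as SQL requires.
    quoted_role = '"' + role.replace('"', '""') + '"'
    quoted_password = "'" + password.replace("'", "''") + "'"
    return f"ALTER ROLE {quoted_role} WITH PASSWORD {quoted_password};"


if __name__ == "__main__":
    # Placeholder value: substitute the superuser password saved before the restore.
    print(alter_password_sql("flypgadmin", "example-password"))
```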