App suspended, can't re-start or re-deploy it

michaell · September 14, 2023, 2:54pm

An app that’s been running for months became suspended at ~11:01pm ET last night.

Restarting the app errors:

$ fly apps restart uptime-dazzit-com
Restarting machine 1781965b946689
Error: could not stop machine 1781965b946689: failed to restart VM 1781965b946689: internal: ...

Restarting the machine errors:

$ fly machine restart 1781965b946689
Restarting machine 1781965b946689
Error: failed to restart machine 1781965b946689: could not stop machine 1781965b946689: failed to restart VM 1781965b946689: internal: internal server error

Re-deploying the app errors:

$ fly deploy
==> Verifying app config
Validating /Users/.../uptime-dazzit-com/fly.toml
Platform: machines
✓ Configuration is valid
--> Verified app config
==> Building image
Searching for image 'louislam/uptime-kuma:1' remotely...
image found: img_y7nxpkxm8lnv8w25

Watch your deployment at https://fly.io/apps/uptime-dazzit-com/monitoring

Updating existing machines in 'uptime-dazzit-com' with rolling strategy
  [1/1] Waiting for 1781965b946689 [app] to have state: started
Error: timeout reached waiting for machine to started failed to wait for VM 1781965b946689 in started state: Get "https://api.machines.dev/v1/apps/uptime-dazzit-com/machines/1781965b946689/wait?instance_id=01HAA23BHFET68Z9S1CMCNY95G&state=started&timeout=60": net/http: request canceled
You can increase the timeout with the --wait-timeout flag

Monitoring shows, among other things:

2023-09-14T14:39:50.724 proxy[1781965b946689] yyz [error] machine is in a non-startable state: created
2023-09-14T14:41:08.182 proxy[1781965b946689] lga [error] could not find a good candidate within 90 attempts at load balancing

Mmmm… help?

michaell · September 14, 2023, 3:04pm

A little more info.

That app initially went down at ~22:54 EDT, came up at ~22:58 EDT, and went down again at ~23:01 EDT.

Another app that I have in the same region went down at ~23:04 EDT, came up at ~23:49 EDT, and stayed up.

By “down” I mean: didn’t respond to requests.

So whatever happened to the one app seems to be the result of something that happened in the region.

michaell · September 14, 2023, 10:01pm

Ping?

andie · September 15, 2023, 12:27pm

hi @michaell

Could you try deleting that Machine with fly m destroy <machine id> --force and then run fly deploy again? This shouldn’t normally happen, but the Machine may be stuck in a weird state. Let us know if that doesn’t work.

Unfortunately, apps might have downtime if they only have one Machine and the host reboots or has issues. You can run two Machines and use auto start and stop to make sure they only run when needed.
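
For reference, the destroy-and-redeploy suggestion above could be scripted roughly as follows. This is only a sketch: the app name and Machine ID are copied from the output earlier in the thread, while the run helper and the DRY_RUN variable are made up for illustration so that nothing is executed until you opt in. Listing the volumes first is an extra precaution, not part of the original suggestion.

```shell
#!/bin/sh
# Sketch only: prints each command unless DRY_RUN=0 is set.
# APP and MACHINE_ID are copied from the output earlier in this thread.
APP="uptime-dazzit-com"
MACHINE_ID="1781965b946689"

# Hypothetical helper: echo the command in dry-run mode, otherwise run it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

recover_stuck_machine() {
  # Note the existing volume IDs first so the old volume can be found later.
  run fly volumes list --app "$APP"
  # Force-destroy the stuck Machine.
  run fly machine destroy "$MACHINE_ID" --force --app "$APP"
  # Redeploy from the directory containing fly.toml to create a new Machine.
  run fly deploy
}

recover_stuck_machine
```

Run as-is, it only prints the three commands. Set DRY_RUN=0 once the IDs look right.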

michaell · September 15, 2023, 2:16pm

OK, I’ll look into that. But first I need to make sure that destroying a machine doesn’t destroy the attached volume.

(And the attached volume is why there’s only one machine.)

michaell · September 15, 2023, 2:29pm

Well, forcing the destroy worked. But the new machine didn’t use the existing volume, it created a new one. Now I’m trying to figure out how to attach an old volume to a new machine. Not obvious!

michaell · September 15, 2023, 2:37pm

OK, I cloned the new machine, attaching the old volume in the process, and then destroyed the machine that was cloned.

I’m back up.

Thank you!
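
The clone-and-swap step described here might look something like the sketch below. The exact commands used were not posted, so this is a guess at the shape of it: the function name, the run helper, and the DRY_RUN variable are hypothetical, and both IDs are placeholders you would take from fly machine list and fly volumes list. The --attach-volume flag also accepts an optional :/path suffix for the mount point, which is left out here.

```shell
#!/bin/sh
# Sketch only: prints each command unless DRY_RUN=0 is set.
# Run from the app directory (the one containing fly.toml).

# Hypothetical helper: echo the command in dry-run mode, otherwise run it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# $1: ID of the Machine that the deploy created (with the new, empty volume)
# $2: ID of the old volume that holds the data
swap_to_old_volume() {
  new_machine_id="$1"
  old_volume_id="$2"
  # Clone the new Machine, attaching the old volume to the clone.
  run fly machine clone "$new_machine_id" --attach-volume "$old_volume_id"
  # Destroy the Machine that was cloned, leaving only the clone.
  run fly machine destroy "$new_machine_id" --force
}

# Usage (placeholder IDs):
#   swap_to_old_volume <new machine id> <old volume id>
```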

andie · September 15, 2023, 2:43pm

Glad to hear it!
The new Machine should have picked up the “old” existing volume as long as it had the name specified in [mounts] source. But it’s possible it didn’t have time to sync up between being detached and the deploy… But in any case, the cloning was the next best thing. Happy that it worked!
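
One way to check the name match mentioned above before deploying is to print the source value from the [mounts] section of fly.toml and compare it by eye with the names shown by fly volumes list. The helper below is a rough sketch: mounts_source is a made-up name, and it assumes a simply formatted fly.toml with an unindented section header and a double-quoted value.

```shell
#!/bin/sh
# Print the value of `source` from the [mounts] section of a fly.toml file.
# Assumes the section header starts at column 0 and the value is double-quoted.
mounts_source() {
  awk '
    # Track whether the current section is [mounts] (or [[mounts]]).
    /^\[/ { in_mounts = ($0 ~ /^\[+mounts\]+/) }
    # Inside that section, strip everything except the quoted value.
    in_mounts && /^[[:space:]]*source[[:space:]]*=/ {
      gsub(/^[^=]*=[[:space:]]*"|".*$/, "")
      print
    }
  ' "${1:-fly.toml}"
}

# Usage:
#   mounts_source fly.toml
#   fly volumes list
```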

system · September 22, 2023, 2:43pm

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.
