SJC not starting instances after eviction

I have an app that seems to have been evicted from its previous physical host, but now it’s just stuck with any new deploy sitting in “pending”. Is the SJC data center healthy? Is the volume service there working right?

$ flyctl status --all --deployment
  Version  = 10
  Status   = pending
  Platform = nomad

Deployment Status
  ID          = a53e392a-b066-b42b-c699-404f6afbe130
  Version     = v10
  Status      = running
  Description = Deployment is running
  Instances   = 1 desired, 0 placed, 0 healthy, 0 unhealthy

cee3f768	app    	5      	sjc   	evict  	complete	1 total, 1 passing	0       	2022-08-15T00:39:08Z	

Hi @Tv1

In this instance, the host in the SJC region that your instance was running on hit its capacity limit, and that triggered the eviction of the instance.

Since your evicted VM had an attached volume, and volumes are bound to a single host, it couldn’t get scheduled anywhere else, which is why it got stuck in pending.
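
To illustrate that constraint, here is a toy sketch in Python. It is not our actual scheduler; the host names, the `free_slots` field, and the `place` function are all made up for the example. It only shows why a VM pinned to a full host has nowhere to go, while a VM without a volume can move.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Host:
    name: str
    free_slots: int  # hypothetical stand-in for remaining capacity


def place(hosts: list[Host], volume_host: Optional[str]) -> Optional[str]:
    """Return the name of a host that can take the VM, or None (stays pending)."""
    for host in hosts:
        # A VM with a volume may only run on the host where that volume lives.
        if volume_host is not None and host.name != volume_host:
            continue
        if host.free_slots > 0:
            return host.name
    return None


# Assumed situation: host-a is full, host-b has room.
hosts = [Host("host-a", free_slots=0), Host("host-b", free_slots=3)]

print(place(hosts, volume_host="host-a"))  # None: pinned to a full host
print(place(hosts, volume_host=None))      # host-b: free to move
```

With the volume on `host-a`, the only candidate host is the one that is out of capacity, so nothing gets placed and the deployment sits in pending.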

Evictions are a rare and random occurrence. We are looking into improvements and will continue to work on capacity planning across our hosts.

To withstand future hardware failures, we recommend deploying critical apps with multiple VMs.

That really sounds like you shouldn’t evict VMs with volumes outside of hardware failures, then…

We should not; it’s a bug. We may have a fix for apps on Nomad (we definitely have a fix for apps on machines, which will be ready any day now).
