Deploys to IAD are failing.

We have three instances: iad, syd, and lhr.
syd and lhr are working perfectly,
but iad is not:

ID      	PROCESS	VERSION	REGION	DESIRED	STATUS 	HEALTH CHECKS     	RESTARTS	CREATED
df3f59c4	app    	380 ⇡  	syd   	run    	running	1 total, 1 passing	0       	3m2s ago
60dbedf2	app    	380 ⇡  	iad   	run    	pending	                  	0       	4m1s ago
a408d337	app    	380 ⇡  	lhr   	run    	running	1 total, 1 passing	0       	4m1s ago

I have tried many times, but iad is still not working for me:

v377 is being deployed
52de2672: lhr pending
95a3872c: iad pending
52de2672: lhr pending
52de2672: lhr running unhealthy [health checks: 1 total]
52de2672: lhr running unhealthy [health checks: 1 total, 1 critical]
52de2672: lhr running healthy [health checks: 1 total, 1 passing]
15532f62: syd pending
15532f62: syd pending
15532f62: syd pending
15532f62: syd running unhealthy [health checks: 1 total, 1 critical]
15532f62: syd running healthy [health checks: 1 total, 1 passing]
--> v377 failed - Failed due to unhealthy allocations - rolling back to job version 376 and deploying as v378 

I ran fly restart -a brandkit, but the same thing happens:
iad stays in the pending state indefinitely.
Thoughts?

Hi @nicanorperera, sorry about the issues. Your allocation was stuck on the IAD worker; I’ve restarted it and it looks good now.

Update:

After restarting one more time, IAD finally worked for us.
But it took 20 minutes of waiting.

This is still a problem for me, though.
I can’t deploy reliably if it takes this long for the app to start in iad.
My write database is in iad, and I’m using fly_postgres (Elixir application).
For some heavy write operations we call the iad instance via RPC.
I need iad to be always on. If syd or lhr is down, it’s not a huge problem.
But iad is very important for us.
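
This does not address the root cause, but as a stopgap a deploy script could scan the status table for a stuck allocation in the region that matters. Below is a minimal Python sketch, assuming the table has the same column order as the output pasted above (ID, PROCESS, VERSION, REGION, DESIRED, STATUS, ...) and that columns are separated by tabs or runs of two or more spaces. The names `stuck_allocations` and `critical_regions` are made up for illustration and are not part of any Fly.io tooling.

```python
import re


def stuck_allocations(status_text, critical_regions=("iad",)):
    """Return (id, region, status) for allocations in a critical region
    whose STATUS column is anything other than "running".

    Assumes the column order shown in the pasted table:
    ID, PROCESS, VERSION, REGION, DESIRED, STATUS, ...
    """
    stuck = []
    for line in status_text.splitlines():
        # Columns are separated by tabs and/or padding spaces; single spaces
        # inside a value (e.g. "1 total, 1 passing") are left intact.
        cols = re.split(r"\s{2,}|\t", line.strip())
        # Skip blank lines, the header row, and anything too short to parse.
        if len(cols) < 6 or cols[0] == "ID":
            continue
        alloc_id, region, status = cols[0], cols[3], cols[5]
        if region in critical_regions and status != "running":
            stuck.append((alloc_id, region, status))
    return stuck
```

Feeding it the table from the first post would flag only the iad allocation, since syd and lhr are both running. How the table text is captured (and what to do when something is flagged) is left open here.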

Does anybody know why this happened, and whether there is anything I can do to prevent it from happening again? Failures like this have been happening to us very often lately.

Oh, thanks dusty. I just saw your answer.
Is that something I could have done myself?
If so, how?
Thanks in advance!