Outage caused apps to die, but fly never recovered them?

danwetherald · October 13, 2021, 7:10pm

We just noticed that a few apps randomly died because of Redis connection errors to another Redis app deployed on Fly, and they are still dead. Nothing brings them back?

Another app that is now dead and won’t restart was last seen logging:

Pulling image failed

This happened a few times with no prior errors, and then the app died and won’t recover.

How do we avoid this happening in the future?

danwetherald · October 13, 2021, 7:12pm

fly restart is completely unresponsive, how do we get these apps back up?

danwetherald · October 13, 2021, 7:36pm

The only way I have found in the past to unbrick dead apps is doing a fly deploy, but with a complicated CI/CD setup, that makes it hard to get apps back up ASAP.

kurt · October 13, 2021, 8:56pm

Were these apps + redis instances by chance? We are pretty good at rescheduling apps, but when redis needs to boot first it may not work properly. This is currently one of our biggest ongoing projects.

A fly secrets set is a simpler way to trigger the fly deploy process. fly restart just restarts VMs in place, so if there are none scheduled it won’t do anything.
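As a rough sketch of that workaround: setting a throwaway secret should create a new release without going through CI/CD. The app name my-app, the secret name REDEPLOY_NONCE, and the redeploy_via_secret helper are placeholders for illustration, not anything from this thread.

```shell
# Hypothetical helper: force a new release by setting a throwaway secret.
# Usage: redeploy_via_secret <app-name>
# Set DRY_RUN=1 to print the command instead of running it.
redeploy_via_secret() {
  app="$1"
  # The timestamp makes the secret value change on every call,
  # so each call should result in a new release.
  ${DRY_RUN:+echo} fly secrets set "REDEPLOY_NONCE=$(date +%s)" --app "$app"
}

# Example (prints the command only):
# DRY_RUN=1 redeploy_via_secret my-app
```

The secret itself is unused by the app; it only exists to give the platform a reason to schedule fresh VMs.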

There’s not much you can do about this right this second. It will improve, and last night’s outage was somewhat unique, so there’s a low chance of the same problem occurring again.

danwetherald · October 13, 2021, 8:59pm

Gotcha, this was 2 apps, one was connected to redis and started to get errors, then it died.

The other died with its last logs showing Pulling image failed repeatedly, and then never recovered.
