Post-mortem: FRA incident, January 2022

We learned a lot from the OVH outage and the resulting issues with our apps on FLY. The outage of the FRA region led to very high latencies for our primary user group, located in Germany, but it did not critically impact us. We simply disabled all LB hosts from FLY for 3 days.

Luckily, we have fallback hosts on AWS and GCP in 4 more regions. Sadly, we can’t rely entirely on FLY.

Can you please post a post-mortem about the issue and explain why it took so long to fix?