Fly down?

All valid points. Things shouldn’t break as often as they do. There isn’t much clarity at all sometimes (ex). And from observation, Fly seems to have a culture of releasing quickly and not really over-engineering stuff. I’ve called on them before to be more deliberate and to respect the scale of their operations, but it isn’t all that bad either.

…when the book hits the real world sometimes you find new failure modes, software has bugs, or humans find creative mistakes. It’s also very hard to build global scale systems with zero possibility of global failure. But every time a crack is found, you learn something and do what you can to eliminate the whole class of related failure modes.

That’s a comment from a Googler on GCP’s global outage, from a 2019 Hacker News thread: “Obviously not authorized to release more details than have already been made pub...”

Speaking from personal experience, I was on the team when DynamoDB (2015) and Elasticsearch Service (2018) went down nearly globally (it was just IAD for both, but IAD is also the “primary” region, which meant a lot of other unexpected things also happened)… CloudFront also faced its own share of terrible outages over the years, and the learnings from those were distilled into an internal-only talk at the time, which was so popular within AWS that it was eventually presented at re:Invent 2016: https://youtube.com/watch?v=n8qQGLJeUYA

This stuff isn’t simple to accomplish for a team as small as Fly’s. I mean, the CEO is still replying to customer support emails and forum posts.

I’m confident that once they staff up, things will improve considerably.
