Reliability: It's Not Great

Really appreciate this. Especially recognizing the disparity between the “haha let’s write a fun and sorta snarky blog post about distributed systems and how we know how to do them!!!” tone and things being totally or partially down. I don’t have a huge workload on Fly.io yet, but the reliability has been a pretty sore spot for me so far. Hard to think of any other service (compute or not) I’ve used that has had similar spottiness.

That being said, I think if y’all really focus everything on reliability and communication, the upside is still incredibly high. I would personally take an actually-reliable service over basically any other feature offering you’re thinking of delivering or working on right now. For example, the Postgres offering: hard to imagine considering it any more than I have already while the core service reliability seems to be pretty low. Just an example, but I think it would apply to any other feature. I’d ask “how can I trust XYZ thing if the basic ability to deliver / deploy a service is shaky/flaky?” Everything gets built on core trust + reliable systems. :hugs:

Last thing I’ll say: I would rather see actual reliability / a solid service over a blog post about it, every single time. Actions > words and all that. Ty all and best of luck! :smile: