We’ve simplified our public status page

We have simplified our public status page to better answer one specific question: “Hey, my app is being wonky, is it me or Fly?” Someday we’d like to confidently say “IT IS YOU”, but we are not there yet.

First, there is now a “Customer Applications” status at the very top, to better distinguish incidents that affect your already running machines from incidents that prevent making changes such as creating, deleting or modifying machines. While both are important, breaking the former is far worse than breaking the latter. We want to make sure that you won’t miss this specific type of incident.

Second, we’ve deprecated “Platform and Tools” and moved most of its sub-components (API, Deployments, Remote Builds, …) to the top level. This group has been a kitchen sink and has been accumulating cruft over the years. We even had “Upstash Redis” under here and Extensions somehow.

Last but not least, we have removed the elements that don’t answer the “is it me or Fly?” question, from the 90-day uptime display to the system metrics graphs. You can still see our past incidents in Incident History. The system metrics graphs are gone. In hindsight, we should be the ones watching these numbers, not you. They are not really actionable for you.

Sorry for the back and forth. Let me know if you have any questions!


Bad link? http://fly.io/?%E2%80%9D

Yes. Fixed.

Why doesn’t the Fly status page show the issues that folks, including myself, were/are seeing?

Can you be more specific? The personalized/account status page should show very granular information about stuff that’s broken or stuff that we’re doing maintenance on that affects your apps.

If you don’t see anything there, and you don’t see anything on the global status page either, then (a) everything is fine on our end, (b) something is broken that we have not detected yet, or (c) you’re hitting a bug.

There are a few issues reporting 500 errors, e.g.:
Can't deploy or increase RAM. Getting request returned non-2xx status, 500
fly machine * returns 500 error code without details

This person had issues 8 hours ago (like me): Elixir deployment :nxdomain

Last night, around 9 PM PT, my database connection failed due to a 500 error, specifically with Turso, which runs on Fly. Things are working for me now, so it’s either a Turso or a Fly issue (likely the latter).

Sorry for the inconvenience. In this specific case, it was (b) + (c): you were hitting a bug, one of our services was broken, and our detection mechanism had a gap.

The fix has been deployed. Let me know if you still have the issue.
