One of my apps in the AMS region was down today for 15 minutes according to external monitoring. Metrics and logs don’t show anything wrong. Both the .fly.dev domain and the main domain via Cloudflare returned 403.
Our infrastructure does not generate 403 errors, though. Based on that, plus the x-runtime and x-request-id headers, I’m going to say that the error came from somewhere else.
Turns out it was the Rack::Attack gem in the Rails app, which blocked all traffic for 5 minutes, 3 times in a row. It counted all requests as coming from a single IP (I assume it’s Fly’s proxy), which triggered its throttling rule.
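
For anyone hitting the same thing, here is a rough sketch of keying the throttle on the real client address instead of the proxy's. It assumes the app sits behind Fly's proxy and that the `Fly-Client-IP` request header (seen by Rack as `HTTP_FLY_CLIENT_IP`) carries the client IP. The helper name `client_ip_for`, the throttle name, and the limit/period values are made up for illustration and are not the ones from my app.

```ruby
# Hypothetical helper: pick the client IP out of a Rack env hash.
# Assumption: Fly's proxy sets Fly-Client-IP, which Rack exposes as
# HTTP_FLY_CLIENT_IP. Falls back to REMOTE_ADDR when the header is absent.
def client_ip_for(env)
  fly_ip = env["HTTP_FLY_CLIENT_IP"].to_s.strip
  return fly_ip unless fly_ip.empty?

  env["REMOTE_ADDR"]
end

# Only runs when the rack-attack gem is loaded, e.g. in a Rails initializer
# such as config/initializers/rack_attack.rb.
if defined?(Rack::Attack)
  # Placeholder values: 300 requests per 5 minutes per client IP.
  Rack::Attack.throttle("requests by client ip", limit: 300, period: 300) do |req|
    client_ip_for(req.env)
  end
end
```

With Cloudflare in front, `Fly-Client-IP` would probably be a Cloudflare address rather than the visitor's, so traffic on the main domain may need a different header as the key. I have not verified that part.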