All Upstash Redis instances failing to respond

We are runnning many instances of upstash redis and all of them have stopped responding within the last hour. Is someone seeing similar issues??

I’m getting the same issue, for a few hours now. Everything’s been working for months without issues. Looks like routing to redis is broken because the instances are otherwise up, but I’m getting I/O errors when connecting via proxy.

Per Upstash support looks like they’re investigating an issue in their FRA region. @nukeop assuming you’re in that region?

That’s right, but it suddenly started working again, so I think they already solved it.

redis is rejecting connections for us in SJC as well, ongoing since upstash’s FRA issue started

It seems there is an ongoing issue that’s still being resolved:

Seems the issue is back again since a few minutes in the FRA region…

The issue is definitely back in FRA region.
The majority of our requests to Redis Upstash seem to be failing.

I’ve had intermittent issues today too, but they only last a few minutes at a time.

Yeah, upstash is now reporting the issue on their status page:

Hey everyone, this is Ozan from Upstash.

Here is a quick update on the Upstash Redis issues yesterday and today:

Our servers have been experiencing network issues due to hanging logging calls inside the server code.

After further digging with the Fly team, the root cause turned out to be a bad interaction between a recent kernel upgrade and Cloud Hypervisor bug.

The Fly team now is rolling out a patch and meanwhile we have prepared a workaround to disable the problematic logs.

All looks good now.

Will publish an RCA on the status page as well.

Sorry for the inconvenience!

As a small side note… The official Fly.io postmortem for this is up now:

https://fly.io/infra-log/cloud-hypervisor-stdout/

(A rather extensive entry in the Infra Log, for this particular incident.)