Upstash Redis connection timeouts (still)

I’m still seeing periodic timeouts when connecting to Upstash Redis from my Sidekiq processes. This is a paid Upstash plan, and I do not see timeouts like this when running my own Redis in a Fly Machine.

Why is this happening? Is this something to escalate to Upstash, or is this something about running Upstash inside Fly?

We’ve also seen an increase in these timeouts with Sidekiq + Redis, also on a paid Upstash plan (and on a free instance for a staging environment).

Thanks for the reports. When reporting these issues, it’s helpful to know:

  • exact times and regions
  • whether you see timeouts (connection timed out), closures of long-running connections (connection resets) or both
  • if these issues are affecting your workloads
  • if you notice a pattern over time, or if these are isolated incidents that started again recently

For your paid plans, don’t hesitate to write in to support! They are better at tracking issues that might be related, but have not made it to our status page for one reason or another.

Now, in early September, we made a change that eliminates enforced idle timeouts for connections through our proxy. So timeouts should definitely be less frequent than before.

That said, there are a few general points worth mentioning about the Upstash Redis service.

Access to the service still runs through our proxy, and the service itself is multitenant, like our proxy. So deployments of either of these services will result in the closure of a long-lived connection, like those used by Sidekiq. This could occasionally lead to timeouts, though that’s less likely than a connection closure.
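Since a deployment closes long-lived connections, one way to absorb these blips on the client side is to retry the operation a few times with a short backoff. Here is a minimal sketch in plain Ruby, not a definitive implementation. `TransientConnectionError` and `with_reconnect_retry` are hypothetical names made up for illustration; in a real app you would pass whatever error classes your Redis client actually raises, and your client may already offer a built-in reconnect setting worth checking first.

```ruby
# Hypothetical stand-in for a transient connection error. In a real app this
# would be the error class raised by your Redis client.
class TransientConnectionError < StandardError; end

# Runs the block, retrying on the listed error classes.
# The block is attempted at most `attempts` times in total; the delay doubles
# after each failed attempt (base_delay, 2 * base_delay, 4 * base_delay, ...).
# The current attempt number (starting at 1) is yielded to the block.
def with_reconnect_retry(attempts: 3, base_delay: 0.05, errors: [TransientConnectionError])
  tries = 0
  begin
    tries += 1
    yield tries
  rescue *errors
    # Out of attempts: re-raise the last error so callers still see failures.
    raise if tries >= attempts

    sleep(base_delay * (2**(tries - 1)))
    retry
  end
end
```

Usage would look something like `with_reconnect_retry { some_redis_call }`, where `some_redis_call` is a placeholder for your own code. Only wrap operations that are safe to run more than once, since a timed-out command may still have been applied on the server.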

We can contrast this setup to running your own Redis, which runs inside your network, where our proxy isn’t involved, and where the software itself isn’t dependent on deployments of other services.

If workloads aren’t affected, one can consider these blips a trade-off for the overall reliability provided by managed services. For example, Upstash Redis replicates data across multiple physical hosts in the same region to protect you against physical server issues, routine maintenance, etc. It’s likely that at some point, Upstash has protected your databases from one of these scenarios.

I’m happy to discuss this more here!

Thanks for that explanation! Knowing that any deploy to the proxy or to the Fly Machine running Upstash will close existing connections, I think that is the most likely explanation for this.

Checking back through the exception history, I see exactly 5 reports (exactly matching the size of the entire Redis connection pool), at 2023-11-08 22:56:40 UTC and 2023-10-31 21:08:50 UTC, in the sjc region. I’m not sure whether it was a TCP close or a reset, since it was reported to me as a RedisClient::ReadTimeoutError exception in Ruby.

I’ll see if I can set up the exception reporter to ignore this kind of error unless it gets more severe.
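For anyone wanting to do the same, here is a rough sketch of the "ignore unless it gets more severe" idea, independent of any particular exception reporter. `BurstFilter` is a hypothetical helper, and the threshold of 5 (my pool size) and the 60-second window are assumptions to tune. It only says "report" once more errors than the threshold have arrived within the window, so a single deploy that resets every pooled connection stays quiet.

```ruby
# Hypothetical helper: decides whether an error is worth reporting based on
# how many occurrences were seen within a sliding time window.
class BurstFilter
  def initialize(threshold:, window_seconds:)
    @threshold = threshold
    @window_seconds = window_seconds
    @timestamps = []
    @lock = Mutex.new # Sidekiq runs jobs on multiple threads
  end

  # Records one occurrence at `now` and returns true only when the number of
  # occurrences inside the window is greater than the threshold.
  def report?(now = Time.now)
    @lock.synchronize do
      @timestamps << now
      # Drop occurrences that have fallen out of the window.
      @timestamps.reject! { |t| now - t > @window_seconds }
      @timestamps.size > @threshold
    end
  end
end
```

You would call `report?` from wherever your reporter lets you filter exceptions, and only for the timeout error class. Note that the counter lives in memory, so each Sidekiq process keeps its own count.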
