We’ve bumped the Redis idle timeout to a day (instead of one hour), which should help with debugging. If you’re still seeing timeouts within a single day after a deployment, do report back here.
Will need to check; perhaps the error is on the Rails side when sending a job. In either case, I’ve moved to self-hosting as it was causing too much of an issue.
After 24 hours of the ping being in our health check, we had no reconnections or problems. (The idle timeout looks to have been changed a few hours after; it was fine before then, though.)
I’ve opened an ioredis issue since there are a few things there:
I don’t know if there are any details that others could provide; please do if you can!
I tried a variety of things with my ioredis settings but continued to have reconnection problems. This seems to be a recurring theme in ioredis GitHub issues and various other discussions online. Interestingly, I also discovered that BullMQ (I’m using the older Bull) explicitly checks whether you are using Upstash and throws an error immediately if so, as it doesn’t support them. In short… it’s all a bit of a confusing muddle.
So… I have put a ping to my Redis connection into a health check call like thewikybarkid did. Since doing that it’s been OK, but it’s only been 24 hours, so fingers crossed still.
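
For anyone wanting to try the same thing, here is a rough sketch of what a ping-in-the-health-check can look like. It is not the exact code anyone in this thread is running: `checkRedis`, `Pingable`, `HealthResult` and the 1000 ms default are made-up names and values for illustration. The helper only needs an object with a `ping()` method that returns a promise, which an ioredis client should satisfy.

```typescript
// Minimal shape of the client this helper needs (hypothetical interface).
// An ioredis client exposes ping() returning a promise, so it should fit.
interface Pingable {
  ping(): Promise<string>;
}

// Hypothetical result type for a health check endpoint to report.
interface HealthResult {
  ok: boolean;
  detail: string;
}

// Sends PING and treats a missing reply within timeoutMs as a failure,
// so a stalled connection does not hang the health check itself.
async function checkRedis(
  client: Pingable,
  timeoutMs = 1000, // assumed default, tune for your setup
): Promise<HealthResult> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`no PONG within ${timeoutMs}ms`)),
      timeoutMs,
    );
  });
  try {
    const reply = await Promise.race([client.ping(), timeout]);
    return { ok: reply === "PONG", detail: reply };
  } catch (err) {
    return {
      ok: false,
      detail: err instanceof Error ? err.message : String(err),
    };
  } finally {
    // Always clear the timer so it cannot keep the process alive.
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

You would call `await checkRedis(redis)` from whatever handler your platform's health check hits and return a non-200 status when `ok` is false. Because the health check runs on an interval, the ping presumably also stops the connection from sitting idle, which would explain why this helps with the idle timeout.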