Last summer we trialed Fly.io with a prototype app. Everything went great except that, after the app had been idle for several weeks, it would reliably drop its PostgreSQL connection. The first visit after such an idle period produced an error; the connection then recovered after a few minutes, and everything was “fine” again.
Now, a year later, we’re considering using Fly.io in production and have deployed our beta app. Unfortunately, the PostgreSQL connection drops are back, and now they’re worse: they occur after just a few hours (update: minutes, actually – see below), and reloading the app in the browser does not reconnect the backend to the PostgreSQL instance. We just get constant “no connection to the server” errors until we restart the app by hand.
This is disconcerting.
I note that there have been some other recent threads about similar issues with PostgreSQL:
- DB Connection issues
- Postgres connection issues
- PostgreSQL "Connection is closed" error after a few minutes of activity
Two of these threads don’t have resolutions, and the one that possibly does (the last one in the list) indicates that perhaps it’s the responsibility of the app to use keepalives somehow? Is that the official response?
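For reference, here is a minimal sketch of what I understand the keepalive suggestion to mean, assuming the app connects through a libpq-based driver that accepts a connection URL. The parameters `keepalives`, `keepalives_idle`, `keepalives_interval`, and `keepalives_count` are standard libpq connection options; the helper name `with_keepalives`, the example hostname, and the specific values are hypothetical and untuned, not anything Fly.io has recommended.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Standard libpq TCP keepalive options. The values are illustrative
# guesses, not tuned or officially recommended settings.
KEEPALIVE_PARAMS = {
    "keepalives": "1",            # enable client-side TCP keepalives
    "keepalives_idle": "30",      # seconds of idle time before the first probe
    "keepalives_interval": "10",  # seconds between unanswered probes
    "keepalives_count": "3",      # unanswered probes before the connection is dropped
}


def with_keepalives(database_url: str) -> str:
    """Return database_url with keepalive options added to its query string.

    Options already present in the URL are left untouched.
    (Hypothetical helper, for illustration only.)
    """
    parts = urlsplit(database_url)
    query = dict(parse_qsl(parts.query))
    for key, value in KEEPALIVE_PARAMS.items():
        query.setdefault(key, value)
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # Hypothetical connection URL; substitute the real one.
    print(with_keepalives("postgres://app:secret@my-db.internal:5432/app?sslmode=disable"))
```

The resulting URL would then be passed to the driver in place of the original one. Whether this actually prevents the drops we are seeing is exactly what I am unsure about.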
In any case, I would appreciate it if someone from Fly.io could look into this. Our app works fine and stays up for days when run locally against a PostgreSQL instance running in Docker, so I’m pretty certain the issue is not with our app.
One question: at the moment, our app does not accept connections on IPv6, only on IPv4. Could that be the problem? I know that Fly.io uses a lot of IPv6 internally. On the other hand, the app does connect fine initially, so the lack of IPv6 doesn’t seem to be an issue just after the app has been restarted.