I am seeing the following error in the logs, even though I have made no changes to my Rails application:
ActiveRecord::DatabaseConnectionError (There is an issue connecting with your hostname: top2.nearest.of.<my-app-db>.internal. Please check your database configuration and ensure there is a valid connection to your database.):
Rails version: 7.1.3.4
Postgres version: (on Fly) 14.6
Both my Rails application and my database are running without issue on Fly, and the Rails application redeploys cleanly (passing health checks). I have recreated the DATABASE_URL secret for my Rails app by running fly postgres attach <my-app-db> --app <my-app> --database-name <my_app_database_name> --database-user <new_app_database_user>. That gave me a fresh database user, which can connect to the database without issue (verified by running fly postgres connect -a <my-app-db> -d <my_app_database_name> -u <new_app_database_user> -p <password-generated-by-fly-postgres-attach>). The app still redeploys successfully, but I get the same error as above.
No new logs, good or bad, were created on the DB side when I attempted to access it from the app side. However, I have since noticed that my application is up and running again without error, and without me having changed anything…
As of now my app is down again with the same error, again without me changing anything. While this was happening I checked the logs on the database app side, and no new logs are being created at all, despite the hostname errors on the application side.
I can’t see any issues on the status page https://status.flyio.net/ but it looks like this might be a networking issue of some kind…? If anyone knows a way to troubleshoot this, that would be great.
One thing to try is cloning a Machine from your Rails app into a different region. That would help determine how localized the problem really is. The ams region had a glitch once where servers were reachable from one direction but not another, for example.
You can also SSH into an existing Rails Machine and dig AAAA top2.nearest.of.<my-app-db>.internal from there.
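If dig is not installed on the Machine, roughly the same check can be run from a Rails console or plain Ruby. This is only a minimal sketch using Ruby's standard resolv library; the helper name aaaa_records is made up for illustration, and the hostname in the usage comment is the placeholder from the error above, not a real value.

```ruby
require "resolv"

# Look up the AAAA (IPv6) records for a hostname, similar in spirit to
# `dig AAAA <hostname>`. The `resolver` argument is injectable; by default
# a Resolv::DNS instance built from the system resolver config is used.
# `aaaa_records` is a hypothetical helper name, not part of any library.
def aaaa_records(hostname, resolver: Resolv::DNS.new)
  resolver
    .getresources(hostname, Resolv::DNS::Resource::IN::AAAA)
    .map { |record| record.address.to_s }
end

# Hypothetical usage; substitute the hostname from your DATABASE_URL:
#   addresses = aaaa_records("top2.nearest.of.<my-app-db>.internal")
#   puts addresses.empty? ? "no AAAA records returned" : addresses
```

An empty result would suggest the failure is at the DNS layer rather than in Postgres itself, which would also fit with nothing showing up in the database logs.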
Thanks for the help - I cloned into the AMS region and opened up a Rails console and, sure enough, I can access the DB just fine. Doing the same from my main app in the LHR region still fails, so this looks like an LHR-specific issue on the application side.
Have you looked at the logs of your database app? There are “drop user” as well as “password authentication failed” messages that may explain your issues.
Hi @gideonbrimleaf, do you have any insight into the underlying error that caused the ActiveRecord::DatabaseConnectionError? The message only says that there is a problem connecting, so it is unclear whether this is a DNS error or some kind of TCP connection error. Checking whether dig AAAA <hostname> works, as mentioned above, would also help.
@gideonbrimleaf, sorry again for the inconvenience. We recently deployed an update to our DNS stack, and your app seems to have run into a corner case: your main app and the database landed on the same physical host, and that triggered a bug in topN.nearest.of resolution. This has now been fixed, so you should no longer see this failing consistently.