Hi,
I’ve been running a small Django + Postgres app for almost a year without issue. A few days ago, without deploying a new revision or any changes from my end, Django started encountering errors when connecting to the database, which brought the whole application down:
  File "/usr/local/lib/python3.10/dist-packages/django/db/backends/postgresql/base.py", line 215, in get_new_connection
    connection = Database.connect(**conn_params)
  File "/usr/local/lib/python3.10/dist-packages/psycopg2/__init__.py", line 122, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
django.db.utils.OperationalError: could not translate host name "top1.nearest.of.checkin-db.internal" to address: Name or service not known
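Since the error is a name-resolution failure rather than a Postgres error, the lookup can probably be reproduced without Django at all. Below is a minimal sketch using only the standard library, meant to be run from any machine on the same private network. The helper name `resolve_host` and the port 5432 are my own choices, and the shorter `checkin-db.internal` name is an assumption based on the usual `<app>.internal` naming, not something taken from my config.

```python
import socket


def resolve_host(hostname, port=5432):
    """Return the sorted, de-duplicated addresses for hostname, or [] if the lookup fails."""
    try:
        # Django/psycopg2 ultimately rely on the same system resolver call.
        infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror as exc:
        # Same class of failure as "could not translate host name ... to address".
        print(f"lookup failed for {hostname!r}: {exc}")
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # First name is from the traceback; the second is an assumed alternative.
    for name in ("top1.nearest.of.checkin-db.internal", "checkin-db.internal"):
        print(name, "->", resolve_host(name))
```

If both names come back empty, that would suggest the problem is with internal DNS in general rather than with the `top1.nearest.of` prefix specifically.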
The checkin-db app in my account is ok (it was deployed with fly postgres). I can SSH into it and look at the tables just fine. I did notice that the user “Fly Admin Bot” set a new env var in the DB application a few days ago (hard to say if the timing matches): FLY_CONSUL_URL. Could this be related?
Weirdly enough, I’m now unable to SSH into the Django container:
fly ssh console
Connecting to fdaa:0:a484:a7b:7a:609b:5f49:2... complete
Error error connecting to SSH server: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
And in the Monitoring tab in the dashboard I see:
unexpected error: transient SSH server error: can't resolve _orgcert.internal
unexpected error: [ssh: no auth passed yet, transient SSH server error: can't resolve _orgcert.internal]