Can't deploy updates to prod app after scaling up DB

We’ve been running an app for a while.
We have other instances running the same container image.
This particular app teamspace-tww has refused and during startup the attempt to ensure database connectivity fails with {:error, "killed"} in Ecto (adapter storage_up for those who Elixir).

We recently scaled up the database and I expect this error has been there since. We have a working version of the app running on one node still since the deploys fail but we want to ship updates to the customer and are rather hindered by this mystery meat error.

This is still an issue. We are digging in but really get no clear details from any of the fly tooling on what the problem is.

All our other environments are shipping the same version without issue, only the one where we scaled up postgres has had this issue.

We’ve tried restarting, thankfully the old version came back on one node. And worked.

Now that node doesn’t seem to work right either. The app is down now :confused:

Figured it out.

Scaling up the database must have given us a new Postgres version. That one suggested to our app to use SCRAM during Auth. That triggered another code path in Postgrex where it tried to use crypto.hmac. This in turn crashed with an UndefinedFunctionError as we hadn’t updated postgrex along with Erlang 24.4.

Rotten luck. Quite fixable. We had to strip out the pre-launch db-create-and-migrate stuff we were doing as they didn’t log any errors for us.