Can't deploy updates to prod app after scaling up DB

lawik · September 26, 2022, 2:47pm

We’ve been running an app for a while.
We have other instances running the same container image.
This particular app, teamspace-tww, has refused to deploy: during startup, the attempt to ensure database connectivity fails with {:error, "killed"} in Ecto (the adapter's storage_up, for those who Elixir).

We recently scaled up the database and I suspect this error has been there since. Because the deploys fail, we still have a working version of the app running on one node, but we want to ship updates to the customer and are rather hindered by this mystery meat error.

lawik · September 29, 2022, 9:20am

This is still an issue. We are digging in, but none of the Fly tooling gives us clear details on what the problem is.

lawik · September 29, 2022, 9:27am

All our other environments are shipping the same version without issue; only the one where we scaled up Postgres has had this issue.

We’ve tried restarting. Thankfully the old version came back on one node and worked.

Now that node doesn’t seem to work right either. The app is down now.

lawik · September 29, 2022, 10:50am

Figured it out.

Scaling up the database must have given us a new Postgres version. That one asked our app to use SCRAM during auth. That triggered another code path in Postgrex where it tried to use :crypto.hmac. This in turn crashed with an UndefinedFunctionError, as we hadn’t updated Postgrex along with Erlang 24.4, and OTP 24 no longer ships that function.
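
For anyone hitting something similar, the failure comes down to which HMAC function the running OTP release exports. Here is a minimal sketch for checking that from a script or `iex`. It only probes `:crypto` directly and is not Postgrex's actual code; the key and data strings are placeholders.

```elixir
# Make sure the :crypto module is loaded before asking what it exports.
{:module, :crypto} = Code.ensure_loaded(:crypto)

# :crypto.hmac/3 is the old API (removed in OTP 24);
# :crypto.mac/4 is its replacement (available since OTP 22.1).
old_api = function_exported?(:crypto, :hmac, 3)
new_api = function_exported?(:crypto, :mac, 4)

IO.puts("OTP release:    #{System.otp_release()}")
IO.puts(":crypto.hmac/3: #{old_api}")
IO.puts(":crypto.mac/4:  #{new_api}")

# Compute an HMAC-SHA256 with whichever API exists. apply/3 avoids a
# direct call to a function that may be undefined on this OTP release.
hmac =
  if new_api do
    :crypto.mac(:hmac, :sha256, "key", "data")
  else
    apply(:crypto, :hmac, [:sha256, "key", "data"])
  end

IO.puts(Base.encode16(hmac, case: :lower))
```

If `:crypto.hmac/3` reports `false`, any dependency that still calls it will raise an UndefinedFunctionError the first time that code path runs, which here was the SCRAM handshake.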

Rotten luck. Quite fixable. We had to strip out the pre-launch db-create-and-migrate stuff we were doing, as it didn’t log any errors for us.
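
If you would rather keep a pre-launch step than remove it, one option is to report its result instead of discarding it. A minimal sketch follows, assuming the step is wrapped in a zero-arity function; `PreLaunch` and `run_step` are hypothetical names, not anything from Ecto or Fly.

```elixir
defmodule PreLaunch do
  # Hypothetical helper: runs one pre-launch step and reports its outcome
  # on stderr instead of silently dropping it.
  def run_step(name, fun) when is_function(fun, 0) do
    case fun.() do
      :ok ->
        IO.puts(:stderr, "[pre-launch] #{name}: ok")
        :ok

      {:error, :already_up} ->
        # Ecto's storage_up/1 returns this when the database already exists.
        IO.puts(:stderr, "[pre-launch] #{name}: already up")
        :ok

      {:error, reason} ->
        IO.puts(:stderr, "[pre-launch] #{name} failed: #{inspect(reason)}")
        {:error, reason}
    end
  rescue
    exception ->
      # Covers exceptions raised in the calling process, including an
      # unexpected return value (CaseClauseError).
      formatted = Exception.format(:error, exception, __STACKTRACE__)
      IO.puts(:stderr, "[pre-launch] #{name} raised: " <> formatted)
      {:error, exception}
  end
end

# Stand-in for the real call, replaying the error seen in this thread.
{:error, "killed"} = PreLaunch.run_step("storage_up", fn -> {:error, "killed"} end)
```

With a real repo the function would be something like `fn -> MyApp.Repo.__adapter__().storage_up(MyApp.Repo.config()) end`, where `MyApp.Repo` is a placeholder. As this thread shows, the returned reason can be as unhelpful as "killed", so this only tells you which step failed, not why.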
