Phoenix app with deployed Postgres DB OOMs with no writes / reads

Hi folks,

I created an Elixir / Phoenix app about a month ago that has a Postgres database attached, and I’ve just gotten an email informing me that the db application has crashed from running out of memory. The wrinkle is that I’m not doing any writing to or reading from the database at this point. I don’t know anything about what’s running the Postgres app, but my guess is that the memory leak is from the logging or metrics code, or possibly the leader election (there are no replicas for the free tier Postgres application).

A few questions / out-loud thoughts:

  1. Will the crash be cyclical, since the database app isn’t actually doing any work yet? (Let me stress that this is for a toy side project - having the Postgres app restart itself every 30 days is not a concern, just thinking out loud).
  2. Is there something wrong with the specific Elixir / Erlang pair I’m using? That might explain why it seemingly hasn’t happened to other developers. I didn’t see any posts about OOM errors occurring with a similar lack of actual querying.
  3. Is the instance useful for Fly engineers to poke at, if they want to figure out whether it’s their monitoring or leader / follower election code that is leaking memory?

I’m happy to leave the instance alone and work in a different repo if it’s helpful to have a copy of the Postgres app in the “broken state” (and this is my explicit blessing that it’s completely fine to poke at it). Then again, I can also imagine there are more exciting things to work on than a 6MB/day memory leak causing very occasional crashes for free tier users. Absolutely no worries either way.
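
For a rough sense of scale, here is a back-of-the-envelope sketch of the figures mentioned above. It assumes the leak grows linearly at about 6 MB/day and that the crash came roughly 30 days after the app was created. The function names are made up for illustration, and the result is only the headroom implied by those two numbers, not a measured value.

```python
def implied_headroom_mb(leak_mb_per_day, days_until_oom):
    """Memory the leak would have consumed before the OOM, in MB.

    Assumes a constant (linear) leak rate, which is a simplification.
    """
    return leak_mb_per_day * days_until_oom


def days_until_oom(headroom_mb, leak_mb_per_day):
    """Days before a linear leak exhausts the given headroom."""
    if leak_mb_per_day <= 0:
        raise ValueError("leak rate must be positive")
    return headroom_mb / leak_mb_per_day


# Figures from the post: ~6 MB/day, crash after roughly 30 days.
headroom = implied_headroom_mb(6, 30)
print(headroom)                     # 180
print(days_until_oom(headroom, 6))  # 30.0
```

If the leak rate stayed the same after a restart, the same arithmetic suggests the crash would recur on a similar cycle.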

The repo is here (which I will leave untouched for at least a week): GitHub - sgardn/unprompted: a collaborative writing game built with elixir

# .tool-versions
elixir 1.15.5-otp-26
erlang 26.0.2

Open to theories and suggestions for debugging / learning more, but as I’ve mentioned, the database isn’t doing any actual work, so I’m not entirely sure where to dig first. Thanks!

Hey @guycombinator! Thanks for providing all these details about your setup. The good news is, it’s nothing on your side that caused this :slight_smile:

Earlier today we shipped an update to a monitor process that runs alongside the VM. The update had a memory leak, which resulted in some apps going OOM. Postgres apps were impacted the most: because Postgres tends to use all available memory, it’s more vulnerable to another process taking up the excess.

We’ve rolled back the change so it should be fixed shortly.

Thank you for looking into it.

I am having the same issue. The free Postgres instance’s memory usage slowly increases until it gets OOM-killed.

Let me know what details I need to provide to make debugging easier for you. No hurry.