Postgres VM going OOM after some time of bulk inserting

andreyuhai · July 9, 2023, 12:24am

I have a Phoenix app that uses Oban for background processing. I insert Oban jobs into the DB in batches of 5K, but after a while of inserting batches the Postgres VM goes OOM.

I couldn’t figure out why. The VM started with 1 GB of RAM and I have since scaled it all the way up to 4 GB, but I still get OOMs after a while. Any idea why?

Example log line:

 2023-07-09T11:11:03.154 app[17811694bd5068] ams [info] [ 4398.099844] Out of memory: Killed process 31237 (postgres) total-vm:1285664kB, anon-rss:9044kB, file-rss:0kB, shmem-rss:854424kB, UID:999 pgtables:2376kB oom_score_adj:0
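
The kernel line above reports sizes in kB, which makes it hard to see where the memory sits. As a rough illustration only, the Python sketch below parses the fields of that exact line and converts them to MiB. The helper name `parse_oom_fields` is hypothetical and not part of any tool. It shows that nearly all of the killed process's resident memory is shared memory (`shmem-rss`), not private memory (`anon-rss`), which is what one would expect from a Postgres backend that has touched a large part of the shared buffer pool.

```python
import re

# The kernel message from the log line above (timestamp prefix omitted).
OOM_LINE = (
    "Out of memory: Killed process 31237 (postgres) total-vm:1285664kB, "
    "anon-rss:9044kB, file-rss:0kB, shmem-rss:854424kB, UID:999 "
    "pgtables:2376kB oom_score_adj:0"
)


def parse_oom_fields(line):
    """Return a dict mapping each 'name:<n>kB' field to its value in kB."""
    return {name: int(kb) for name, kb in re.findall(r"([\w-]+):(\d+)kB", line)}


fields = parse_oom_fields(OOM_LINE)

# Resident memory of the killed process is the sum of the three rss fields.
rss_kb = fields["anon-rss"] + fields["file-rss"] + fields["shmem-rss"]
shmem_share = fields["shmem-rss"] / rss_kb

for name, kb in fields.items():
    print(f"{name:>10}: {kb / 1024:8.1f} MiB")
print(f"shared memory share of RSS: {shmem_share:.1%}")
```

The private (`anon-rss`) part is only a few MiB, so this particular backend was probably not the one allocating heavily; the OOM killer picks a victim by score, not by who allocated last.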

show work_mem;
 work_mem
----------
 4MB
(1 row)

show shared_buffers ;
 shared_buffers
----------------
 1GB
(1 row)

show maintenance_work_mem ;
 maintenance_work_mem
----------------------
 64MB
(1 row)

Below you can see all the peaks after which it went OOM.

Health checks (not sure why there is a 500 Internal Server Error in the vm check):

pg is passing (2023-07-09 11:11:19)

[✓] connections: 73 used, 3 reserved, 300 max (7.4ms)
[✓] cluster-locks: No active locks detected (22.63µs)
[✓] disk-capacity: 52.7% - readonly mode will be enabled at 90.0% (11.45µs)

vm is critical (2023-07-09 11:44:44)

500 Internal Server Error
[✓] checkDisk: 8.9 GB (45.3%) free space on /data/ (810.48µs)
[✓] checkLoad: load averages: 0.04 0.13 0.57 (1.5ms)
[✓] memory: system spent 72ms of the last 60s waiting on memory (95.71µs)
[✗] cpu: system spent 1.5s of the last 10 seconds waiting on cpu (30.33µs)
[✗] io: system spent 1.99s of the last 10 seconds waiting on io (32.78µs)

role is passing (2023-07-09 11:11:19)

primary
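
For anyone reading along, the settings and the connection count reported above can be combined into a back-of-the-envelope memory budget. The Python sketch below is only an estimate under stated assumptions: it counts one `work_mem` allocation per connection (the `ops_per_connection` parameter is hypothetical, since `work_mem` is a per-operation limit and one query can use several multiples of it), and it ignores per-backend overhead and the OS page cache. `maintenance_work_mem` is left out because it applies to maintenance operations such as VACUUM and CREATE INDEX.

```python
MIB = 1024 * 1024

# Values copied from the SHOW output and the pg health check above.
shared_buffers = 1024 * MIB  # shared_buffers = 1GB
work_mem = 4 * MIB           # work_mem = 4MB
connections_used = 73        # "73 used" in the connections check
max_connections = 300        # "300 max" in the connections check
vm_ram = 4 * 1024 * MIB      # the VM after scaling up to 4 GB


def rough_budget(connections, ops_per_connection=1):
    """Rough estimate: shared_buffers plus work_mem for every connection.

    ops_per_connection is an assumption for illustration; real queries may
    run several sorts or hashes, each allowed up to work_mem.
    """
    return shared_buffers + connections * ops_per_connection * work_mem


print(f"shared_buffers alone: {shared_buffers / vm_ram:.0%} of a 4 GB VM")
for label, conns in (("used", connections_used), ("max", max_connections)):
    total = rough_budget(conns)
    print(f"{label:>4} ({conns} connections): {total / MIB:.0f} MiB, "
          f"{total / vm_ram:.0%} of RAM")
```

If this estimate stays well below the VM's RAM, the remaining memory is going somewhere the estimate does not cover, so the per-backend overhead of the open connections would be a reasonable next thing to measure.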

andreyuhai · July 9, 2023, 12:03pm

I also tried upgrading the primary VM to 4 CPUs, because I got a message like this:

Health check for your postgres vm has failed. Your instance has hit resource limits. Upgrading your instance / volume size or reducing your usage might help.
[✗] cpu: system spent 1.5s of the last 10 seconds waiting on cpu (30.33µs)
[✗] io: system spent 1.99s of the last 10 seconds waiting on io (32.78µs)

That said, I don’t understand what’s wrong with waiting on CPU or IO as long as the wait is acceptable, and I’m not sure what the IO message means.

Upgrading to 4 CPUs didn’t help either. I still can’t bulk insert in a loop; it eventually goes OOM.

