App stuck at pending state after an OOM error

I have a Redis instance (empty-smoke-2580) that was killed after an OOM error:

2021-07-30T14:38:32.642944691Z app[281eb1b6] cdg [info] [    3.884091] Out of memory: Killed process 526 (redis-server) total-vm:1234408kB, anon-rss:937892kB, file-rss:4kB, shmem-rss:0kB, UID:0 pgtables:2092kB oom_score_adj:0
2021-07-30T14:38:32.705837738Z app[281eb1b6] cdg [info] Killed
2021-07-30T14:38:32.911454420Z app[281eb1b6] cdg [info] Main child exited normally with code: 137
2021-07-30T14:38:32.912373574Z app[281eb1b6] cdg [info] Starting clean up.
2021-07-30T14:38:32.929419701Z app[281eb1b6] cdg [info] Umounting /dev/vdc from /data

I tried restarting the app and scaling it to 1 VM again, but it’s now stuck in the “pending” state.

Wow that’s a pain. Let me see what’s up.

It looks like it keeps OOMing on boot, when it tries to apply the AOF log.

526:M 30 Jul 2021 15:20:50.427 * RDB memory usage when created 562.40 Mb
526:M 30 Jul 2021 15:20:50.427 * RDB has an AOF tail
526:M 30 Jul 2021 15:20:52.547 * Reading the remaining AOF tail...
[    3.433183] Out of memory: Killed process 526 (redis-server) total-vm:1234408kB, anon-rss:937376kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:2100kB oom_score_adj:0

I temporarily bumped it to 2GB of memory and it started fine. If you want to keep it at 2GB, run fly scale vm shared-cpu-1x --memory 2048. If you want to drop it back down to 1GB, you’ll likely need to clear some data out first.
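
If it helps to see how close to the limit it was, here is a rough Python sketch that pulls the memory counters out of the kernel's OOM line above. The helper name parse_oom_line is made up for illustration, and the 1024 MiB figure assumes the VM had 1GB when it was killed (the logs don't state the size directly).

```python
import re

# OOM kill line copied from the boot log above.
OOM_LINE = (
    "[    3.433183] Out of memory: Killed process 526 (redis-server) "
    "total-vm:1234408kB, anon-rss:937376kB, file-rss:0kB, shmem-rss:0kB, "
    "UID:0 pgtables:2100kB oom_score_adj:0"
)

# Assumption: the VM was sized at 1GB when the kill happened.
ASSUMED_VM_MIB = 1024


def parse_oom_line(line):
    """Return pid, process name, and the kB counters from an OOM kill line."""
    proc = re.search(r"Killed process (\d+) \(([^)]+)\)", line)
    if proc is None:
        raise ValueError("not an OOM kill line")
    # Only fields that end in "kB" are collected (total-vm, anon-rss, ...).
    counters = {key: int(val) for key, val in re.findall(r"([\w-]+):(\d+)kB", line)}
    return {"pid": int(proc.group(1)), "name": proc.group(2), **counters}


info = parse_oom_line(OOM_LINE)
rss_mib = info["anon-rss"] / 1024
print(f"{info['name']} held about {rss_mib:.0f} MiB of anonymous memory")
print(f"that is {rss_mib / ASSUMED_VM_MIB:.0%} of an assumed {ASSUMED_VM_MIB} MiB VM")
```

The kernel and the rest of the VM need memory too, so resident memory that high on a 1GB machine leaves very little headroom while the AOF tail is being replayed.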
