I just experienced a very weird CPU spike that used up my entire spike balance and put the Machine into throttle mode.
I understand that I have a memory leak somewhere, which I’ve already been investigating. What I don’t understand is where the CPU spike is coming from.
My best guess right now is that a frequently accessed object coincidentally ended up in the swap portion of memory, suddenly increasing disk utilisation, which in turn caused the CPU spike.
My reasons are:
Swap had been in use since well before the CPU spike.
Response counts indicate that I did not experience a traffic spike.
I’m no expert in this, so I’m not sure whether it’s correct to assume a correlation between CPU utilisation and disk utilisation.
You want the si (“swap in”) and so (“swap out”) columns to be mostly zero.
(Those are measured in megabytes per second, due to --unit M, in this version; older versions may report differently.)
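If you would rather watch the same thing from a script, here is a rough Python sketch that reads the kernel's cumulative swap counters directly. It assumes a Linux system where /proc/vmstat exposes the pswpin and pswpout fields, which count pages rather than megabytes. The function names are made up for illustration.

```python
import time
from pathlib import Path


def parse_swap_counters(vmstat_text):
    """Return cumulative (pages swapped in, pages swapped out) from /proc/vmstat text."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    # Missing fields are treated as zero so the caller never sees a KeyError.
    return counters.get("pswpin", 0), counters.get("pswpout", 0)


def swap_rates(before, after, seconds):
    """Pages per second swapped in and out between two samples."""
    return ((after[0] - before[0]) / seconds, (after[1] - before[1]) / seconds)


def watch(samples=5, interval=1.0, path="/proc/vmstat"):
    """Print swap-in and swap-out rates a few times (Linux only)."""
    source = Path(path)
    prev = parse_swap_counters(source.read_text())
    for _ in range(samples):
        time.sleep(interval)
        cur = parse_swap_counters(source.read_text())
        si, so = swap_rates(prev, cur, interval)
        print(f"swap in: {si:.0f} pages/s, swap out: {so:.0f} pages/s")
        prev = cur
```

Calling watch() on the affected Machine should print mostly zeros when things are healthy. Sustained non-zero values mean the system is actively paging.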
I created a test program in Racket that constructed a 400MB array on a Machine with only 256MB RAM, and then iterated back and forth across it several times. The si and so numbers were up around 10MB/s for a couple of minutes straight.
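For anyone who wants to try something similar, here is a minimal Python sketch of the same idea. It is not the Racket program itself, just an approximation: allocate one large buffer, then touch one byte per page while walking forward and backward. The 4096-byte stride assumes a typical page size, and the sweep function name is hypothetical.

```python
def sweep(size_bytes, passes, stride=4096):
    """Allocate a buffer and touch one byte per stride, alternating direction.

    Returns the number of writes performed, which is handy for sanity checks.
    """
    buf = bytearray(size_bytes)  # zero-filled; pages become resident once written
    touched = 0
    for p in range(passes):
        indices = range(0, size_bytes, stride)
        if p % 2 == 1:
            indices = reversed(indices)  # walk backward on every other pass
        for i in indices:
            buf[i] = (buf[i] + 1) & 0xFF  # a write forces the page into RAM
            touched += 1
    return touched
```

Something like sweep(400 * 1024 * 1024, 6) would roughly match the sizes mentioned above. Only run that on a throwaway machine, since the whole point is to push it into swap.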
Having said that, I think the memory leak itself is really the highest priority. You mentioned in an earlier thread that you only have 512MB of swap in the first place, so the leak has consumed the entirety of that. That is nearly guaranteed to cause poor system performance—one way or another.
The question of what exact mechanism is causing this to manifest as CPU steal is genuinely interesting, but I suspect that the answer will have the following structure: several paragraphs of dense explanation of VirtIO ring buffers, swap daemons, custom cgroup throttling, overlay storage systems, noisy neighbors, Node.js garbage collection, and fiery muppet backgrounds—finally ending in the sentence, “Consequently, definitely fix that memory leak”…
The leak was on this particular version and has been fixed since.
I was wondering because, well, I expected an OOM crash after using up all the swap, which would have sent me an email and triggered an automatic restart. Instead, it kept running... slowly.