High Firecracker Load Average and unresponsive application process

nickolay.loshkarev · July 5, 2022, 11:48am

We have a Rails app with 2 processes: web and worker (sidekiq). The app is concordia-production-web. Sometimes the worker stops. By the time we discover this, enough time has passed that the worker’s logs have been pushed out of the log history by the web process’s logs, so we can’t figure out what’s going on. When I restart the worker process, it starts working again. I noticed an interesting metric:

Maybe you know what this could mean?

P.S. This has happened twice. I now realize that I could have tried to connect via ssh to see what was happening, but this instance (4636865b) is no longer running. I think that the first time this happened, I couldn’t connect and got an error.

jerome · July 5, 2022, 12:15pm

I took a look at our logs for the 4636865b instance and it seems like it just stopped logging completely for 25 hours.

Here they are in reverse chronological order (top is most recent):

nickolay.loshkarev · July 5, 2022, 12:42pm

Hm… that’s interesting… no errors… Could there be some kind of memory leak, or some other internal problem with our app? I’m trying to understand why the Firecracker Load Average metric was at its maximum… obviously these things are related.

nickolay.loshkarev · July 8, 2022, 10:53am

Hi @jerome

Could you please help me find a solution? It’s important because this is production.

jerome · July 8, 2022, 11:29am

Can you try to run fly ssh console to get into your instance while this is happening?

You can then run top and various other tools to find out what’s using so many resources.
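For anyone who wants more than top shows, here is a minimal sketch of one way to look at where the load is coming from once you are inside the VM. On Linux, the load average counts tasks that are runnable (state R) as well as tasks in uninterruptible sleep (state D, usually waiting on I/O), so the load can be high while CPU usage looks low. The script assumes Python 3 is available in your image, which may not be the case. The function names are made up for this example and are not part of any Fly.io tooling. It only scans processes, not their individual threads.

```python
import os


def parse_loadavg(text):
    """Parse the contents of /proc/loadavg into a dict."""
    fields = text.split()
    runnable, total = fields[3].split("/")
    return {
        "load1": float(fields[0]),
        "load5": float(fields[1]),
        "load15": float(fields[2]),
        "runnable": int(runnable),
        "total": int(total),
    }


def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split around the first '(' and the last ')'.
    left = text.index("(")
    right = text.rindex(")")
    pid = int(text[:left].strip())
    comm = text[left + 1:right]
    state = text[right + 1:].split()[0]
    return pid, comm, state


def busy_tasks(proc_root="/proc"):
    """List processes in state R (runnable) or D (uninterruptible sleep)."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # the process exited while we were reading it
        if state in ("R", "D"):
            found.append((pid, comm, state))
    return found


def main():
    if not os.path.exists("/proc/loadavg"):
        print("No /proc/loadavg here; this sketch only works on Linux.")
        return
    with open("/proc/loadavg") as f:
        print(parse_loadavg(f.read()))
    for pid, comm, state in busy_tasks():
        print(pid, state, comm)


if __name__ == "__main__":
    main()
```

If most of the listed processes are in state D rather than R, the load is more likely coming from blocked I/O than from CPU work, which would point the investigation in a different direction than a busy loop or a memory leak.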

mineshp · November 8, 2024, 8:11am

@nickolay.loshkarev, have you found a solution? I’m facing the same issue.

@Team, any guidance would be greatly appreciated!

I’m running Rust, and this issue has been occurring for the last couple of days.

nickolay.loshkarev · November 8, 2024, 2:10pm

Unfortunately, I haven’t. We added a task to restart the app

mineshp · November 11, 2024, 5:25am

Aah
