I just “tuned” a PHP app. I don’t know if my approach was right or wrong, but maybe it’s helpful.
Ultimately, what you’re trying to figure out is how much concurrency the app can handle without blowing out any key resources — CPU, disk, memory, network.
I set the `hard_limit` low, and put the app under sustained load, meaning at least two to three minutes.
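For reference, these limits live in fly.toml under `[services.concurrency]` (at least they do if this is a Fly.io app like mine; adjust for your setup). A sketch of the "start low" configuration, with the port and values purely illustrative:

```toml
# fly.toml -- only the concurrency section matters for this exercise.
[[services]]
  internal_port = 8080        # hypothetical; whatever your app listens on
  protocol = "tcp"

  [services.concurrency]
    type = "connections"      # count concurrent connections (vs. "requests")
    soft_limit = 5            # deliberately low starting point
    hard_limit = 10
```

Any load generator that can hold concurrency steady for a few minutes will do the "sustained" part; for example, with hey (wrk, ab, or k6 work just as well):

```sh
# Hold 30 concurrent connections open for a full 3 minutes.
# The URL is a placeholder; point it at an uncached endpoint.
hey -z 3m -c 30 https://my-app.example.com/uncached-page
```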
In my case, disk, memory and network clearly weren’t bottlenecks, so I focused on CPU.
I noted the max CPU utilization under load, and if it was under 100% I made a guess about how much more work the app might be able to do. I increased the `hard_limit` and ran the test again.
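To see what the CPU is doing during a run, a shell on the machine plus vmstat is enough (assuming Fly.io again, and assuming vmstat is present in your image; top or the platform's own metrics dashboard work too):

```sh
# Open a shell on the running machine, then sample CPU once a second.
# In vmstat's output, "us" + "sy" is busy CPU and "id" is idle, so
# utilization is roughly 100 - id.
fly ssh console
vmstat 1
```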
Ultimately, what I wanted was for CPU utilization to be as high as possible without hitting 100%, because at 100% the app would obviously start to fall further and further behind.
Indeed, the “sustained” part of “sustained load” is very important. What I kept seeing was low CPU utilization initially, followed some time later by a ramp-up as the app started to fall behind. If the concurrency wasn’t set too high, it would eventually stabilize again. If it was set too high… Destruction. Carnage. Chaos.
On a 2-core shared CPU, my `soft_limit` is 10 and my `hard_limit` is 20. That gives me ~185 RPS under sustained loads of up to 30 simultaneous connections. With 30 connections the average response time is ~170ms, with a large standard deviation.
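As a sanity check, those numbers hang together via Little's Law (requests in flight ≈ throughput × average latency): 185 req/s × 0.17 s ≈ 31, which lines up with the ~30 connections the load generator was holding open.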
I should point out that the URL I was testing here purposely avoided any caching; I was trying to get a feel for something like “worst case.” Hard to say how much caching would improve the results. But it’s still a PHP app, not Go or Rust; there’s a limit to how fast it can get!