VM Service Concurrency: what do these numbers mean?

Hey all, I’m seeing some numbers in my metrics panel which are a bit of a concern:

For the past day this one server (44441044) seems to be showing a consistently high figure.

For the previous 24hr period I’m seeing more distributed numbers:
[Screenshot: Screen Shot 2021-10-07 at 16.21.23]

Should I be concerned that one server seems to be doing all the work based on the current numbers? My response rate seems to be pretty consistent (barring issues) over the last few weeks.

Would be great to get a bit of an explanation of this graph in the docs.

Hey there!

Your concurrency settings are 1000 (soft limit) and 4000 (hard limit). Our proxy only knows which of three bands your app’s load falls into: 0-1000, 1000-4000, or exactly 4000.

This means that when concurrency is under 1000 for all instances, we’ll send each request to the closest instance (or, if there is more than one instance in the same region, we’ll randomly balance between them).

In your second screenshot, the load is better distributed because we know there are instances that should take some of the load, even if they’re further away.
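To make the banding concrete, here is a minimal sketch of the behaviour described above. It is not the proxy's actual code: the `Instance` type, the `distance` score, and the exact tie-breaking are assumptions made for illustration. It only encodes the three load bands, the preference for instances under the soft limit, and the closest-instance rule with random balancing between equally close instances.

```python
import random
from dataclasses import dataclass

SOFT_LIMIT = 1000  # soft limit from this thread
HARD_LIMIT = 4000  # hard limit from this thread


@dataclass
class Instance:
    name: str
    distance: int  # hypothetical closeness score; lower means closer
    load: int      # current concurrency on this instance


def band(load, soft=SOFT_LIMIT, hard=HARD_LIMIT):
    """Return 0 below the soft limit, 1 between the limits, 2 at the hard limit."""
    if load >= hard:
        return 2
    if load >= soft:
        return 1
    return 0


def pick_instance(instances, soft=SOFT_LIMIT, hard=HARD_LIMIT, rng=random):
    """Pick the closest instance in the least-loaded band (assumed policy)."""
    # Instances at the hard limit take no more traffic.
    candidates = [i for i in instances if band(i.load, soft, hard) < 2]
    if not candidates:
        return None
    # Prefer instances under the soft limit, even if they are further away.
    best = min(band(i.load, soft, hard) for i in candidates)
    in_band = [i for i in candidates if band(i.load, soft, hard) == best]
    # Among those, take the closest; balance randomly between ties.
    nearest = min(i.distance for i in in_band)
    return rng.choice([i for i in in_band if i.distance == nearest])
```

With an AMS instance at load 800 and a further-away instance at load 100, this sketch keeps sending to AMS. Once AMS reports 1384, or if `soft` is lowered to 500, the further instance is chosen instead.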

In your first screenshot, 1384 is indeed over 1000, but that’s a peak, and sometimes the load numbers aren’t distributed fast enough for the proxy to react. But you can see that right after the peak there’s a lower concurrency value and another instance has taken more load (the lighter green one).
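As a rough illustration of why a peak can overshoot the soft limit, here is a toy simulation. It assumes a made-up reporting interval and arrival rate, and that no requests finish during the run. None of these numbers come from the real proxy; the point is only that routing on a stale load figure lets the actual load climb past the limit before the proxy reacts.

```python
def simulate_peak(soft_limit=1000, arrivals_per_tick=100, report_every=4, ticks=20):
    """Return the highest actual load seen when the proxy routes on stale reports."""
    actual = 0    # real concurrency on the instance
    reported = 0  # last value the proxy has heard about
    peak = 0
    for tick in range(ticks):
        # The load figure only reaches the proxy every `report_every` ticks.
        if tick % report_every == 0:
            reported = actual
        # The proxy keeps routing here while the *reported* load is under the limit.
        if reported < soft_limit:
            actual += arrivals_per_tick
        peak = max(peak, actual)
    return peak
```

With reports on every tick the load stops at the soft limit; with reports every fourth tick the same traffic briefly exceeds it.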

You could tweak the soft and hard limits if you want us to balance between instances when traffic is lower. This might mean we’ll balance to further-away regions.

We did make some changes to how we measure the closeness of instances, and that may be why load is not well distributed between instances in the same datacenter. I can look into that. It’s also possible the same client is hitting the same server over and over again, which would make an instance on that server more likely to get the load.

Ah, ok. I’ll try tweaking the setting a little to see whether I can introduce a better balance.

As an FYI then @jerome, the server that is being hit there is in AMS, but according to the traffic report in my September bill, AMS isn’t a primary region:

I deployed a few changes that should help with load balancing between instances in the same region.

Re billing: I’m not entirely sure what’s up :thinking:, we’re looking into it.
