Concurrency without autoscale

I have a question about the service.concurrency parameters and how they behave when autoscaling is disabled. As far as I understand, those limits are used by autoscaling, and when the hard limit is reached, incoming requests are queued. I’m not using autoscaling because, according to your documentation, it seems to have been deprecated in favour of manual count scaling (where I decide how many machines, where, and how big). If there’s no autoscaling, why are requests queued when that limit is reached? Should I remove that section from my configuration, and if I do, what happens? If I can’t remove that section, what are the best values?


Autoscaling will still work fine! We shouldn’t have marked it deprecated in the docs.

The soft limit controls load balancing: under load, we try to distribute requests so that every VM stays at or near its soft limit.

The hard limit still takes effect, with or without autoscaling: once a VM reaches it, additional requests are queued. If you never want requests to queue, set it to a very high value.
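
As a rough sketch of what that could look like in the concurrency section of fly.toml: the port and both limit values below are illustrative assumptions, not recommendations, so pick numbers based on what a single VM in your app can actually handle.

```toml
[[services]]
  internal_port = 8080        # assumed port; use whatever your app listens on
  protocol = "tcp"

  [services.concurrency]
    type = "connections"      # count concurrent connections per VM
    soft_limit = 20           # illustrative: load balancing tries to keep each VM at or near this
    hard_limit = 10000        # illustrative "very high" value so requests are effectively never queued
```

With a large gap like this, the soft limit still shapes how traffic is spread across VMs, while the hard limit is high enough that it should rarely, if ever, be reached.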

Thanks for the info, @kurt. I think I’ll do some experiments with those values then. Thanks again!