Load balancer documentation

Is there any documentation on how the load balancer works? Let’s say I have multiple instances running in the same region. How does the load balancer determine which instance is hit?

We don’t have any docs on load balancing, I don’t think. It’s relatively simple. Here’s the priority list:

  1. Send the request to the nearest VM that is under the soft limit set in the app config
    a. If there are multiple options, take the two least loaded and pick one at random
  2. If all VMs are over the soft limit, send the request to a VM that’s under the hard limit set in the app config

There’s extra logic in there to retry different backends, but that’s mostly transparent.
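
To make the priority list concrete, here is a minimal Python sketch of the selection step. It is an illustration under stated assumptions, not the actual proxy code: the `VM` class, its `distance` field, and the `pick_vm` function are hypothetical names, "nearest" is modeled as a simple integer, and the retry logic is left out. The thread also does not say how a VM is chosen among several that are under the hard limit, so the nearest-first fallback below is a guess.

```python
import random
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    distance: int    # hypothetical: lower means closer to the client
    load: int        # current number of connections or requests
    soft_limit: int  # soft limit from the app config
    hard_limit: int  # hard limit from the app config


def pick_vm(vms, rng=random):
    """Return the VM that should receive the next request, or None."""
    # Step 1: consider only VMs under their soft limit.
    under_soft = [vm for vm in vms if vm.load < vm.soft_limit]
    if under_soft:
        nearest = min(vm.distance for vm in under_soft)
        candidates = [vm for vm in under_soft if vm.distance == nearest]
        # Step 1a: take the two least loaded and pick one at random.
        two_least_loaded = sorted(candidates, key=lambda vm: vm.load)[:2]
        return rng.choice(two_least_loaded)

    # Step 2: every VM is over its soft limit, so fall back to the hard limit.
    under_hard = [vm for vm in vms if vm.load < vm.hard_limit]
    if under_hard:
        # Assumption: prefer the nearest, then the least loaded.
        return min(under_hard, key=lambda vm: (vm.distance, vm.load))

    # No VM has capacity left.
    return None
```

Picking at random between the two least loaded VMs, rather than always the single least loaded one, is a common way to avoid every proxy sending traffic to the same instance at once.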

Thanks, this is the information I was looking for! One more question: how does the load balancer pick the two least loaded? Is it using load averages, CPU, or something else?

It’s using the number of connections / requests as a metric for least loaded. Eventually we’d like to be able to load balance based on other metrics.

Is there a difference for services.concurrency: type = "connections"?