Load balancing within a region

Our load balancing strategy boils down to: send each request to the closest of the least-loaded instances. If several instances tie on both load and closeness, we pick one of them at random.

Load is determined by your concurrency limits and by how many connections an instance is currently serving. With the default limits (soft: 20, hard: 25), an instance falls into one of these brackets depending on how many connections it has established:

  • 0 - 20 - “under soft limit” - we can send a request there
  • 20 - 25 - “over soft limit” - we’ll only send a request there if no other instance is under its soft limit
  • 25 - “reached hard limit” - we’ll never send a request there
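
As a rough sketch of how these brackets could be computed (not our actual proxy code), here is a small Python function. The names `LoadBracket` and `load_bracket` are made up for illustration. The list above leaves open which bracket exactly 20 connections falls into, so the sketch assumes an instance counts as “over soft limit” once it has reached 20.

```python
from enum import IntEnum


class LoadBracket(IntEnum):
    """Hypothetical names for the three brackets; lower is better."""
    UNDER_SOFT = 0
    OVER_SOFT = 1
    AT_HARD = 2


def load_bracket(connections: int, soft_limit: int = 20, hard_limit: int = 25) -> LoadBracket:
    """Classify an instance by its number of established connections.

    Assumption: reaching a limit exactly counts as being at/over it.
    """
    if connections >= hard_limit:
        return LoadBracket.AT_HARD
    if connections >= soft_limit:
        return LoadBracket.OVER_SOFT
    return LoadBracket.UNDER_SOFT


print(load_bracket(3).name)   # UNDER_SOFT
print(load_bracket(22).name)  # OVER_SOFT
print(load_bracket(25).name)  # AT_HARD
```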

If your app is not too busy, it’s likely all your instances fall in the “under soft limit” bracket and they’re all good candidates for a request.

Closeness is determined by RTT (round-trip time) between our edge node and the worker node where your instance runs. Even within the same region, we use different datacenters with different RTTs. These RTTs are measured constantly between all servers.
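
Putting load and closeness together, the selection could look roughly like the Python sketch below. It is an illustration under assumptions, not the real implementation: `Instance` and `pick_instance` are hypothetical names, load is compared by bracket rather than by exact connection count, and an instance with exactly 20 connections is treated as over its soft limit.

```python
import random
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    connections: int  # currently established connections
    rtt_ms: float     # measured RTT from the edge node to the worker


def pick_instance(instances, soft_limit=20, hard_limit=25, rng=random):
    """Pick the closest instance within the lowest load bracket.

    Returns None when every instance has reached its hard limit.
    """
    def sort_key(inst):
        over_soft = inst.connections >= soft_limit  # False sorts before True
        return (over_soft, inst.rtt_ms)

    # Instances at the hard limit are never candidates.
    candidates = [i for i in instances if i.connections < hard_limit]
    if not candidates:
        return None

    best = min(sort_key(i) for i in candidates)
    tied = [i for i in candidates if sort_key(i) == best]
    return rng.choice(tied)  # random pick among equally good instances


instances = [
    Instance("a", connections=3, rtt_ms=1.2),
    Instance("b", connections=0, rtt_ms=3.5),
    Instance("c", connections=22, rtt_ms=0.4),
]
print(pick_instance(instances).name)  # a
```

Note that `c` has the best RTT but is over its soft limit, so it loses to `a`, and `a` beats `b` purely on RTT even though `b` is serving fewer connections.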

Looking at your specific app, I see it spans multiple datacenters in DFW. One of them has a slightly better ping than the others from the vantage point of the edge I’m testing from. This means the instance there will be chosen every time unless it reaches a higher load bracket.

This behaviour might sound odd, but it’s the fastest way to get to your app. An instance under its soft limit should be as responsive as any other instance, so if it’s also closer, even by only 2ms, we might as well send the request there.

The way to change this behaviour is to adjust your concurrency limits. If you lower your soft limit and widen the gap between the soft and hard limits, your instances are more likely to fall in the “over soft limit” bracket. When that happens, instances which have not yet reached their soft limit are prioritized, and requests get balanced between close-by servers.
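
To make the effect of the soft limit concrete, here is a toy Python simulation. It rests on simplifying assumptions: connections are long-lived and never close, there are two instances with made-up RTTs, reaching a limit exactly counts as being at it, and the helper name `distribute` is hypothetical. It only illustrates the bracket logic described above and is not a model of real traffic.

```python
def distribute(n_connections, rtts_ms, soft_limit, hard_limit):
    """Assign connections one by one; return connection counts per instance."""
    counts = {name: 0 for name in rtts_ms}
    for _ in range(n_connections):
        # Instances at the hard limit can't take more connections.
        available = [n for n in counts if counts[n] < hard_limit]
        if not available:
            break
        # Prefer instances under the soft limit, then the lowest RTT.
        chosen = min(available, key=lambda n: (counts[n] >= soft_limit, rtts_ms[n]))
        counts[chosen] += 1
    return counts


rtts = {"near": 1.0, "far": 3.0}  # made-up RTTs in milliseconds

# Default limits: the closer instance takes everything.
print(distribute(8, rtts, soft_limit=20, hard_limit=25))  # {'near': 8, 'far': 0}

# Lower soft limit: "near" goes over its soft limit after 5 connections,
# so the remaining connections go to "far".
print(distribute(8, rtts, soft_limit=5, hard_limit=25))   # {'near': 5, 'far': 3}
```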
