Just ported a small application today and I’m trying to tune it so that the users are always hitting a fast server. This may be a tweener edge case, but I was wondering if anyone has dealt with this.
Background
- there are between 70 and 130 active connections at any time.
- the minimum memory required is 512MB.
- the soft limit for a 512MB server is 45 connections, but if the server were sized to 1GB, it would be 100+.
- the current autoscaling is set to balanced min=3 max=8
- the current active regions, with their current connection counts, are sea(20), mia(45), and ams(20).
The issue is that across the 3 servers we are getting 45, 20, and 20 connections. When a single server hits the soft limit, further requests are routed to the other servers, which adds roughly 100ms to each request. And there will be no autoscaling until all 3 servers hit 45 concurrent connections.
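To make the numbers concrete, here is a rough Python sketch of the behavior as I understand it. The scaling rule (scale out only once every server is at the soft limit) is my reading of what I'm seeing, not something I've confirmed, and the per-region `demand` figures are made up for illustration. Only the observed 45/20/20 counts are real.

```python
SOFT_LIMIT = 45  # soft limit for a 512MB server, from the background above


def overflow_by_region(demand, soft_limit):
    """Connections per region above the local soft limit, i.e. the ones that get re-routed."""
    return {region: max(0, n - soft_limit) for region, n in demand.items()}


def fleet_wide_trigger(load, soft_limit):
    """True only when every server is at or above the soft limit (assumed scaling rule)."""
    return all(n >= soft_limit for n in load.values())


# Hypothetical local demand before any re-routing (assumption, not measured).
demand = {"sea": 15, "mia": 60, "ams": 10}
# Connection counts actually observed per server.
observed = {"sea": 20, "mia": 45, "ams": 20}

overflow = overflow_by_region(demand, SOFT_LIMIT)
scales_out = fleet_wide_trigger(observed, SOFT_LIMIT)
fleet_threshold = len(observed) * SOFT_LIMIT

print(overflow)         # connections re-routed out of each region
print(scales_out)       # whether the assumed rule would add a server
print(fleet_threshold)  # total connections needed before all 3 servers are at the limit
```

If the rule really works this way, the fleet-wide threshold is 3 × 45 = 135 connections, which is above our 130-connection peak, so the app would never scale out while mia keeps overflowing.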
I’m not really clear what to do in this case. The options I see are:
- Lower the soft limit. Connections above the soft limit would still be routed to other servers, but at least it would autoscale into the best region sooner.
- Set a higher minimum count for all regions. When the soft limit is hit, requests would be re-routed to a closer server.
- Increase the memory to 1GB, which is overprovisioned, but at least there would be no re-routing. However, there would then be no autoscaling ever.
- Is there a hidden option to autoscale when a single server hits the soft limit, rather than only when the entire set of servers hits it?
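A rough way to compare the soft-limit options is sketched below in Python. It assumes scaling only happens once all servers reach the soft limit, uses a made-up per-region demand, and treats 30 as an arbitrary "lowered" limit and 100 as the low end of the 1GB "100+" figure. All of these are assumptions for illustration.

```python
def evaluate(soft_limit, demand, servers=3):
    """Return (connections re-routed, total connections needed before the fleet scales out)."""
    rerouted = sum(max(0, n - soft_limit) for n in demand.values())
    scale_threshold = servers * soft_limit
    return rerouted, scale_threshold


# Hypothetical local demand per region before re-routing (assumption, not measured).
demand = {"sea": 15, "mia": 60, "ams": 10}

options = [
    ("512MB, lowered soft limit", 30),   # 30 is an arbitrary example value
    ("512MB, current soft limit", 45),
    ("1GB", 100),                        # low end of the "100+" estimate
]

for label, limit in options:
    rerouted, threshold = evaluate(limit, demand)
    print(f"{label}: soft limit {limit}, re-routed {rerouted}, scales out at {threshold} total")
```

Under these assumptions, lowering the soft limit brings the scale-out point inside our 70 to 130 range at the cost of more re-routing in the meantime, while 1GB removes re-routing but pushes the scale-out point far beyond our peak.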
Anyone have any recommendations?