Concurrent connections and scaling of servers

Trying to figure out how best to configure the http_service.concurrency setting.

Right now I have type = "connections" (although I am changing this to "requests", since these are web servers), with a soft limit of 100 and a hard limit of 200.

Here is the config:

```toml
[http_service]
  internal_port = 4000
  force_https = true
  auto_stop_machines = true
  auto_start_machines = true
  min_machines_running = 2
  processes = ["app"]

  [http_service.concurrency]
    type = "requests"
    hard_limit = 200
    soft_limit = 100
```

So I would assume that if my app stays consistently below 100 concurrent requests, it would eventually scale down to 2 machines. I have 6 total, but instead I am consistently seeing ~4 running. When I look at the concurrency graph, I never see any one machine go over 100. Shouldn't that mean my app would scale down to 2 machines and chill there until I start hitting the limits?

Thanks!

For reference, this post from @merlin last year got a good answer about the "scaling up" behavior of hard_limit vs. soft_limit.

I’m afraid I didn’t find an equally lucid description of the “autoscale down” algorithm.
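For what it's worth, here is the rough mental model I've been working from. To be clear, this is my own assumption about how soft_limit and hard_limit could drive per-machine routing decisions, not Fly's documented algorithm:

```python
# A rough mental model (my own assumption, NOT Fly's documented algorithm)
# of how soft_limit and hard_limit might drive per-machine decisions.

def scale_decision(in_flight, soft_limit=100, hard_limit=200):
    """Suggest an action for one machine given its concurrent request count."""
    if in_flight >= hard_limit:
        return "refuse"      # at hard_limit: stop routing new requests here
    if in_flight >= soft_limit:
        return "scale-up"    # over soft_limit: prefer another (or a new) machine
    return "stop-candidate"  # under soft_limit: eligible to be scaled down

# If every machine sits under the soft limit, you'd expect the pool to
# drain toward min_machines_running over time:
loads = [40, 65, 80, 30]
print([scale_decision(n) for n in loads])
```

Under that model, all four machines above would be stop candidates, which is exactly why the observed ~4 running machines is surprising.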


@w8emv ah, so it looks like it has to do with long-lived connections like WebSockets; those may not show up in the graph as clearly. That is my best guess based on the description in the response you linked.
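If that guess is right, the mechanism would look something like this: a machine only becomes eligible for auto-stop once it has zero in-flight work, so a single long-lived WebSocket pins its machine up even though the concurrency graph looks quiet. A tiny sketch of that assumption (hypothetical function and machine names, not Fly's actual proxy code):

```python
# Sketch of an ASSUMED mechanism: auto_stop_machines only stops a machine
# once it is fully idle, so one open WebSocket keeps it running indefinitely.

def can_auto_stop(in_flight):
    """A machine is a stop candidate only when it has no in-flight work."""
    return in_flight == 0

# m3 and m4 each hold an open socket, so they never become stoppable,
# even though all four machines are far below soft_limit = 100:
machines = {"m1": 0, "m2": 0, "m3": 1, "m4": 2}
stoppable = [name for name, load in machines.items() if can_auto_stop(load)]
print(stoppable)
```

That would explain seeing ~4 machines running while every machine's concurrency stays well under the soft limit.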
