Concurrency hard limit not respected?

zakthompson · May 27, 2025, 3:15am

Hi there, everyone! We’re experiencing some weird behaviour around load-balancing/concurrency that I’m hoping somebody has some insight about.

We noticed that our Rails app was occasionally throttled for exceeding its CPU balance, causing long queue/response times. We run several machines, but when this happens, most of the traffic seems to hit a single machine, and only that machine struggles under the load.

To mitigate, we configured autoscaling via starting/stopping machines, with explicit soft and hard limits:

[http_service]
  processes = ["web"] # this service only applies to the web process
  http_checks = []
  internal_port = 8080
  protocol = "tcp"
  script_checks = []
  force_https = true
  auto_stop_machines = "stop"
  auto_start_machines = true
  min_machines_running = 5
  soft_limit = 10
  hard_limit = 15

Yet this hasn’t mitigated the issue at all - we’ll still often see a single machine experiencing high concurrency while other machines sit idle:

Furthermore, as can be seen above, it never seems to flag the machine as having hit the hard limit.

I’ve theorized for a while that these spikes may all be requests coming from a single client, as some load balancers will prioritize sending requests from a single client to a single machine. But I was under the impression that the hard limit should prevent even this case - that if concurrency hits the hard limit, the load balancer should prevent any more traffic from being routed to that machine.

It’s clear that I’m misunderstanding something. Can anybody explain why our hard limit doesn’t seem to be respected, and perhaps suggest how we might mitigate our problem?
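To make the expectation described above concrete, here is a toy model of how soft and hard limits are usually described: machines under the soft limit are preferred, machines between the two limits are a fallback, and machines at the hard limit receive nothing. This is only a sketch of that mental model. The function name `pick_machine` and the "least loaded first" tie-breaking are assumptions made for illustration, not Fly Proxy's actual routing algorithm.

```python
from typing import Optional


def pick_machine(loads: dict[str, int], soft_limit: int, hard_limit: int) -> Optional[str]:
    """Toy routing decision (hypothetical, NOT Fly Proxy's real logic).

    loads maps a machine name to its current concurrency.
    Returns the chosen machine name, or None if every machine is at
    or above the hard limit.
    """
    # Machines at or above the hard limit are never eligible.
    eligible = {name: load for name, load in loads.items() if load < hard_limit}
    if not eligible:
        return None

    # Prefer machines still under the soft limit; otherwise fall back
    # to any machine that is under the hard limit.
    preferred = {name: load for name, load in eligible.items() if load < soft_limit}
    pool = preferred or eligible

    # Assumption: pick the least-loaded machine in the chosen pool.
    return min(pool, key=pool.get)


if __name__ == "__main__":
    # One saturated machine, one idle machine: traffic should go to "b".
    print(pick_machine({"a": 15, "b": 3}, soft_limit=10, hard_limit=15))   # b
    # Everything at the hard limit: nothing is eligible.
    print(pick_machine({"a": 15, "b": 15}, soft_limit=10, hard_limit=15))  # None
```

Under this model, a machine sitting at 15 concurrent requests with `hard_limit = 15` would never be picked while an idle machine exists, which is why the observed behaviour looked wrong.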

mayailurus · May 27, 2025, 4:33am

zakthompson:

[http_service]
  processes = ["web"]
  http_checks = []
  internal_port = 8080
  protocol = "tcp"
  script_checks = []
  force_https = true
  auto_stop_machines = "stop"
  auto_start_machines = true
  min_machines_running = 5
  soft_limit = 10
  hard_limit = 15

Hi… This might just have been a copy-and-paste glitch in the above, but the concurrency limits need to go under their own sub-section rather than directly under [http_service]:

[http_service.concurrency]  # ←
  soft_limit = 10
  hard_limit = 15

(I wish flyctl itself would warn about these.)

Effectively, you had an ∞ hard limit before…

Hope this helps!

zakthompson · May 27, 2025, 4:48am

Well, I do feel foolish! I was setting this up from the autostart/stop docs, which I don’t think made it very obvious where these settings should go. I should have dug deeper into the specifics! I’ll update this and see if it solves the issue.

I am still fairly curious about the load-balancing behaviour and how we end up in this situation in the first place, though. I find it interesting that we do still trigger the default soft limit (shown in the graph above) and yet so much traffic gets routed to the one machine. Is my suspicion about it all being traffic coming from a single client the likely culprit?

mayailurus · May 27, 2025, 6:10am

Hm… That wouldn’t be my own guess; as far as I know, the Fly Proxy has no “stickiness” mechanism like that.

If you still see a big imbalance within a single region after changing fly.toml, try posting the multi-instance graphs (i.e., all Machines on the same chart), along with a detailed list of which regions you are using and how many Machines you have in each one.

system · June 3, 2025, 6:11am

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.
