http_service.concurrency is not working

I have this fly.toml:

app = "unkey-api"
primary_region = "iad"

[build]
  dockerfile = "./Dockerfile"

[http_service]
  internal_port = 8080
  force_https = true
  auto_stop_machines = true
  auto_start_machines = true
  min_machines_running = 0

[http_service.concurrency]
  type = "requests"
  hard_limit = 1000
  soft_limit = 800


However, in practice only about 25 concurrent requests (the default, I believe) seem to reach a machine before traffic gets routed to other machines.

According to the docs (App Configuration (fly.toml) · Fly Docs), I believe I configured it correctly.

Am I doing something wrong?

Hi @chronark

I think what you might be seeing is just load balancing, which takes concurrency settings into account, but also the current load and closeness.

Our load balancing strategy is:

  • Send traffic to the least loaded, closest instance
  • If multiple instances have identical load and closeness, randomly choose one
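
The two rules above can be sketched roughly as follows. This is only an illustration, not Fly's actual proxy code: the `Instance` type, the `distance_ms` field, and the `pick_instance` function are hypothetical names, and two details are assumptions that the thread does not spell out: load is compared before closeness, and instances at or above their soft_limit are used only when nothing else is available.

```python
import random
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    load: int            # current concurrent requests or connections
    distance_ms: float   # hypothetical closeness measure, lower is closer
    soft_limit: int


def pick_instance(instances, rng=random):
    """Pick the least loaded, closest instance; choose randomly on ties."""
    if not instances:
        raise ValueError("no instances to choose from")
    # Assumption: prefer instances that are still under their soft_limit.
    under = [i for i in instances if i.load < i.soft_limit]
    pool = under or instances
    # Assumption: load is compared first, closeness second.
    best = min((i.load, i.distance_ms) for i in pool)
    ties = [i for i in pool if (i.load, i.distance_ms) == best]
    return rng.choice(ties)
```

Under these assumptions a nearby machine stops receiving traffic early if its effective soft_limit is low, which would look like requests "spilling" to other regions.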

You can read more about this in the docs:

Hmm, but I don’t necessarily want to load balance to other regions, because my machine in region X can handle far more traffic and I care a lot about latency.

Also, what is the point of setting a soft_limit if it gets ignored?

According to the docs you linked: "Traffic will be sent to instance when it is closest instance that is under soft_limit"

My machine is way below the configured limit, and Fly is still routing traffic away from it. I’m running tests from a single source close to Frankfurt and would expect almost all traffic to be served by the machine in Frankfurt, because I set soft_limit to 800.
In reality, though, there seems to be a hard cap at 25 requests.


Have you already deployed your app with this config? Looking at the config stored on our backend, it looks slightly different:

[http_service]
auto_start_machines = true
auto_stop_machines = true
force_https = true
internal_port = 8_080
min_machines_running = 0

[services.concurrency]
hard_limit = 1_000
soft_limit = 800
type = "requests"

The [services.concurrency] part will be ignored because it is not related to http_service, so the app uses the default concurrency settings: type = "connections", soft_limit = 20, hard_limit = 25.


Yes, I had that originally, then changed it to use http_service.concurrency and have since deployed a bunch of times via fly --config=./apps/api/fly.toml deploy --strategy immediate

Does that not update the configuration?

fly --config=./apps/api/fly.toml config show
{
  "app": "unkey-api",
  "primary_region": "iad",
  "build": {
    "dockerfile": "./Dockerfile"
  },
  "deploy": {
    "strategy": "immediate"
  },
  "http_service": {
    "internal_port": 8080,
    "force_https": true,
    "auto_stop_machines": true,
    "auto_start_machines": true,
    "min_machines_running": 0,
    "concurrency": {
      "type": "requests",
      "hard_limit": 1000,
      "soft_limit": 800
    }
  },
  "checks": {
    "name_of_your_http_check": {
      "port": 8080,
      "type": "http",
      "interval": "15s",
      "timeout": "10s",
      "grace_period": "30s",
      "method": "get",
      "path": "/v1/liveness"
    }
  }
}

Running this also seems to check out, right?

It does. And I now see the updated config. Are you still experiencing the issue with 25 requests hard limit?

Hmm yeah now it works :smiley:
I can’t rule out that maybe I didn’t save the fly.toml before, so it never got updated correctly.

Thanks for walking through it with me.

