Issue with Autoscaling Based on Request Count in Fly.io

brunocs90 · October 16, 2024, 12:42pm

Hi everyone,

I’m having trouble setting up autoscaling for my Fly.io app based on the number of requests. I’ve configured my fly.toml as follows:

[http_service]
  internal_port = 8080
  force_https = true
  auto_stop_machines = true
  auto_start_machines = true
  min_machines_running = 0
  max_machines_running = 2
  processes = ['app']

[processes]
  app = 'uvicorn index:app --host 0.0.0.0 --port 8080'

[[services]]
  internal_port = 8080
  protocol = 'tcp'
  processes = ['app']

  [services.concurrency]
    type = "requests"
    hard_limit = 3
    soft_limit = 1

  [[services.ports]]
    handlers = ['http']
    port = 80

  [[services.ports]]
    handlers = ['tls', 'http']
    port = 443

[[vm]]
  memory = '4gb'
  cpu_kind = 'shared'
  cpus = 4

Issue:

I’ve set the concurrency type to “requests” with a soft limit of 1 and a hard limit of 3, expecting that Fly.io would automatically scale up machines when the number of requests exceeds these limits. However, my app is reaching high CPU usage (almost 100%) under load, but no additional machines are being launched.

I also set max_machines_running = 2, but it seems like new instances aren’t being spun up when the app hits the soft or hard limits.

What I’ve Tried:

Setting min_machines_running to 0 and max_machines_running to 2.
Adjusting the concurrency limits.
Monitoring the app’s performance using flyctl monitor and checking logs.

Question:

Is there something I’m missing in my configuration? How can I make sure my app scales up as the number of requests increases, so that the CPU doesn’t reach such high usage before new machines are started?

Below is the test I ran, hoping the scaling would work.

Any help or suggestions would be greatly appreciated. Thank you!

mayailurus · October 16, 2024, 3:25pm

Hi… Yes, although it’s not obvious at first. You have overlapping definitions of port 8080, in both the [http_service] and [[services]] blocks, and this tends to confuse the Fly.io infrastructure.

Try removing the [[services]] parts and putting your concurrency definitions under [http_service] instead.

Hope this helps a little!
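If it helps to catch this kind of overlap before deploying, here is a minimal sketch in Python. It assumes the fly.toml has already been parsed into a plain dict; the function name overlapping_service_indexes is made up for this example and is not part of flyctl or any Fly.io tooling. It only checks the one condition described above (a [[services]] entry reusing the internal_port of [http_service]) and says nothing about how the Fly proxy resolves the conflict.

```python
def overlapping_service_indexes(config):
    """Return indexes of [[services]] entries that reuse http_service's internal_port.

    `config` is a dict shaped like a parsed fly.toml (hypothetical helper,
    not an official Fly.io check).
    """
    http = config.get("http_service") or {}
    port = http.get("internal_port")
    if port is None:
        return []
    return [
        i
        for i, svc in enumerate(config.get("services", []))
        if svc.get("internal_port") == port
    ]


# Trimmed-down version of the original fly.toml from this thread, as a dict.
original = {
    "http_service": {"internal_port": 8080, "processes": ["app"]},
    "services": [
        {
            "internal_port": 8080,
            "protocol": "tcp",
            "concurrency": {"type": "requests", "soft_limit": 1, "hard_limit": 3},
        }
    ],
}

print(overlapping_service_indexes(original))  # prints [0]
```

On Python 3.11 or newer, the dict could come from the standard library's tomllib.load, which expects the file opened in binary mode.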

khuezy · October 16, 2024, 3:31pm

I don’t think this exists? I don’t see it in the docs. As far as I know, Fly won’t create a new machine for you. You have to run fly scale count N first, so that N machines exist (that is the max number of machines per region). Then it’ll scale up from min_machines_running to N.
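To make the point concrete, here is a toy model in Python. It is a deliberately simplified sketch, not Fly's actual proxy logic: the Machine class, the route_request function, and the exact routing order are all assumptions made for illustration. The only ideas taken from this thread are that the pool size is fixed by however many machines already exist (for example after fly scale count N), and that stopped machines in that pool get started once the running ones are at their soft limit.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    running: bool = False
    in_flight: int = 0  # requests currently being handled


def route_request(pool, soft_limit, hard_limit):
    """Pick a machine for one new request; return its index, or None if none can take it.

    Toy model only. The pool is never grown here, mirroring the idea that
    autostart wakes existing machines but does not create new ones.
    """
    # 1. Prefer a running machine that is still under its soft limit.
    for i, m in enumerate(pool):
        if m.running and m.in_flight < soft_limit:
            m.in_flight += 1
            return i
    # 2. Otherwise start a stopped machine from the existing pool, if any.
    for i, m in enumerate(pool):
        if not m.running:
            m.running = True
            m.in_flight = 1
            return i
    # 3. Otherwise use the least-loaded machine that is under its hard limit.
    candidates = [i for i, m in enumerate(pool) if m.in_flight < hard_limit]
    if not candidates:
        return None
    i = min(candidates, key=lambda idx: pool[idx].in_flight)
    pool[i].in_flight += 1
    return i


# One existing machine: everything lands on it until the hard limit is hit.
single = [Machine()]
print([route_request(single, 1, 3) for _ in range(4)])

# Two existing machines: the second one is started once the first hits its soft limit.
pair = [Machine(), Machine()]
print([route_request(pair, 1, 3) for _ in range(7)])
```

With a single machine in the pool, the model has nowhere else to send traffic, which matches the symptom in the original post: load piles up on one machine and nothing new appears.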

brunocs90 · October 20, 2024, 6:53pm

Thank you, that was exactly the problem!

This way it works perfectly.

[http_service]
  internal_port = 8080
  force_https = true
  auto_stop_machines = true
  auto_start_machines = true
  min_machines_running = 0
  processes = ['app']
  [http_service.concurrency]
    type = "requests"
    soft_limit = 1
    hard_limit = 3

system · October 27, 2024, 6:54pm

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.
