Why doesn't the services.concurrency clause autoscale up when used with connections?

Just to complement @roadmr’s answer…

There’s a newer, heavier-weight mechanism that you can deploy to scale based on more general concepts of load, but you would need to define and report a “current number of subprocesses” metric yourself, etc.
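As a rough illustration of what reporting such a metric could look like, here is a minimal sketch assuming the metric is exposed in Prometheus format from the same FastAPI app via `prometheus_client`. The metric name `running_subprocesses` and the `/metrics` path are just placeholders, not anything the scaler requires:

```python
# Hypothetical sketch: exposing a "current number of subprocesses" gauge
# that a metrics-based autoscaler could poll. Names are illustrative.
from fastapi import FastAPI, Response
from prometheus_client import Gauge, generate_latest, CONTENT_TYPE_LATEST

app = FastAPI()

# Gauge your dispatcher updates whenever it starts or reaps a subprocess.
running_subprocesses = Gauge(
    "running_subprocesses",
    "Number of dispatcher subprocesses currently running",
)

@app.get("/metrics")
def metrics() -> Response:
    # Serve Prometheus-format metrics for the scaler to scrape.
    return Response(generate_latest(), media_type=CONTENT_TYPE_LATEST)
```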

Since you already have a dispatcher in place, I think it would be easier to just modify your FastAPI app to keep track of the number of running subprocesses internally and then respond with `Fly-Replay: elsewhere=true` when that count breaches the desired threshold…
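Something along these lines, as a sketch only: the threshold, the status code, and the bare counter are all assumptions, and you'd wire the counter to however your dispatcher actually tracks its children.

```python
# Hypothetical sketch: bounce requests to another Machine via Fly-Replay
# when this one is already running too many subprocesses.
from fastapi import FastAPI, Request, Response

app = FastAPI()

MAX_SUBPROCESSES = 4       # desired per-Machine threshold (assumption)
current_subprocesses = 0   # incremented/decremented by your dispatcher

@app.middleware("http")
async def replay_when_busy(request: Request, call_next):
    # At capacity: ask the Fly proxy to replay this request elsewhere
    # instead of handling it on this Machine.
    if current_subprocesses >= MAX_SUBPROCESSES:
        return Response(
            status_code=409,
            headers={"fly-replay": "elsewhere=true"},
        )
    return await call_next(request)
```

The proxy sees the `fly-replay` header on the response and re-routes the original request to a different Machine, so the client never notices the bounce.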