Autoscaling auto_stop_machines not working

Hey,

I have the following configuration for my v2 app:

[[services]]
  internal_port = 8080
  processes = ["app"]
  protocol = "tcp"
  script_checks = []

  auto_stop_machines = true
  auto_start_machines = true
  min_machines_running = 2

  [services.concurrency]
    hard_limit = 4000
    soft_limit = 2000
    type = "requests"

I created 9 machines with flyctl scale count 9 --region iad,lhr,syd,dfw,cdg,fra --max-per-region 2. Even though the number of requests is well under the limit (one region at ~50 req/s and the others below 20 req/s), none of the machines are ever stopped. So far the app always runs the maximum number of machines.

Is there a problem with my concurrency configuration?
The documentation is unclear on whether the limits need to be set in requests per minute or requests per second. I have set them as req/min for now, but that has made no difference.


Your configuration looks correct. It appears that after stopping your apps, they immediately start again (you can see that happening in your logs). This is preventing them from staying stopped.

We’ll look into whether it’s due to changes on our side.

Thanks for looking into it.
Could you also confirm if the request limit is in req/min or req/sec?

Hi @Wayrunner

The concurrency limits refer to the number of simultaneous requests, so it’s not per minute or second. It’s the total number of requests at any given moment.

Edit to add: so for example, if you know your app (with its current CPU and memory) can handle a maximum of 100 simultaneous connections, then that would be your hard_limit. You’d set your soft_limit slightly lower so that the proxy can start sending requests to another Machine before that max load is reached. The concurrency settings are used to auto start and stop, but they’re also used in load balancing.
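To illustrate the difference between a request rate and a concurrency limit, here is a rough sketch using Little's law (average in-flight requests ≈ arrival rate × average response time). The 0.2 s latency is purely an assumed figure for illustration and the function name is hypothetical; measure your own app's response time before drawing conclusions.

```python
def estimated_concurrency(requests_per_second: float, avg_latency_seconds: float) -> float:
    """Little's law: average in-flight requests = arrival rate * average time per request."""
    return requests_per_second * avg_latency_seconds


# Limits taken from the [services.concurrency] block in the original post.
SOFT_LIMIT = 2000
HARD_LIMIT = 4000

# ASSUMPTION: average response time of 200 ms. Not measured, illustration only.
ASSUMED_LATENCY_S = 0.2

# Request rates quoted in the original post.
busiest_region = estimated_concurrency(50, ASSUMED_LATENCY_S)
quiet_region = estimated_concurrency(20, ASSUMED_LATENCY_S)

print(f"busiest region: ~{busiest_region:.0f} simultaneous requests")
print(f"quiet region:   ~{quiet_region:.0f} simultaneous requests")
print(f"below soft_limit of {SOFT_LIMIT}: {busiest_region < SOFT_LIMIT}")
```

Under that assumed latency, even the busiest region would sit at around ten simultaneous requests, far below a soft_limit of 2000, so the limits themselves would not explain Machines staying up.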


Hey @senyo, did you find out anything else?

I’m still looking into it. We’ve caught one bug that causes the error you’re seeing, and a fix is in progress. I’m not entirely sure it will resolve your problem, however. We have other hypotheses for what may be causing it, which we still need to test. So we’re still working on it, and hopefully we’ll have an answer for you soon!



That bug has been fixed.

@Wayrunner are you still having this issue?

We appear to have this exact same issue: as soon as a machine is downscaled, it immediately starts again, preventing any real downscaling.

Looks like a regression; I created a new post with some details: Machine auto stop/start flapping. Our issue does self-heal after some hours, though.