Autoscale doesn't seem to work with hard_limit = 1 and soft_limit = 1

matthewrobertbell · April 23, 2021, 8:20pm

Hi,

My web service only handles one request at a time, by design.

In my config I have:

[services.concurrency]
  hard_limit = 1
  soft_limit = 1
  type = "connections"

I have run flyctl autoscale standard min=1 max=10, but whenever I send multiple requests to the service, autoscaling doesn’t seem to happen, and requests beyond the ones handled by the initial VM time out. What do I need to do differently?

Edit:

I also see “lhr” listed under both the region pool and the backup regions when running flyctl regions list. Could this be a bug?

kurt · April 25, 2021, 12:08am

Ah, this is a quirk of how we do autoscaling. It’s metrics based, so there’s a bit of lag when we add nodes, which means a limit of 1 won’t spin a new VM up in time.

If you give me your app name we can opt you in to new autoscaling that’s quite a bit quicker to respond. It will still take ~15s to boot a new VM but you might have fewer timeouts.

matthewrobertbell · April 25, 2021, 7:23am

Sounds good, I have sent you a message with the name, thanks!

kurt · April 25, 2021, 3:20pm

Ok you should be all set! It’ll at least scale quicker now.

matthewrobertbell · April 25, 2021, 4:52pm

I tried initiating 8 requests using curl from my laptop: 2 were processed by the initial VM, 6 timed out, and no new VMs were launched (judging by flyctl status). Any idea why they aren’t launching?

kurt · April 25, 2021, 4:56pm

You’ll probably need to go over the soft limit for >15s. The simplest test is going to be one long-lived connection plus repeated attempts at a second connection until it scales.

Also I want to reiterate that we’re not good at single connection scaling (yet). Our autoscaling is designed for traffic that ramps up and VMs that can handle ~25 concurrent requests. The new autoscaling is quicker to respond but still not fast enough for what you have configured.

viraj · September 2, 2021, 7:37pm

Hi @kurt, stumbled upon this discussion while facing the same problem as @matthewrobertbell. However, we have soft limits at 20 and hard limits at 25. Been sending multiple requests (60+) simultaneously and the servers don’t seem to scale up.

PS: Running tests with k6 and have consistently kept requests above 25 for over 10 mins, still no luck scaling

kurt · September 2, 2021, 9:56pm

Hi @viraj! There might be a few reasons for this:

1. We just shipped (like today) a new autoscaler that is much more responsive. I opted your apps into it.
2. Your apps are technically using connection concurrency. If you’re sending multiple requests over shared connections with k6, it won’t affect the scaling numbers.

For #2, I would suggest adding type = "requests" to the concurrency block in your fly.toml.

Will you run your test again and see if those things improve? With new autoscaling you should see fly status show new instances within about 30s of a test starting.
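To illustrate the second point, here is a minimal Python sketch of why a load test that reuses connections may never cross the soft limit under type = "connections". It is a simplified model, not Fly.io's actual proxy logic: the `Load` class, the `concurrency` and `over_soft_limit` helpers, and the figure of 10 shared connections are all hypothetical. Only the 60 simultaneous requests and the soft limit of 20 come from this thread.

```python
from dataclasses import dataclass


@dataclass
class Load:
    """Snapshot of traffic hitting one VM (hypothetical model)."""
    open_connections: int    # TCP connections currently open
    in_flight_requests: int  # HTTP requests currently being served


def concurrency(load: Load, kind: str) -> int:
    """Return the number that is compared against the limits.

    Assumption: "connections" counts open connections and
    "requests" counts in-flight requests.
    """
    if kind == "connections":
        return load.open_connections
    if kind == "requests":
        return load.in_flight_requests
    raise ValueError(f"unknown concurrency type: {kind}")


def over_soft_limit(load: Load, kind: str, soft_limit: int = 20) -> bool:
    """True when the measured concurrency exceeds the soft limit."""
    return concurrency(load, kind) > soft_limit


# 60 simultaneous requests sent over 10 shared connections (10 is assumed).
test_load = Load(open_connections=10, in_flight_requests=60)

print(over_soft_limit(test_load, "connections"))  # False
print(over_soft_limit(test_load, "requests"))     # True
```

Under this model the same test looks idle when measured by connections and overloaded when measured by requests, which is why switching the type can change whether scaling triggers.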

viraj · September 3, 2021, 5:52am

Hi @kurt, thanks for getting back. Redeployed with type = "requests" and ran the tests again.

I’m able to see the hard limit warning in the logs. However, the server doesn’t seem to autoscale.

PS: This is the concurrency config which I’m using

kurt · September 3, 2021, 2:15pm

Ok I looked at the metrics and this seems to be a bug scaling from one to two.

1. It peaked at 25 concurrent requests, most likely because it was hitting the hard limit.
2. The scaling metric actually said “hey, we need 1.25 VMs for this load”. We obviously can’t add 0.25 VMs, so it didn’t add any.

I just tweaked your app to use a ceil function there. This should help if you want to test again.
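The rounding problem described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the autoscaler's real code: it assumes the target is simply concurrent requests divided by the soft limit (25 / 20 = 1.25, which matches the figure quoted here), and the `vms_needed` function and its `round_up` flag are hypothetical names.

```python
import math


def vms_needed(concurrent_requests: int, soft_limit: int, round_up: bool) -> int:
    """Estimate how many VMs a load calls for (hypothetical formula).

    Assumption: the raw target is concurrent_requests / soft_limit.
    round_up=False rounds the fraction down (the behavior described as the bug);
    round_up=True applies ceil (the behavior described as the fix).
    """
    raw_target = concurrent_requests / soft_limit
    rounded = math.ceil(raw_target) if round_up else math.floor(raw_target)
    return max(1, rounded)  # never go below one running VM


# 25 concurrent requests against a soft limit of 20 gives a raw target of 1.25.
print(vms_needed(25, 20, round_up=False))  # 1: stays on a single VM
print(vms_needed(25, 20, round_up=True))   # 2: scales from one to two
```

Because the hard limit caps a single VM at 25 requests, the raw target in this model can never reach 2.0 on one VM, so rounding down would keep the app at one instance indefinitely.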

viraj · September 6, 2021, 5:08am

Hey @kurt, that did help. Most of the tests performed afterwards had better results and lower error rates. A couple of follow-up questions:

1. Can we define the parameters that scaling is based on?
2. When we release new apps, will the new scaling be applied to them automatically too?

kurt · September 6, 2021, 3:46pm

There aren’t any scaling parameters to define yet. We’d like to let you specify a metrics query to scale on, but as you can see from our ceil issue that’s actually kind of hard!

That ceil bug is fixed, so all new apps should behave the same as yours now.

viraj · September 7, 2021, 9:29am

Thanks for sorting out the bug.
Re the metrics query, what I am looking for is autoscaling based on CPU and/or memory load: say, scale up if load is above 80%, and scale down when it drops back below.
One concern I have with the current implementation is that a VM takes a long time to spin up, and during that period some of the requests error out. Ideally, I imagine that could be avoided by defining thresholds for scaling up servers.

jerome · September 7, 2021, 12:14pm

We have plans that make VMs boot a lot faster. We’re getting there and have already started testing this new scheduler with the new remote builders.
