Autoscale doesn't seem to launch new instances

Hi, I’m currently trying the autoscale feature. I’ve launched an app in 2 regions and configured autoscale with fly autoscale standard min=4 max=6.
The command fly autoscale show returns the expected values:

     Scale Mode: Standard
      Min Count: 4
      Max Count: 6

I tried a load test with 200 concurrent connections, and the logs show many lines like this one:

2021-09-16T14:42:45.736960969Z proxy[523c6e85] lax [warn] Instance reached connections hard limit of 25

but the app stays at 4 instances (I’m monitoring it with fly status --watch).

Any idea why it doesn’t create the needed instances, since it reaches the connections hard limit?

Thank you for your support :slight_smile:

Another remark: disable doesn’t seem to do what it’s supposed to do…

mad@dev ~/c/e/q > fly autoscale show
     Scale Mode: Standard
      Min Count: 4
      Max Count: 6
mad@dev ~/c/e/q > fly autoscale disable
     Scale Mode: Disabled
mad@dev ~/c/e/q > fly autoscale show
     Scale Mode: Standard
      Min Count: 4
      Max Count: 6

Setting the scale count manually disables the autoscaling feature, but setting it back doesn’t seem to add new instances.

How long are you running your test for? If you visit the UI (fly.io/apps/<name>/metrics) you can see the concurrency per VM. Scaling is based on metrics, so it’ll need to be maxed out for like 30-60s before you see a response.

Also what are your limits set to?
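To illustrate why a short burst may not trigger a scale-up, here is a toy model of metrics-based scaling. It is only a sketch, not Fly's actual algorithm: the `Autoscaler` class, the 30-second sustain window, and the 5-second sampling interval are all hypothetical values chosen to match the "maxed out for 30-60s" description above.

```python
from dataclasses import dataclass, field


@dataclass
class Autoscaler:
    """Toy model: add an instance only after load stays at or above
    the soft limit for a sustained window. Not Fly's real logic."""
    min_count: int = 4
    max_count: int = 6
    soft_limit: int = 20         # connections per instance
    sustain_seconds: int = 30    # hypothetical: how long load must stay high
    sample_interval: int = 5     # hypothetical: seconds between metric samples
    count: int = field(init=False)
    saturated_for: int = field(init=False, default=0)

    def __post_init__(self):
        self.count = self.min_count

    def observe(self, total_connections: int) -> int:
        """Feed one metrics sample and return the resulting instance count."""
        per_instance = total_connections / self.count
        if per_instance >= self.soft_limit:
            self.saturated_for += self.sample_interval
        else:
            # Any dip below the soft limit resets the window.
            self.saturated_for = 0
        if self.saturated_for >= self.sustain_seconds and self.count < self.max_count:
            self.count += 1
            self.saturated_for = 0
        return self.count


def simulate(load_per_sample):
    """Run a fresh autoscaler over a sequence of connection counts."""
    scaler = Autoscaler()
    return [scaler.observe(load) for load in load_per_sample]


if __name__ == "__main__":
    # 25 seconds of load: still at the minimum count.
    print(simulate([200] * 5))
    # 60 seconds of sustained load: grows one instance per 30-second window.
    print(simulate([200] * 12))
```

In this model a test that stops, or dips, before the window elapses never adds an instance, which matches the behaviour of a load test that is too short.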

It was for more than 1min, I’ll try again right now

Edit: Seems to work! I just had to wait longer :slight_smile:
Is there a way to adjust the delay before a new instance boots? Or maybe to trigger scaling based on CPU usage?

The limits are the defaults: 20 soft, 25 hard.
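As a quick sanity check on those numbers, the arithmetic below compares the 200-connection load test against the total connection capacity at the minimum and maximum instance counts. It assumes the min/max counts are app-wide totals and that connections spread evenly across instances; neither is confirmed in this thread, and the `capacity` helper is purely illustrative.

```python
SOFT_LIMIT = 20   # default soft limit, connections per instance
HARD_LIMIT = 25   # default hard limit, connections per instance
LOAD = 200        # concurrent connections in the load test


def capacity(instances: int, per_instance_limit: int) -> int:
    """Total connections the app can hold at a given per-instance limit."""
    return instances * per_instance_limit


if __name__ == "__main__":
    for count in (4, 6):
        soft = capacity(count, SOFT_LIMIT)
        hard = capacity(count, HARD_LIMIT)
        excess = max(0, LOAD - hard)
        print(f"{count} instances: soft={soft}, hard={hard}, over hard limit by {excess}")
```

Under these assumptions, the hard-limit warnings would keep appearing even at 6 instances, since 6 × 25 = 150 is still below 200.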

There’s not yet! We’re redoing our orchestration layer so we can boot new VMs without having to do metrics queries. The lag is not great, we’d like to be able to spin one up immediately and hand a request off to it. This is a big project, though, so it’ll take a few months to get there.

We’d also like to enable scaling with things like CPU, but those suffer from the same lag. It might be 15-30s before we “see” a CPU spike. In practice the metrics-based scaling works OK, but as you found from stress testing, it’s not as responsive as it should be.

I guess scaling triggered by CPU or even memory would be easier for users to manage. I’ll try lowering the default values and do some load tests.

Thank you for your support.

Are autoscaled instances shown by the fly status --watch command?

The metrics page (VM concurrency) seems OK, but the status command shows only the initial 2 instances.