How to enable autoscaling?

Thanks for your reply!

The info you asked for:

$ fly autoscale show
     Scale Mode: Balanced
      Min Count: 1
      Max Count: 8
$ fly scale show
VM Resources for uptime-monitor-backend
        VM Size: shared-cpu-1x
      VM Memory: 256 MB
          Count: 1
 Max Per Region: Not set
$ fly releases
VERSION STABLE  TYPE            STATUS          DESCRIPTION                     USER                    DATE                 
v10     true    scale           succeeded       Update autoscaling config       <REDACTED>           5h42m ago           
v9      false   scale           failed          Scale VM count: ["app, 1"]      <REDACTED>           5h42m ago           
v8      true    scale           succeeded       Update autoscaling config       <REDACTED>           5h48m ago           
v7      false   scale           cancelled       updating region configuration                           5h49m ago           
v6      false   scale           cancelled       updating region configuration                           5h49m ago           
v5      false   scale           cancelled       updating region configuration                           5h49m ago           
v4      true    scale           succeeded       Update autoscaling config       <REDACTED>           5h54m ago           
v3      true    release         succeeded       Deploy image                    <REDACTED>           5h57m ago           
v2      true    rollback        succeeded       Reverting to version 0                                  6h14m ago           
v1      false   release         failed          Deploy image                    <REDACTED>           6h15m ago           
v0      true    release         succeeded       Deploy image                    <REDACTED>       2022-02-06T19:20:07Z

For burn, I’ve most recently used ./burn -c 400 -d 120s https://uptime-monitor-backend.fly.dev/ --verbose --resume-tls.
Maybe two minutes is not long enough for autoscaling to kick in?

I have also recently tried locust (on another app, but with a similar setup), ramping up to 600 users by adding 2 more per second (i.e. reaching the peak after ten minutes, although the soft_limit would already be passed after the first ten seconds). Again, no autoscaling happened.

I have tried both type = "connections" and type = "requests".
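
For context, that setting lives in the concurrency block of fly.toml. A rough sketch of what I mean is below; the limit values shown are illustrative (I believe they match the documented defaults) and are not necessarily the exact values from my app:

```toml
# Sketch of the concurrency section in fly.toml (values are illustrative).
[services.concurrency]
  # Either "connections" or "requests"; I have tried both.
  type = "connections"
  # Load per instance above which autoscaling should start to consider adding instances.
  soft_limit = 20
  # Load per instance above which new traffic is no longer sent to that instance.
  hard_limit = 25
```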


The topic you pointed to (this one) does not seem to mention anywhere how long autoscaling takes to kick in.
I did find this post by @kurt mentioning “it needs to be maxed out 30-60s”, but maybe that information is outdated?