Maybe two minutes is not long enough for autoscaling to kick in?
If you haven’t already, I’d definitely suggest trying for a bit longer than two minutes as a quick troubleshooting step.
Generally speaking, the autoscaling feature is definitely a bit slow to react, and difficult to induce via a load test. For example, it requires that the concurrent connections limit be consistently exceeded on all of an app’s VMs for over a minute while the app is running. Platform-wise, there are a lot of moving parts which add latency, too.
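To make that condition a bit more concrete, here's a rough Python sketch of how you could check your own load test numbers against it. This is only an illustration, not how the platform implements autoscaling: the function names, the 60-second window, and the idea of sampling per-VM connection counts at fixed intervals are all assumptions on my part.

```python
def longest_overload_seconds(samples, limit):
    """Longest continuous span (in seconds) during which EVERY VM was over `limit`.

    `samples` is a time-sorted list of (timestamp_seconds, [connections per VM]).
    Hypothetical helper for illustration; not part of any platform API.
    """
    longest = 0.0
    streak_start = None
    for t, per_vm in samples:
        if per_vm and all(c > limit for c in per_vm):
            if streak_start is None:
                streak_start = t  # a new overload streak begins here
            longest = max(longest, t - streak_start)
        else:
            streak_start = None  # any VM at or under the limit resets the streak
    return longest


def would_scale_up(samples, limit, window_seconds=60.0):
    """True if all VMs stayed over the limit for MORE than `window_seconds` (assumed value)."""
    return longest_overload_seconds(samples, limit) > window_seconds


# Two-minute test, sampled every 10 s, two VMs, limit of 20 connections.
steady = [(t, [25, 25]) for t in range(0, 121, 10)]
# Same test, but the second VM briefly dips under the limit at t=50.
dipped = [(t, [25, 18 if t == 50 else 25]) for t in range(0, 121, 10)]

print(would_scale_up(steady, limit=20))  # True
print(would_scale_up(dipped, limit=20))  # False
```

Note how a single dip on one VM partway through resets the clock, so a two-minute test can easily fall short even when the average load looks high enough.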
We’re looking to improve autoscaling, so a major overhaul is in the works for (hopefully!) sometime later this year. It’s in its early stages, but you may want to check out this thread to follow our work on providing more VM-level features!