When to upgrade apps?

What’s your stack? Usually, one profiles their app and knows beforehand how many requests it can handle on a given hardware type (“load testing”). See this thread: “Autoscaling is not triggered on a pure websocket application” (#2 by eli). Note, though, that some requests (say, full table scans) may be more demanding than others (say, point queries).
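To make the load-testing point concrete, here is a rough sketch of how measured numbers could turn into an instance count. Everything in it is an assumption for illustration: the function names `blended_capacity` and `instances_needed`, the 70% headroom default, and the example figures are made up, not anything Fly provides. It also assumes requests are served one after another on a saturated instance, which is a simplification.

```python
import math


def blended_capacity(mix: list[tuple[float, float]]) -> float:
    """Estimate requests/sec for a mixed workload on one instance.

    Each entry is (share_of_traffic, rps_if_only_this_request_type).
    Cost per request is 1/rps, so the blended rate is the weighted
    harmonic mean of the per-type rates.
    """
    total_share = sum(share for share, _ in mix)
    if not math.isclose(total_share, 1.0):
        raise ValueError("traffic shares must sum to 1")
    if any(rps <= 0 for _, rps in mix):
        raise ValueError("per-type capacity must be positive")
    avg_seconds_per_request = sum(share / rps for share, rps in mix)
    return 1.0 / avg_seconds_per_request


def instances_needed(peak_rps: float, per_instance_rps: float,
                     headroom: float = 0.7) -> int:
    """Instances required to serve peak_rps while keeping each
    instance at or below `headroom` of its measured capacity."""
    if peak_rps < 0 or per_instance_rps <= 0 or not 0 < headroom <= 1:
        raise ValueError("invalid capacity inputs")
    usable_rps = per_instance_rps * headroom
    return max(1, math.ceil(peak_rps / usable_rps))


# Hypothetical load-test results: 90% point queries (200 rps alone),
# 10% full table scans (10 rps alone).
capacity = blended_capacity([(0.9, 200.0), (0.1, 10.0)])
print(round(capacity, 1), instances_needed(500, capacity))
```

Even a 10% share of expensive requests drags the blended capacity far below the point-query figure, which is why a single "requests per second" number from a load test can mislead.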

Either way, one is left with a choice: scale up (better CPU, more RAM, lighter container, faster runtime, optimised code, etc.) or scale out (more servers). There’s also the possibility of shedding load rather than serving it, or shaping traffic away from busy servers. None of this is easy to accomplish. Though, when Fly does support autoscale for Machines, I’d imagine it would make at least the scale-out part of the equation much simpler.
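For the load-shedding option, a minimal sketch of the idea is a cap on in-flight requests: once the cap is reached, new requests get a fast rejection instead of queueing. The `LoadShedder` class, the `handle` helper, and the 503 response are hypothetical names and choices for illustration, not part of any Fly API or specific framework.

```python
import threading
from typing import Callable, Tuple


class LoadShedder:
    """Admit at most `max_in_flight` concurrent requests."""

    def __init__(self, max_in_flight: int) -> None:
        if max_in_flight < 1:
            raise ValueError("max_in_flight must be at least 1")
        self._max = max_in_flight
        self._in_flight = 0
        self._lock = threading.Lock()

    def try_acquire(self) -> bool:
        # Never blocks: either take a slot or report that we are full.
        with self._lock:
            if self._in_flight >= self._max:
                return False
            self._in_flight += 1
            return True

    def release(self) -> None:
        with self._lock:
            if self._in_flight > 0:
                self._in_flight -= 1


def handle(shedder: LoadShedder, work: Callable[[], str]) -> Tuple[int, str]:
    """Run `work` if there is capacity, otherwise shed the request."""
    if not shedder.try_acquire():
        return 503, "overloaded, retry later"
    try:
        return 200, work()
    finally:
        # Free the slot even if `work` raises.
        shedder.release()
```

A sensible value for `max_in_flight` would come from the same load tests mentioned earlier; rejecting early keeps latency bounded for the requests that are admitted.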