Right now, the proxy only scales is only built to handle the 0 to 1 scaling case. It’s not designed to autoscale across multiple machines. It kinda works, but that’s an unsupported use case at the moment.
When a request or connection comes in, the proxy runs through its load balancer logic, then forwards the user on. Then it ensures the machine is started.
Scaling down is just a matter of exiting, though. If you can teach your process to exit when it’s likely to be idle, you’ll get “scale to zero”. But again, it’s not designed to work with multiple machines.