We’ve introduced a new feature to automatically start/stop instances. When enabled, the proxy will scale instances of your app up/down as demand changes.
This feature can be enabled in the services section of the fly.toml:
[[services]]
# automatically start machines
auto_start_machines = true
# automatically stop machines
auto_stop_machines = true
...
...
internal_port = 8080
protocol = "tcp"
Similarly, if you’re using http_service:
[http_service]
# automatically start machines
auto_start_machines = true
# automatically stop machines
auto_stop_machines = true
...
...
internal_port = 8080
protocol = "tcp"
Default settings
New apps have both automatic starting and automatic stopping enabled by default:
auto_start_machines = true
auto_stop_machines = true
Existing applications are automatically started but not automatically stopped.
auto_start_machines = true
auto_stop_machines = false
When should I use it?
This feature is useful if you have highly variable workloads. Your instances will be able to start/stop automatically as demand increases/decreases. The central benefit of doing this is cost reduction. Instead of having to run excess instances to handle peak load, only what is necessary is running at any given point in time, saving you your hard-earned money.
This feature differs slightly from typical autoscaling in that we don’t create instances for you up to a specified maximum. The proxy only starts instances that already exist. If you want to have 10 instances available to start to service requests, you need to create 10 instances of your app.
Recommended use
It is recommended to set both settings to the same value. If auto_start_machines is enabled but auto_stop_machines is disabled, the proxy will start your instances but they will never be stopped. This is fine for cases where you want to manually stop instances, but if not, your instances will be left running indefinitely (and cost you your hard-earned money!).
If auto_start_machines is disabled but auto_stop_machines is enabled, the proxy will scale your instances down but will not be able to start them again. If all of your instances are scaled down, requests will start failing.
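The two mismatched combinations above can be captured in a small check. This is only an illustrative sketch: `check_auto_settings` is a hypothetical helper, not part of flyctl or any Fly.io API, and it simply restates the behaviour described in this section.

```python
def check_auto_settings(auto_start_machines, auto_stop_machines):
    """Return warnings for mismatched settings (hypothetical helper)."""
    warnings = []
    if auto_start_machines and not auto_stop_machines:
        # Started by the proxy, but nothing stops them again.
        warnings.append(
            "instances are started automatically but never stopped; "
            "stop them manually or they keep running"
        )
    if auto_stop_machines and not auto_start_machines:
        # Stopped by the proxy, but nothing starts them again.
        warnings.append(
            "instances are stopped automatically but never restarted; "
            "requests fail once all instances are stopped"
        )
    return warnings


for message in check_auto_settings(auto_start_machines=False, auto_stop_machines=True):
    print("warning:", message)
```

Setting both values to true or both to false produces no warnings.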
When not to use it
At the moment, we don’t support specifying a minimum number of running machines. Apps will scale down to zero if auto_stop_machines is enabled and there’s no traffic. If you need your application to be “always on”, disable this setting.
How does it work?
The settings auto_start_machines and auto_stop_machines instruct our internal Fly Proxy to automatically start/stop instances of your app (which are Fly Machines).
Autostart
If auto_start_machines is enabled, the proxy will automatically start instances as follows:
- A new request is made to your application
- All the running instances are above their soft limit
- If there are stopped instances, the proxy will pick one from the nearest region and start it
- The request will then be sent to the started instance
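The steps above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the proxy's actual implementation: the `Machine` fields, the `pick_machine_to_start` function, and the `regions_by_distance` list (regions ordered nearest first, relative to the request) are all hypothetical names. It also assumes that an instance exactly at its soft limit still has capacity, and that a request arriving while no instance is running triggers a start.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    id: str
    region: str
    running: bool
    load: int        # current load, in whatever unit the soft limit uses
    soft_limit: int


def pick_machine_to_start(machines, regions_by_distance):
    """Return the stopped machine to start for a new request, or None."""
    running = [m for m in machines if m.running]
    # Assumption: only start a machine when every running one is above
    # its soft limit (trivially true when nothing is running).
    if any(m.load <= m.soft_limit for m in running):
        return None
    stopped = [m for m in machines if not m.running]
    if not stopped:
        return None
    # Prefer the stopped machine in the nearest region; unknown regions last.
    rank = {region: i for i, region in enumerate(regions_by_distance)}
    return min(stopped, key=lambda m: rank.get(m.region, len(rank)))


machines = [
    Machine("a", "ams", running=True, load=30, soft_limit=25),
    Machine("b", "fra", running=False, load=0, soft_limit=25),
    Machine("c", "ams", running=False, load=0, soft_limit=25),
]
chosen = pick_machine_to_start(machines, regions_by_distance=["ams", "fra"])
print(chosen.id if chosen else "nothing to start")
```

In this example the only running instance is over its soft limit, so the stopped instance in the nearest region (`ams`) is chosen.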
Autostop
If auto_stop_machines is enabled, the proxy will automatically stop instances as follows:
- The proxy looks at all instances of your app in a given region, e.g. fra
- It finds out how many of these instances are above and below their soft limit
- If there is more than one instance in the region, it calculates whether there is excess capacity in the region using the formula: excess instances = num of instances - (num of instances over soft limit + 1). For example, if we had 9 instances and 4 of them were over their soft limit, the excess capacity is: excess = 9 - (4 + 1) = 4, meaning we have 4 more instances than we need to service the current traffic.
  - This algorithm is based on the assumption that you’d need one more instance than the number of instances over their soft limit, i.e. we need 5 instances running if 4 are over their soft limit. We’ll be monitoring how this plays out in production settings and will adjust it if necessary.
- If there is excess capacity, one instance is stopped.
- If there is only one instance in the region, the proxy checks if it has any load. If its load is 0 (i.e. there is no traffic to your instance), then it is stopped.
- This process runs every few minutes. If there are a number of instances that can be stopped, it’ll occur over a period of time as only one instance is stopped per iteration of this process.
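As a rough illustration of one iteration of this process for a single region, here is a minimal sketch. The function names and the idea of passing a plain list of per-instance loads are assumptions made for the example; only the formula itself comes from the description above.

```python
from typing import List


def excess_instances(num_instances: int, num_over_soft_limit: int) -> int:
    """excess = num of instances - (num of instances over soft limit + 1)"""
    return num_instances - (num_over_soft_limit + 1)


def should_stop_one(loads: List[int], soft_limit: int) -> bool:
    """Decide whether one running instance in a region is stopped this iteration.

    loads holds the current load of each running instance in ONE region;
    for simplicity all instances are assumed to share the same soft limit.
    """
    if not loads:
        return False
    if len(loads) == 1:
        # A lone instance is stopped only when it has no traffic at all.
        return loads[0] == 0
    over = sum(1 for load in loads if load > soft_limit)
    return excess_instances(len(loads), over) > 0


print(excess_instances(9, 4))  # 4, the worked example above

# Each region is evaluated on its own (hypothetical loads, soft limit of 25).
regions = {"ams": [30, 30], "fra": [0]}
for region, loads in regions.items():
    print(region, should_stop_one(loads, soft_limit=25))
```

Because at most one instance is stopped per iteration, a region with several idle instances winds down over successive runs rather than all at once.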
Again, I’d like to point out that this downscaling process happens on a regional basis. If you have traffic in ams but not in fra, your fra instances will be stopped but your ams instances will remain running (subject to the formula).
Feedback
If you have feedback, comments, or questions, please share!