Currently, I have 4 Machines in total (performance-8x, 16 GB each), and my fly.toml is configured as follows:
[http_service]
internal_port = 3003
force_https = true
auto_stop_machines = 'suspend'
auto_start_machines = true
min_machines_running = 2
processes = ['app']
[http_service.concurrency]
type = 'connections'
soft_limit = 30
[[vm]]
size = 'performance-8x'
memory = "16384mb"
My understanding is that 2 Machines should always be running while the other 2 stay suspended. With soft_limit = 30, I expect the third Machine to start when the total number of connections reaches 60 (both running Machines at their soft limit), and the fourth to start when it reaches 90.
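To make that expectation concrete, here is a rough sketch of the arithmetic I have in mind. It is only my mental model of the scaling behavior, not Fly's actual proxy logic, and the function name and parameters are made up for illustration.

```python
def expected_running_machines(total_connections, soft_limit=30,
                              min_running=2, total_machines=4):
    """Assumed model (not Fly's real algorithm): one more Machine starts each
    time the total connection count reaches another multiple of soft_limit."""
    wanted = total_connections // soft_limit + 1
    # Never go below min_machines_running or above the number of Machines I have.
    return max(min_running, min(total_machines, wanted))


for connections in (40, 60, 90, 150):
    print(connections, expected_running_machines(connections))
```

Under this model, a total of roughly 40 to 60 connections (4 Machines at 10–15 each) should need 2 or at most 3 running Machines, not 4.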
However, when I look at the App Concurrency chart in Grafana, each Machine stays at only around 10–15, and all 4 Machines are always running. This is not what I expected.
What should I do to make autoscaling work as intended?
Additionally, besides autoscaling based on connections or requests, is there a way to autoscale based on memory or CPU usage on Fly.io?
