We now support setting a minimum number of machines to keep running when using the automatic start/stop feature for Apps v2. This prevents the specified number of machines from being stopped. Update your flyctl to the latest version and then set min_machines_running in your fly.toml.
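As a minimal sketch (the port and values here are illustrative, not a prescription):

```toml
# fly.toml — illustrative [http_service] section
[http_service]
  internal_port = 8080          # placeholder: your app's own port
  auto_stop_machines = true     # let the proxy stop idle machines
  auto_start_machines = true    # let the proxy start them again on demand
  min_machines_running = 1      # keep at least 1 machine running
```

With `min_machines_running = 1`, the proxy will stop idle machines down to one instead of all the way to zero.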
If instances of your application take a while to start and that is unacceptable for your use case, you will benefit from having at least one instance always running (min_machines_running = 1). When a new request comes in, instead of having to wait for the app to start up after being scaled down completely (i.e. the cold start problem), it can respond immediately.
What you need to know
The most important thing to know is that we only keep instances running in the primary region of your app. All other regions will still be scaled down to 0. For example, if min_machines_running = 3, you'll need 3 or more instances in your primary region.
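To illustrate how the two settings pair up ("ams" is just an example region name):

```toml
# fly.toml — illustrative; the minimum is only enforced in the primary region
primary_region = "ams"

[http_service]
  min_machines_running = 3   # kept running in ams; other regions can still scale to 0
```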
Some other things to know:
* The maximum number of machines we can scale up to is implicitly defined by the number of machines your app has. We will scale your app all the way up if demand requires it and back down to the specified minimum
* The default minimum is 0
* This does not solve the cold start problem entirely. When a request comes in and the proxy decides to start a new instance, that request waits for the new instance to start. We don’t start a new instance while servicing the current request with an already running instance. So while you may not run into a cold start for your first instance, if we start a second one, that request will run into it. We’re giving some thought to how to solve this and, as always, will post here once we’ve got a solution for you
I have two machines (I cloned one of them just recently).
According to the monitoring and the logs, both machines got scaled down:
2023-05-12T19:05:56.552 proxy [6e82d956a79408] ams [info] Downscaling app peter-kuhmann-website in region ams. Automatically stopping machine 6e82d956a79408. 2 instances are running, 0 are at soft limit, we only need 1 running
2023-05-12T19:05:56.558 app[6e82d956a79408] ams [info] Sending signal SIGINT to main child process w/ PID 513
2023-05-12T19:05:56.746 app[6e82d956a79408] ams [info] Starting clean up.
2023-05-12T19:05:57.746 app[6e82d956a79408] ams [info] [ 405.553727] reboot: Restarting system
2023-05-12T19:07:18.119 proxy [5683d920b1618e] ams [info] Downscaling app peter-kuhmann-website in region ams. Automatically stopping machine 5683d920b1618e. 1 instance is running but has no load
2023-05-12T19:07:18.122 app[5683d920b1618e] ams [info] Sending signal SIGINT to main child process w/ PID 513
2023-05-12T19:07:18.628 app[5683d920b1618e] ams [info] Starting clean up.
2023-05-12T19:07:19.630 app[5683d920b1618e] ams [info] [ 503.747677] reboot: Restarting system
Interesting: the first downscale message (“2 instances are running, 0 are at soft limit, we only need 1 running”) seems to “know” the min setting.
But the second check doesn’t seem to take it into account (“1 instance is running but has no load”).
Did I miss a specific configuration or precondition?
I looked at your app, and both of your machines are running in the region iad. However, your primary_region is set to ewr. Autostop only keeps machines running in the primary region of your application. Did you do anything that caused your machines to deploy in iad? If not, then it’s an issue on our side.
Hi Senyo, thanks for helping look into this. I destroyed the iad machines and deployed from my CI build automation, and scaling appears to be working great now. My CI deployment scripts are all configured to deploy to ewr.
What I think happened is that when I was setting things up originally several weeks ago, running commands manually on my desktop to debug various issues, I must have made a typo and deployed a machine to iad by accident. That machine stuck around, and I didn’t notice the region was wrong because my deployments were being done with the CI automation, which didn’t blow away the extra iad machines.
So thank you for helping identify the issue and pointing it out.
To reduce this type of error in the future, I wonder if there is a way to have the toml file be more explicit about the final state of the deployment so that it could be a single source of truth?
I have a different issue that I’m trying to fix.
I’ve googled, read the docs and the threads here, but without success.
I have a staging environment to test my application, where I want to stop all machines when there is no load (for cost reduction).
I have the .toml file below. It sets min_machines_running = 0 and destroys idle machines. However, since this staging environment is used by only one or two devs (usually not even simultaneously), machines seem to be destroyed even before I can send a second request to my HTTP service. It seems the idle delay before shutting down machines is too short. For apps in a production environment with genuinely concurrent users, that may work fine: when there are no requests, machines can be destroyed immediately. But for staging environments, a minimum idle timeout should elapse before destroying machines.
Sometimes I log in and, when I try to click a link on the returned web page, I find I’ve been logged out. My app uses session authentication, so it seems the VMs are destroyed and the session is cleared.
I’ve tried changing the concurrency type from “requests” to “connections”, but without success.
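For context, here is a simplified sketch of the kind of settings I mean (not my exact file; the limits are just example values):

```toml
# Illustrative sketch of the staging config in question
[http_service]
  auto_stop_machines = true
  auto_start_machines = true
  min_machines_running = 0     # allow scale-to-zero on staging

  [http_service.concurrency]
    type = "connections"       # tried both "requests" and "connections"
    soft_limit = 20
    hard_limit = 25
```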