I would use a different architecture. I run a small, always-on app that I call a Distributor (I use a pair of machines for this, but a single machine works if you want to keep things simple). It receives requests from the web and uses the API to create new machines, which do the work and then exit when they finish.
In my opinion, this is a good general approach to architecture. Whether Fly's networking or auto-scaling has special features that offer a quicker or better approach, I could not say.
At one stage I think Fly was offering Fly-related architectural advice, but I don’t know if they still do that.
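To make the Distributor idea above a bit more concrete, here is a minimal sketch in Python of the "create a machine per job" step. It assumes the Fly Machines REST API at `https://api.machines.dev/v1` and a worker image that reads its job from an environment variable; the names `build_create_machine_request`, `dispatch`, and `JOB_ID` are made up for illustration and are not from the posts above. Treat it as a starting point, not a tested production setup.

```python
import json
import urllib.request

# Base URL of the Fly Machines API (assumption: the public endpoint is used).
MACHINES_API = "https://api.machines.dev/v1"


def build_create_machine_request(app_name, image, job_id, api_token):
    """Build (but do not send) the request that creates one worker machine."""
    body = {
        "config": {
            "image": image,
            # Hypothetical variable: the worker reads its job from here.
            "env": {"JOB_ID": job_id},
            # Remove the machine once its main process exits.
            "auto_destroy": True,
            # Do not restart the worker after it finishes.
            "restart": {"policy": "no"},
        }
    }
    return urllib.request.Request(
        url=f"{MACHINES_API}/apps/{app_name}/machines",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def dispatch(app_name, image, job_id, api_token):
    """Send the create-machine request and return the parsed JSON response."""
    request = build_create_machine_request(app_name, image, job_id, api_token)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

The Distributor's web handler would call `dispatch(...)` once per incoming request. Because each worker exits on its own when the job is done, nothing external has to decide when it is safe to shut it down.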
Hi @056xyz. What’s happening here is that, when using the auto-scaler in “create new machines when needed, destroy them when not needed” mode (that is, defining FAS_CREATED_MACHINE_COUNT), the destroy machine operation uses the equivalent of --force=true. This means it brutally kills the machine, no questions asked, without honoring kill_timeout (which I notice you’ve set to a large value to allow your workers to complete their tasks).
What you could do is use FAS_STARTED_MACHINE_COUNT instead; manually create the maximum number of machines you think you’ll need, and let the auto-scaler stop and start them instead of creating and destroying them.
When operating in this mode, the machines are stopped in the usual, graceful shutdown way:
Your defined kill_signal is sent to the machine. This should signal your workers to finish what they’re doing and not pick up any more work.
If your app hasn’t exited kill_timeout seconds later, SIGKILL is sent and the machine is forcibly stopped at this point.
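For the worker side of this, a rough sketch of "finish the current job, don't pick up any more" might look like the following. It assumes kill_signal is SIGINT and that the worker pulls jobs in a simple loop; `fetch_job`, `process_job`, `install_handler`, and `run_worker` are hypothetical names, not part of any Fly library.

```python
import signal
import threading

# Set once the shutdown signal arrives; checked between jobs.
stop_requested = threading.Event()


def handle_kill_signal(signum, frame):
    """Record the shutdown request; the job in progress is left to finish."""
    stop_requested.set()


def install_handler(sig=signal.SIGINT):
    """Register the handler for the signal configured as kill_signal (assumed SIGINT)."""
    signal.signal(sig, handle_kill_signal)


def run_worker(fetch_job, process_job):
    """Process jobs until the queue is empty or a shutdown was requested.

    fetch_job returns the next job, or None when there is nothing left.
    Returns the number of jobs that were completed.
    """
    completed = 0
    while not stop_requested.is_set():
        job = fetch_job()
        if job is None:
            break
        process_job(job)  # runs to completion even if the signal arrives mid-job
        completed += 1
    return completed
```

The job that is running when the signal arrives still has to finish within kill_timeout, so that value should be at least as long as your slowest job.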