I’ve noticed that one app VM might get reused to service several requests sequentially. Will one VM ever be used to service multiple requests concurrently? If so, is there a way to prevent this?
In my scenario there may be unavoidable global state (e.g., environment vars, files, node modules, …) that needs to be different for each request. If one VM handles multiple requests serially, then I may have a chance to clean up between requests. If they are handled concurrently, then I don’t see a way to isolate them.
Seems like Machines are the real answer here but I have yet to be able to get the simple Node example working with Machines (I get it running but then can’t talk to it as per Tease us with more "machine" info? - #8 by jeffmcaffer). In the meantime, I’m looking to see if Apps can at least enable some prototyping.
Thanks @greg. I optimistically confused that doc with the way AWS Lambda works where they run requests sequentially on a given VM. Since I’m running arbitrary user code I need to have a fresh environment for each execution. May just have to wait for Machines…
Machines will be the real answer here, they’re just not finished.
App VMs could potentially be used this way. You can set a hard_limit of 1 in the fly.toml for the service. Then exit the VM when the request finishes. It’s a little hacky, but that’s close to what you’ll be able to do with machines.
Thanks @kurt. Just to clarify, when you say “exit the VM” you mean exit the process that was defined as the entry point for the app (e.g., process.exit(0) in index.js for a Node app)? Or is there a flyctl command that needs to be run?
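For anyone following along, here is a rough sketch of the process-exit reading of that suggestion in Node. It is only an illustration under assumptions: `createOneShotServer` is a made-up helper name, port 8080 is a placeholder, and it relies on the platform bringing the VM back up after the process exits, which this snippet cannot guarantee on its own.

```javascript
const http = require("node:http");

// Hypothetical helper: wraps a request handler so the process exits once the
// first response has been written. `exit` is injectable so the behaviour can
// be exercised without killing the process; it defaults to process.exit.
function createOneShotServer(handler, exit = process.exit) {
  let busy = false;
  return http.createServer((req, res) => {
    if (busy) {
      // Defensive only: a hard_limit of 1 is meant to keep a second request
      // away from this VM, but refuse it here in case one slips through.
      res.writeHead(503, { Connection: "close" });
      res.end("busy");
      return;
    }
    busy = true;
    // "finish" fires once the response has been handed off to the OS.
    res.on("finish", () => exit(0));
    res.setHeader("Connection", "close");
    handler(req, res);
  });
}

// Usage (8080 is an assumed port; use whatever your service listens on):
// createOneShotServer((req, res) => res.end("done\n")).listen(8080);
```

Exiting on `finish` rather than inside the handler gives the response a chance to be written out before the process goes away.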
OK, that worked. Thanks. It seems to take quite a while for the app to come back up after a process exit, so while the first request completes in 100 ms, a second one, fired at the same time as the first, takes ~14 sec to complete. Watching the logs, it looks like the VM is up and running but waits quite a while for the health checks. I lowered the numbers in services.tcp_checks but it didn’t get much better.
I saw mention of max-per-region in the TOML doc so I added the following. It seemed to help with two concurrent requests the first time, but thereafter the cycle time was about 20 sec.