Hello! As you can see in the image, I never reach the 4GB limit, and yet my app fails, telling me the machine ran out of resources. For context, I’m trying to deploy a Snowstorm Lite instance to serve HTTP requests against a clinical dataset.
I should even be okay with a 2GB limit, as it never uses that much locally, but for some reason the logs say:
2024-07-03T22:41:25.496 proxy[3d8d170b6dee89] scl [info] App snowstorm-local has excess capacity, auto stopping machine 3d8d170b6dee89. 1 out of 2 machines left running (region=scl, process group=app)
I would appreciate some guidance on how to debug or solve this issue.
That’s not from your machine running out of memory. It’s because you have excess capacity (more resources than you need), so Fly automatically shuts one machine off. If you want 2 machines up, set min_machines_running = 2.
You should reduce it to 2GB though.
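As a rough sketch of what that could look like in fly.toml (the internal_port value is an assumption, since the actual config isn't shown here; note also that min_machines_running only applies to machines in the app's primary region):

```toml
# Sketch only: internal_port is assumed, adjust it to the port Snowstorm listens on.
[http_service]
  internal_port = 8080          # assumption, not taken from the original config
  auto_stop_machines = true     # Fly may stop machines it considers excess capacity
  auto_start_machines = true    # stopped machines are started again on incoming requests
  min_machines_running = 2      # keep 2 machines running even when idle

[[vm]]
  memory = "2gb"                # reduced from 4GB, as suggested above
```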
Regardless of the memory limit (I have already tried 2GB), BOTH machines shut off because of excess capacity instead of only one of them, so I end up with no working apps.
No, you are right, it doesn’t say it runs out of resources. It just states that the app has excess capacity and auto stops. The problem is that this happens on both machines: even after one of them has stopped, the other keeps going for a while and then stops too, so the app never finishes starting up.
@guillermo-st your fly.toml has min_machines_running = 0 which means that the minimum number of machines that will be kept running when idle is 0. This will cause both machines to be stopped when idle (roughly 5 minutes without any incoming requests).
Is Snowstorm a web server? From what I gather, it’s a RESTful API for FHIR data. When you say it never completely runs, do you mean that one of the API calls is a long-running task that doesn’t complete before Fly stops the machine due to lack of HTTP activity?
When first run, Snowstorm needs to build an index of said FHIR data. It pulls that data from a repository and then builds the index. From that point on, it accepts HTTP requests and replies with the matching clinical data.
The problem is, it’s dying before it finishes building the index.
You need to mount a volume for that data, otherwise you’ll have to rebuild it every time the machine shuts down and boots up again. Does Snowstorm have a health API? Set up a health check on that endpoint so that Fly can properly mark your app as ready once it’s actually ready.
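A minimal sketch of both pieces in fly.toml, under some assumptions: the volume name, the destination path, the port, the health path and the grace period are all placeholders, not values taken from Snowstorm's documentation or from your config. The volume has to exist first (for example via `fly volumes create snowstorm_data --region scl`), and each machine needs its own volume.

```toml
# Sketch only: every name, path and duration below is a placeholder.
[mounts]
  source = "snowstorm_data"     # hypothetical volume name
  destination = "/data"         # hypothetical path where Snowstorm keeps its index

[http_service]
  internal_port = 8080          # assumption, adjust to the port Snowstorm listens on
  auto_stop_machines = true
  auto_start_machines = true
  min_machines_running = 1      # keep at least one machine up when idle

[[http_service.checks]]
  method = "GET"
  path = "/health"              # placeholder, use whatever health endpoint Snowstorm exposes
  interval = "30s"
  timeout = "5s"
  grace_period = "5m"           # placeholder, allow time before checks start after boot
```

With the index stored on the volume, a restarted machine should be able to reuse it instead of rebuilding from scratch, provided Snowstorm is configured to write its index under the mounted path.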