Minimum idle time before scaling down/killing an instance

Hello, I’m just trying to understand why this machine was scaled down.

The logs show that this instance finished serving a request and, about 7 seconds later, was scaled down (and killed).

How do I control the minimum idle time before a machine is killed by a scale-down?

2024-01-05T12:09:33.156 app[1781964f067089] yyz [info] Created data in index: 72

2024-01-05T12:09:33.164 app[1781964f067089] yyz [info] --> POST /collections/c_kqLkFNqGisjKx5ld/vectors/create/text 200 8s

2024-01-05T12:09:39.465 app[1781964f067089] yyz [info] <-- GET /health

2024-01-05T12:09:39.466 app[1781964f067089] yyz [info] --> GET /health 200 1ms

2024-01-05T12:09:40.185 proxy[1781964f067089] yyz [info] Downscaling app XXX from 3 machines to 2 machines, stopping machine 1781964f067089 (region=yyz, process group=app)

Hi @rajatkum,

The auto start and stop process runs every few minutes. Right now there is no way to set a minimum or maximum idle time, or to control which Machine in a group of Machines (in the same region) gets stopped. When deciding whether to stop a Machine, the proxy only considers the soft_limit setting and current traffic.

You could have your app shut itself down when idle, in which case you’d be able to control the idle time before exiting. Or you could keep at least one Machine running in the primary region (set min_machines_running=1) to avoid any “cold starts” after the app has a period of zero requests.
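For the first option, here is a minimal sketch of what "shut itself down when idle" could look like inside the app. It is not a Fly.io API: the `IdleWatchdog` class, its method names, and the `on_idle` callback are all hypothetical, and it assumes you can hook the start and end of each request in your web framework. It tracks in-flight requests so a long request (like the 8s POST in the logs above) is never counted as idle time, and it assumes you would not report health checks such as `GET /health` as activity, otherwise the app may never look idle.

```python
import threading
import time


class IdleWatchdog:
    """Hypothetical helper: calls on_idle once no request has been
    active for at least idle_timeout seconds."""

    def __init__(self, idle_timeout, on_idle, clock=time.monotonic):
        self.idle_timeout = idle_timeout  # seconds of inactivity to wait for
        self.on_idle = on_idle            # callback that shuts the app down
        self.clock = clock                # injectable so the logic can be tested
        self._lock = threading.Lock()
        self._in_flight = 0
        self._last_activity = clock()

    def request_started(self):
        # Call for real traffic only (assumption: skip health checks).
        with self._lock:
            self._in_flight += 1
            self._last_activity = self.clock()

    def request_finished(self):
        with self._lock:
            self._in_flight -= 1
            self._last_activity = self.clock()

    def is_idle(self):
        with self._lock:
            if self._in_flight > 0:
                return False  # never idle while a request is being served
            return self.clock() - self._last_activity >= self.idle_timeout

    def run(self, poll_interval=1.0):
        # Blocks until the idle timeout has elapsed, then fires the callback.
        while not self.is_idle():
            time.sleep(poll_interval)
        self.on_idle()

    def start(self, poll_interval=1.0):
        # Runs the watchdog in a background thread next to the web server.
        thread = threading.Thread(
            target=self.run, args=(poll_interval,), daemon=True
        )
        thread.start()
        return thread
```

In a real app, `on_idle` would trigger your server's graceful shutdown so the process exits cleanly, and `idle_timeout` becomes the minimum idle time you control. Whether the Machine is started again on the next request depends on your auto start settings and restart policy, so check those before relying on this.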
