Instance is downscaled while worker is running

jivanovic · August 21, 2023, 4:47pm

Hello Fly Community! I am using fly.io to deploy a Node.js application. At the moment I am running a performance-2x machine with the autoscaling option (scale up/down) enabled. In my Node.js app I use bullmq with a Redis connection to handle job orchestration. One of the jobs I have implemented downloads a video file from S3 and transcodes it with the help of fluent-ffmpeg.

When an instance with a job in progress receives no traffic for 2-5 minutes, it gets downscaled. I understand why that happens and how fly.io handles things, but I am wondering whether there is a way to keep the instance from being downscaled while a job is still running, with the autoscale option still on. I tried to set a higher kill_timeout in fly.toml; however, I can only set it to a maximum of 5 minutes.

Would be happy to hear your recommendations!

kd1 · August 21, 2023, 5:30pm

Hi @jivanovic, there’s a couple options:

1. [preferred] Make it so that your app doesn’t need to be up at the end of each job.
2. Add a request to the app that wakes it at the conclusion of the job, and block until the app is up again.
3. Send a request every 5 minutes from the worker as it is processing the job. This would require you to make an endpoint in your app that keeps the process running every time the endpoint is hit, like a heartbeat.
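
As a rough sketch of the endpoint side of the heartbeat option, something like the following could work. It assumes that any request reaching the instance counts as traffic for the autoscaler; the `/heartbeat` path and the names `heartbeatHandler` and `createHeartbeatServer` are made up for illustration and are not part of any Fly.io API.

```javascript
const http = require('node:http');

// Hypothetical handler: answers GET /heartbeat with 200 and everything else with 404.
// The assumption is that the incoming request itself is what keeps the instance busy.
function heartbeatHandler(req, res) {
  if (req.method === 'GET' && req.url === '/heartbeat') {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end('ok');
    return;
  }
  res.writeHead(404, { 'Content-Type': 'text/plain' });
  res.end('not found');
}

// Builds the server without starting it, so the caller decides when to listen.
function createHeartbeatServer() {
  return http.createServer(heartbeatHandler);
}

// Usage (port is an assumption):
// createHeartbeatServer().listen(process.env.PORT || 3000);
```

In an existing Express or Fastify app the same idea would just be one extra route rather than a separate server.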

For option 1, can you save the output of the video transcoding to S3 / do any of the post-processing within the job itself instead of in the main app? Can you track the status of jobs in a SQLite table instead of on the main app?

jivanovic · August 21, 2023, 6:14pm

Hey @kd1! Thank you for the response, I really appreciate it.

I agree that Option 1 is the preferred one, and refactoring this part of the app logic is on my roadmap. However, at the moment I am looking for a simpler solution, so I will go with Option 3: a heartbeat request with the fly-force-instance-id key in the header, targeting the specific app instance that is running the job, to keep it alive.

I read about how fly.io checks and decides when to downscale or upscale instances, but I never saw a specific interval at which these checks are made. Is it 5 minutes, or is it something we cannot really predict?
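
A minimal sketch of what that heartbeat sender might look like from inside the job, assuming Node 18+ (for the global `fetch`) and a `/heartbeat` route on the app. Only the `fly-force-instance-id` header comes from this thread; the function names, the route, and the way the instance ID is obtained are assumptions for illustration.

```javascript
// Builds the URL and request options for one heartbeat call.
// The fly-force-instance-id header asks the proxy to route to one specific instance.
function buildHeartbeatRequest(baseUrl, instanceId) {
  return {
    url: new URL('/heartbeat', baseUrl).toString(),
    options: {
      method: 'GET',
      headers: { 'fly-force-instance-id': instanceId },
    },
  };
}

// Sends a heartbeat every intervalMs and returns a function that stops it.
// fetchImpl is injectable so the sender can be exercised without a network.
function startHeartbeat({ baseUrl, instanceId, intervalMs, fetchImpl = fetch }) {
  const { url, options } = buildHeartbeatRequest(baseUrl, instanceId);
  const timer = setInterval(() => {
    fetchImpl(url, options).catch((err) => {
      console.error('heartbeat failed:', err.message);
    });
  }, intervalMs);
  return () => clearInterval(timer);
}

// Usage inside a job processor (env var names are assumptions about the runtime):
// const stop = startHeartbeat({
//   baseUrl: `https://${process.env.FLY_APP_NAME}.fly.dev`,
//   instanceId: process.env.FLY_MACHINE_ID,
//   intervalMs: 2 * 60 * 1000,
// });
// try { await transcode(job); } finally { stop(); }
```

Stopping the timer in a `finally` block matters here: otherwise a failed job would keep pinging the instance and it would never scale down.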

kd1 · August 21, 2023, 7:00pm

I got 5 minutes from your kill timeout. You’d want the interval between heartbeats to be shorter than the timeout so that the timeout is never reached.
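
To make the "shorter than the timeout" rule concrete, here is a small sketch that derives a heartbeat interval from a timeout with some safety margin. The helper name and the default factor of 0.5 are arbitrary choices for illustration, and the 5-minute figure is the one from this thread rather than a documented idle window.

```javascript
// Hypothetical helper: returns an interval strictly below the given timeout.
// safetyFactor must be between 0 and 1 (exclusive); smaller means more frequent heartbeats.
function heartbeatIntervalMs(timeoutMs, safetyFactor = 0.5) {
  if (!(timeoutMs > 0)) {
    throw new RangeError('timeoutMs must be positive');
  }
  if (!(safetyFactor > 0 && safetyFactor < 1)) {
    throw new RangeError('safetyFactor must be between 0 and 1');
  }
  return Math.floor(timeoutMs * safetyFactor);
}

const FIVE_MINUTES_MS = 5 * 60 * 1000;
console.log(heartbeatIntervalMs(FIVE_MINUTES_MS)); // 150000, i.e. every 2.5 minutes
```

Halving the timeout leaves room for one missed or slow heartbeat before the timeout would be reached.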

jivanovic · August 22, 2023, 6:40am

Yeah, of course. Thank you for your responses, @kd1. Have a great day!

system · August 29, 2023, 6:40am

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.
