How to deploy a new version without interrupting background jobs?

Hi Fly Community,

My Fly.io service uses an attached volume and processes 1-hour background jobs. Here’s the flow:

  1. Tasks originate from another service and are sent to Google Cloud Tasks (for reliable queueing/retries).
  2. Cloud Tasks pushes tasks to an HTTP endpoint on my Fly.io service. The service then adds these to an in-memory queue.
  3. After processing all tasks from its in-memory queue, the instance self-terminates (exit(0)).
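For concreteness, the flow above looks roughly like the sketch below. It is not my actual code: `enqueue`, `process`, `drain`, and `main` are illustrative names, the HTTP handler that Cloud Tasks calls is omitted, and `process` stands in for the hour-long job.

```python
import queue

# In-memory queue filled by the HTTP endpoint that Cloud Tasks pushes to.
tasks: "queue.Queue[str]" = queue.Queue()
processed: list[str] = []


def enqueue(task_id: str) -> None:
    """Called by the (omitted) HTTP handler for each pushed task."""
    tasks.put(task_id)


def process(task_id: str) -> None:
    """Placeholder for the real hour-long job."""
    processed.append(task_id)


def drain(idle_timeout: float = 0.1) -> int:
    """Process tasks until the queue stays empty for idle_timeout seconds.

    Returns the number of tasks processed.
    """
    count = 0
    while True:
        try:
            task_id = tasks.get(timeout=idle_timeout)
        except queue.Empty:
            return count
        process(task_id)
        tasks.task_done()
        count += 1


def main() -> int:
    """Drain the queue, then hand back exit code 0 for the caller to exit with."""
    drain()
    return 0
```

In the real service the entry point would end with `sys.exit(main())`, which is the exit(0) from step 3.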

How can I deploy new versions without interrupting an active 1-hour job on an old instance? This is crucial as the job duration significantly exceeds Fly.io’s typical kill_timeout.

Specifically:

  • How can an old instance be allowed to finish its current hour-long job (and the rest of its in-memory queue) before being terminated by a deployment?
  • Can its shutdown be deferred until all its current work is complete?
  • What are Fly.io best practices for this use case?
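To make "deferring shutdown" concrete, this is the kind of handling I can do inside the worker. It is a minimal sketch with made-up names (`shutting_down`, `accept_task`), and it assumes the instance receives SIGINT or SIGTERM before being killed. It only stops new work from being accepted; it does not extend kill_timeout, which is exactly my problem.

```python
import signal
import threading

# Set once a stop signal arrives; the worker keeps finishing queued work.
shutting_down = threading.Event()


def handle_stop_signal(signum, frame) -> None:
    """Signal handler: remember that shutdown was requested, do not exit."""
    shutting_down.set()


def install_handlers() -> None:
    """Register the handler; must be called from the main thread."""
    signal.signal(signal.SIGINT, handle_stop_signal)
    signal.signal(signal.SIGTERM, handle_stop_signal)


def accept_task(task_id: str, pending: list) -> int:
    """Return the HTTP status the push endpoint should answer with.

    While shutting down, answer with a non-2xx status so the queueing
    service treats the push as failed and retries it later.
    """
    if shutting_down.is_set():
        return 503
    pending.append(task_id)
    return 200
```

With this in place, a task pushed during shutdown is rejected and should be redelivered to a newer instance, but the job already running still has to finish before the platform’s hard kill.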

Appreciate any guidance on smoothly deploying these long-running task workers.

Thanks!

If the machine terminates when its jobs are finished, doesn’t that solve the problem? You can just push the new image to the Fly registry, and when a new machine is required, it will be upgraded automatically.

I don’t know if you can do a broad deploy and not have it kill the existing instances. You could specifically target each machine by id and do some orchestration but that’s a tedious manual process.
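A rough sketch of what that orchestration could look like, in case it helps. This is untested against a real app and rests on assumptions: that `fly machine list --json` returns objects with `id` and `state` fields, that `fly machine update` accepts `--image`, `--skip-start`, and `--yes`, and that a machine in the `stopped` state has already finished its queue. Check `fly machine update --help` before relying on any of it. The function names are my own.

```python
import json
import subprocess


def machines_safe_to_update(machines: list) -> list:
    """Pick machine ids that are stopped, i.e. have exited after draining."""
    return [m["id"] for m in machines if m.get("state") == "stopped"]


def update_command(app: str, machine_id: str, image: str) -> list:
    """Build the flyctl command that points one machine at a new image.

    Flags are assumptions to verify with `fly machine update --help`.
    """
    return [
        "fly", "machine", "update", machine_id,
        "--app", app,
        "--image", image,
        "--skip-start",  # leave the machine stopped until work arrives
        "--yes",         # skip the interactive confirmation
    ]


def rollout(app: str, image: str, dry_run: bool = True) -> list:
    """Update every stopped machine; running machines are left alone."""
    listing = subprocess.run(
        ["fly", "machine", "list", "--app", app, "--json"],
        check=True, capture_output=True, text=True,
    ).stdout
    commands = [
        update_command(app, machine_id, image)
        for machine_id in machines_safe_to_update(json.loads(listing))
    ]
    if not dry_run:
        for command in commands:
            subprocess.run(command, check=True)
    return commands
```

Running `rollout` repeatedly (say, from a cron job) would pick up each machine after it exits, which removes most of the manual tedium but still leaves a race if a machine is started between the listing and the update.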

I’m not familiar with Google Cloud Tasks, but from what you’ve described, it doesn’t sound like it’s designed to handle idempotent tasks well, since it pushes the work into an in-memory queue in your worker.

DISCLAIMER: I’m not affiliated with Temporal.io, but that tech has been great for use cases like this, where each “subtask” is broken up and its state is stored in a DB, so when a worker instance dies for whatever reason, it can resume where it left off.
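To illustrate the idea (this is not Temporal’s API, just a hand-rolled sketch using SQLite): record each finished subtask in a durable store, and on restart skip whatever is already recorded. `open_store` and `run_task` are made-up names, and the database path is an assumption; in your case it could live on the attached volume.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the checkpoint store; use a file on a volume for durability."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS done_steps ("
        "task_id TEXT, step INTEGER, PRIMARY KEY (task_id, step))"
    )
    return db


def run_task(db: sqlite3.Connection, task_id: str, steps: list) -> list:
    """Run the steps of a task in order, skipping those already recorded.

    Returns the indexes of the steps executed during this call. A crash
    between running a step and committing its record means that step runs
    again on resume, so each step should be idempotent.
    """
    executed = []
    for index, step in enumerate(steps):
        already_done = db.execute(
            "SELECT 1 FROM done_steps WHERE task_id = ? AND step = ?",
            (task_id, index),
        ).fetchone()
        if already_done:
            continue
        step()
        db.execute("INSERT INTO done_steps VALUES (?, ?)", (task_id, index))
        db.commit()
        executed.append(index)
    return executed
```

With the job split into steps of a few seconds or minutes each, a deploy that kills the worker only costs you the step that was in flight, and a retried delivery of the same task picks up from the last recorded step.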

True, a deploy would not work here. But a local image build and a push would likely work; I assume this would help:

(I don’t mean the API bit, but the registry auth stuff. Maybe there is an image push in the flyctl CLI too, I don’t know.)
