Keep a non-service (worker) machine always running

How can I configure a non-service process to always be running?

I have a Rails app with two process groups, app and overmind. There are 2 app machines and 1 overmind machine.

When running fly deploy, the overmind machine stops and does not restart. How do I configure the overmind machine to always be started?

Things I have tried:

  • fly machine start [machine_id] - starts the machine, but the next fly deploy stops it
  • fly machine update [machine_id] --restart always - does not have any effect

Here is the fly.toml:

app = "rhr-web-staging"
primary_region = "bos"
console_command = "/rails/bin/rails console"

[build]

[deploy]
release_command = "./bin/release"

[http_service]
internal_port = 3000
force_https = true
auto_stop_machines = false
auto_start_machines = true
min_machines_running = 0
processes = ["app"]

[[http_service.checks]]
grace_period = "10s"
interval = "30s"
method = "GET"
timeout = "5s"
path = "/health"

[[vm]]
cpu_kind = "shared"
cpus = 1
memory_mb = 2048

[[statics]]
guest_path = "/rails/public"
url_prefix = "/"

[processes]
app = "bin/rails server"
overmind = "overmind start -f /rails/Procfile"

And the event logs from fly m status that show it being stopped:

Event Logs
STATE   EVENT   SOURCE  TIMESTAMP                       INFO 
stopped update  flyd    2024-01-19T11:25:39.762-05:00
created launch  user    2024-01-19T11:25:34.9-05:00  

Add a services block for overmind, something like this:

[[services]]
processes = ["overmind"]
internal_port = ...
protocol = ...
auto_stop_machines = true
auto_start_machines = true
min_machines_running = 1
[[services.ports]]
handlers = ["tls", "http"] # or whatever it is for you
port = ...

Note that auto_stop_machines, auto_start_machines, and min_machines_running together let Fly start and stop machines automatically while keeping at least 1 machine running.

You can create a services block for each process you have, so you could also move the app config over to its own block too:

[[services]]
processes = ["app"]
internal_port = 3000
protocol = "tcp"
auto_stop_machines = true
auto_start_machines = true
min_machines_running = 1
[[services.ports]]
handlers = ["tls", "http"] # or whatever it is for you
port = 3000

I don’t think that solution makes sense. overmind is a background worker that manages processes using a Procfile, like in this example.

It’s my understanding that creating a [[services]] entry for it would expose it to the outside world, which is not desired. It doesn’t even have any internal ports AFAIK.

Hi @schrockwell,

I think your worker machine might be configured as a standby machine.

There’s a good explanation of why and how this happened, and how to fix it, here.
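
For readers who can't follow the link, here is a minimal sketch of how the standby setting might be inspected and cleared with flyctl. It assumes that `fly machine update` accepts a `--standby-for` flag and that passing it an empty value removes the standby relationship; that is my understanding, so please confirm with `fly machine update --help` before relying on it. The helper names `show_machine_config` and `clear_standby` are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical wrappers around flyctl.

# Print the machine's config as JSON so you can look for a standby entry.
# Assumes `fly machine status` supports --display-config.
show_machine_config() {
  fly machine status "$1" --display-config
}

# Clear the standby relationship so the machine behaves like a normal one.
# Assumption: an empty --standby-for value removes the standby setting.
# --yes skips the interactive confirmation prompt.
clear_standby() {
  fly machine update "$1" --standby-for="" --yes
}
```

Usage would be something like `clear_standby [machine_id]`, followed by `fly machine start [machine_id]` if the machine is still stopped.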

  • Daniel

Hi @roadmr, that was it! Thank you!

When I did the initial deploy, it created 2 instances, and when I scaled it down to 1, I guess it decided to scale down the primary instead of the standby? Seems like an odd choice.

Anyway, thanks again.

I’m just speculating about the logic, but it doesn’t seem so odd when you think about it this way:

  1. Create two machines, in order. First one is primary, second is standby.
  2. Get asked to destroy one machine. The first one (oldest) is chosen. It happened to be the primary…

But it wasn’t chosen because it was the primary; rather, it was the primary because it was the first machine created (and it’s sadly also the first one to be scaled down).

Anyway, glad it worked for you!

  • Daniel
