Configuring fly.toml to auto-start/stop celery workers

I have a Django app that performs some image processing using worker machines that are running Celery. The image processing happens rarely enough that I want the worker machines to stop automatically along with the main machine. I have this sort of working, but I’m getting error messages from the proxy saying that my app isn’t listening on the correct address, and the worker machines start and stop along with the app machine regardless of whether there are any worker tasks.

Here is part of my fly.toml that is leading to this behavior:

[processes]
  app = "python -m gunicorn --bind :8000 --workers 1 --timeout 300 my_project.wsgi"
  worker = "python -m celery -A my_project worker -l info"

[env]
  PORT = '8000'

[http_service]
  internal_port = 8000
  force_https = true
  auto_stop_machines = true
  auto_start_machines = true
  min_machines_running = 0
  processes = ['app', 'worker']

I got the worker to start and stop by adding 'worker' to the processes above. My app seems to function correctly if I set --workers 1 in my gunicorn command; if I set --workers 2 and there are multiple tasks, some of them get run and some seem to go into the void.

I am getting the following log messages that indicate that something is not hooked up right, regardless of the number of workers I have:

2024-07-19T17:29:05Z proxy[48ed64eb717578] sjc [error][PC01] instance refused connection. is your app listening on 0.0.0.0:8000? make sure it is not only listening on 127.0.0.1 (hint: look at your startup logs, servers often print the address they are listening on)

Things I have tried so far:

  • I have tried adding --bind 0.0.0.0:8000 to the gunicorn command above.
  • I have tried adding a separate [[services]] section to my fly.toml that’s specific to processes = ['worker'], but I don’t know if I’m doing it correctly.
  • I’ve tried using --bind 0.0.0.0:8080 (different port number) and using 8080 in the [[services]] section, but this doesn’t seem to help.

This topic indicates that stopping celery workers can’t be done automatically (at least not easily, and not through fly.toml), but I am disinclined to believe it because the configuration I have auto starts/stops the worker along with the app. (However, the worker machine starts when the app starts, even if there are no tasks.)

In short, my goal is to run these infrequent but process/memory intensive tasks on a separate machine, without having to pay for that machine to be running at all times. I’d appreciate any help specifically tuning my [[services]] section, or general advice if I should be taking a different approach.

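fly.toml doesn’t have any config to auto start/stop Celery workers. The auto_start_machines and auto_stop_machines settings are handled by the Fly proxy, so they can only control things the proxy interacts with, which means network services. Listing 'worker' under [http_service] makes the proxy treat the worker machines as HTTP backends on port 8000, which is likely why you see the PC01 "instance refused connection" error and why the workers start and stop with app traffic rather than with queued tasks.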

For worker-type workloads, you’ll want the autoscaler; see "Autoscale based on metrics" in the Fly Docs.

This is slightly harder to use because it requires you to export a metric from your app that the autoscaler can work off of, but it’s really nice once you get it set up.
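As a rough illustration of the "export a metric" step, here is a minimal sketch of a /metrics endpoint that reports how many Celery tasks are waiting. It is not taken from the Fly docs. It assumes a Redis broker (where Celery's default queue is a Redis list named "celery") and a REDIS_URL environment variable. The metric name celery_queue_depth and the helpers render_metric, make_handler, redis_queue_depth, and serve are all hypothetical names chosen for this example.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable

METRIC_NAME = "celery_queue_depth"  # hypothetical metric name


def render_metric(name: str, value: int, queue: str = "celery") -> str:
    """Render one gauge sample in the Prometheus text exposition format."""
    return (
        f"# HELP {name} Number of tasks waiting in the broker queue.\n"
        f"# TYPE {name} gauge\n"
        f'{name}{{queue="{queue}"}} {value}\n'
    )


def make_handler(get_depth: Callable[[], int]):
    """Build a request handler that serves the current queue depth."""

    class MetricsHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/metrics":
                self.send_error(404)
                return
            body = render_metric(METRIC_NAME, get_depth()).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; version=0.0.4")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return MetricsHandler


def redis_queue_depth() -> int:
    """Assumption: Redis broker, default Celery queue stored as list 'celery'."""
    import redis  # third-party package, only needed when this is called

    client = redis.Redis.from_url(os.environ["REDIS_URL"])
    return client.llen("celery")


def serve(port: int, get_depth: Callable[[], int] = redis_queue_depth) -> None:
    """Serve /metrics on all interfaces until interrupted."""
    HTTPServer(("0.0.0.0", port), make_handler(get_depth)).serve_forever()
```

Calling serve(9091) from a small side process would expose the gauge on port 9091 (the port is an arbitrary choice here). How the metric then reaches the autoscaler, for example through a Prometheus scrape, depends on your setup and is not shown.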

Here’s what you’ll need:
