I’m working on a Flask + Celery application and am trying to troubleshoot problems with the Celery worker. Everything checks out initially when I launch the app with the following fly.toml:
app = "myapp"
primary_region = "sjc"
[processes]
web = "gunicorn -b 0.0.0.0:8080 --worker-class eventlet -w 6 manage:app"
worker = "celery --app tasks.async_celery_tasks.celery worker -c 4 --loglevel=info"
[http_service]
processes = ["web" ]
internal_port = 8080
force_https = true
auto_stop_machines = true
auto_start_machines = true
[checks]
[checks.alive]
type = "tcp"
interval = "15s"
timeout = "2s"
grace_period = "5s"
processes = ["web"]
After some time, Fly automatically stops the worker machine. Then, when new requests come in, the worker isn’t started again automatically and my tasks never get kicked off.
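For reference, the worker command above points at a Celery app roughly like this (a minimal sketch; the module path matches the fly.toml, but the Redis broker URL and the example task name are just placeholders for my real setup):

```python
# tasks/async_celery_tasks.py -- sketch of the Celery app the worker command loads
import os

from celery import Celery

# Broker/backend URLs are assumptions; mine come from env vars set on the Fly machines
celery = Celery(
    "tasks",
    broker=os.environ.get("CELERY_BROKER_URL", "redis://localhost:6379/0"),
    backend=os.environ.get("CELERY_RESULT_BACKEND", "redis://localhost:6379/1"),
)

@celery.task
def send_report(user_id):
    # placeholder for the real work
    return f"report generated for {user_id}"
```

The Flask side (the `app` from manage.py) enqueues with `.delay()`, which is where things silently stall once the worker machine has been stopped:

```python
# somewhere in a Flask view (sketch)
from tasks.async_celery_tasks import send_report

@app.route("/reports/<int:user_id>", methods=["POST"])
def create_report(user_id):
    send_report.delay(user_id)  # lands on the broker, but no worker is running to pick it up
    return {"status": "queued"}, 202
```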
Two questions:
- What is the best way to start / stop Celery workers running as their own process (similar to the way Fly manages this for a web server)? I’ve seen the option of automatically starting Python workers independently of Celery, but it’s unclear to me how you would do that at scale for multiple workers while also balancing compute. I’d also like to keep using Celery rather than roll my own task worker.
- Is there any way to set up health checks for Celery workers as well? Or do I need to bake the Celery worker with its own web server to satisfy the health checks? That feels a bit redundant, but I guess it might let me scale the workers independently of the main web server? A rough sketch of what I mean is below.
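To make the second question concrete, this is the kind of thing I had in mind by “baking in a web server”: a tiny HTTP endpoint running alongside the worker that answers a Fly TCP/HTTP check by pinging the worker over the broker. The port, timeout, and file name are just examples, and I’m not sure this is the intended pattern, hence the question:

```python
# worker_health.py -- sketch of a small health endpoint running next to the Celery worker
from http.server import BaseHTTPRequestHandler, HTTPServer

from tasks.async_celery_tasks import celery

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # ping() broadcasts to workers over the broker and collects their replies
        replies = celery.control.ping(timeout=1.0)
        status = 200 if replies else 503
        self.send_response(status)
        self.end_headers()
        self.wfile.write(b"ok" if replies else b"no worker replied")

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8081), HealthHandler).serve_forever()
```

A check in the worker’s process group could then point at port 8081, but that means running an extra server just to keep the worker machine healthy, which is what feels redundant to me.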