Can Celery and Django sit on the same machine?

Hi
I have a working web application built on Django. It uses:

  • app for DB
  • app that runs 3 machines
    ** Django
    ** Celery
    ** beat for Celery

The application was created / coded by someone else for me, and I always want to understand how IT works behind…

My question is whether I can use only 2 machines, or even 1, instead of 3. And why?

Thank you
Radek

Would anyone know … ?

Hi!

You can definitely run all services on a single machine. You’d have to “hack” things a bit by creating a small shell script that starts your services in daemon mode or in the background, like so:

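#!/bin/bash
# Start each service in the background (daemon mode).
# gunicorn expects the WSGI module; celery needs the app passed with -A.
gunicorn the_django_app.wsgi -D [other parameters]
celery -A the_django_app worker -D [other parameters]
celery -A the_django_app beat --detach [other parameters]
# Keep the container's main process alive.
sleep infinity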

And then use this script in your Dockerfile’s CMD instead of calling e.g. gunicorn directly.
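
One caveat with the daemonized approach: since `sleep infinity` is the main process, the container keeps running even if gunicorn or a worker has died. A possible alternative, only a sketch and not part of the original setup, is to start each service as a background job and stop everything as soon as one of them exits, so the platform can restart the container. The `run_all` helper is a made-up name, and the module name and options in the usage comment are placeholders.

```shell
#!/bin/bash
# Sketch: start several commands as background jobs and stop all of them
# as soon as any one exits. Needs bash 4.3 or newer for `wait -n`.

run_all() {
  local pids=() cmd status
  for cmd in "$@"; do
    bash -c "$cmd" &        # each argument is one full command line
    pids+=("$!")
  done

  wait -n                   # returns when the first job exits
  status=$?

  kill "${pids[@]}" 2>/dev/null || true   # stop whatever is still running
  wait                                    # reap the remaining jobs
  return "$status"
}

# Hypothetical usage (placeholders, services run in the foreground here,
# so no -D / --detach flags):
# run_all \
#   "gunicorn the_django_app.wsgi" \
#   "celery -A the_django_app worker" \
#   "celery -A the_django_app beat"
```

The script then exits with the status of whichever service stopped first, instead of sleeping forever.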

That said, it’s definitely recommended, and best practice, to run them on separate machines. There are several reasons:

  1. Dedicated resources. If your workers are overworked (heh) they will not interfere with your web service, and vice versa.
  2. Asymmetrical vertical scaling (wow, that sounded fancy, but it’s really not). If the web server needs a lot of memory but the workers do not, you can scale each independently; with the single-machine approach you have less flexibility in this respect. The Celery Beat machine can typically get by with few resources, since it’s just queueing up jobs for the workers at intervals.
  3. Horizontal scaling per workload. If your service mainly processes background tasks but doesn’t see a lot of web requests, you can scale the worker process group independently as much as you’d like while keeping only a few web servers. This works the other way too: if you’re mostly serving web requests and have the occasional long-running process, you can scale to many web servers but keep only a few worker nodes.
  4. Single Celery Beat instance. This is actually important, because if you run several beat instances you might end up with duplicated tasks from them. A single beat instance schedules the jobs, and multiple workers take them from the queue for processing.
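
To make the separate-machines setup a bit more concrete, here is a rough sketch of one entrypoint script that runs a different service depending on the machine's role, so the same image can be deployed as web, worker, or beat. `SERVICE_ROLE` and `command_for_role` are names I made up for illustration, and `the_django_app` is a placeholder; your platform may well have its own way of doing this.

```shell
#!/bin/bash
# Sketch: pick the single service this machine should run, based on a role.
# SERVICE_ROLE, command_for_role and the_django_app are placeholders.

command_for_role() {
  case "$1" in
    web)    echo "gunicorn the_django_app.wsgi" ;;
    worker) echo "celery -A the_django_app worker" ;;
    beat)   echo "celery -A the_django_app beat" ;;
    *)      echo "unknown role: $1" >&2; return 1 ;;
  esac
}

# Hypothetical usage at the end of an entrypoint script:
# exec $(command_for_role "${SERVICE_ROLE:-web}")
```

With something like this you would run many machines with the web or worker role, but only ever one with the beat role.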

I hope this rant is useful :slight_smile:

  • Daniel

Hi Daniel

Great, great. Thank you so much for your explanation. Now I know how to do it, and also why not to do it.

Any chance you can help me with these two questions of mine? The first one especially is quite important to me.

Thank you in advance
Radek