How to prevent contention between background jobs in a multi-machine set-up?

I have a theoretical question. I think I have an answer for it, but I’d be interested in getting other opinions.

I have an emerging architecture design that looks like this:

             APP                  APP                   APP
             ----------------     -----------------     ----------------
             |     web      |     |  distributor  |     |   browser    |
internet --> | (2 machines) | --> | (2 machines)  | --> |  (ephemeral  |
             |              |     |               |     |   machines)  |
             ----------------     -----------------     ----------------

So internet traffic comes into the web machines, and since load balancing is round-robin, either one can receive the traffic. For a particular action, a long-running web-crawling operation is required, so a request is sent to a distributor, again round-robin. In both of these apps, the two machines would be in different regions.

The job of the distributor is to create browser machines that crawl for a few minutes and send the data back to any web instance, which writes it to a managed database. The browser instances have no redundancy.

Now, in the distributor app, each machine will have a background job that polls the database, and if a job request is in a certain state, it creates a browser machine. However, since each distributor has its own background job, each one could detect the change independently and create its own browser, when I only want one.

My current thinking is that I need to use the managed database with row locking. However, since I am quite new to system design, I’d be interested to hear how others have solved this sort of problem. In my case, I could have one distributor with some stopped hot spares, but is that a design pattern that can easily be achieved on Fly?
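
A minimal sketch of the row-locking idea, for illustration only. It assumes a hypothetical `job_requests` table with `state` and `claimed_by` columns, and it uses an in-memory SQLite database as a stand-in for the managed database. Each distributor polls for pending jobs, then tries to claim a job with a conditional `UPDATE`; the database applies that statement atomically, so only one distributor sees a row count of 1 and goes on to create a browser.

```python
import sqlite3


def find_pending(conn):
    """Poll step: return the ids of job requests still waiting for a browser."""
    rows = conn.execute(
        "SELECT id FROM job_requests WHERE state = 'pending' ORDER BY id"
    )
    return [row[0] for row in rows]


def try_claim(conn, job_id, worker_id):
    """Claim step: flip the row to 'claimed' only if it is still 'pending'.

    Returns True for the single caller whose UPDATE actually changed the row.
    """
    cur = conn.execute(
        "UPDATE job_requests SET state = 'claimed', claimed_by = ? "
        "WHERE id = ? AND state = 'pending'",
        (worker_id, job_id),
    )
    conn.commit()
    return cur.rowcount == 1


# Hypothetical schema; table, column, and state names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job_requests ("
    "id INTEGER PRIMARY KEY, state TEXT NOT NULL, claimed_by TEXT)"
)
conn.execute("INSERT INTO job_requests (state) VALUES ('pending')")
conn.commit()

# Both distributors poll before either has claimed, so both see the same job.
seen_by_a = find_pending(conn)
seen_by_b = find_pending(conn)

# Each then attempts the claim; only the first conditional UPDATE succeeds.
results = {
    "distributor-a": [try_claim(conn, j, "distributor-a") for j in seen_by_a],
    "distributor-b": [try_claim(conn, j, "distributor-b") for j in seen_by_b],
}
print(results)
```

Only the distributor whose `try_claim` returns `True` would go on to create the browser machine. If the managed database happens to be Postgres, `SELECT ... FOR UPDATE SKIP LOCKED` is another common way to get the same effect, but that is an assumption about the set-up rather than something stated here.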

I solve this sort of problem by not solving it. It’s far too complex a problem for my little :brain: to wrap around. I use Temporal to handle all that mess for me… I just write the application logic, e.g. the web scraper, and let the smarter people handle the complexity.

Interesting, thanks; is that these folks? Paradoxically, I fear that my brain would explode learning a new thing, even if their product simplifies contention problems! :exploding_head:

Yes, there’s a bit of a learning curve, but it’s not so bad. They have lots of great resources/videos to learn from.
Once you’ve got the basics, you can try Temporal Fly.io · GitHub

[Meta] @mayailurus Thanks for wanting to tag my question. I can leave it be if you prefer it, but I don’t think that’s an appropriate tag; it would point to an interesting answer, but it doesn’t really apply to the question, which is rather more general than that tag would imply.

My preferences carry little weight here, but, generally speaking, the tags don’t have anywhere near the precision or implications that you’re suggesting…

They’re just an aid to future readers in finding related discussions (not only initial questions). In my view this is significant these days, as search engines get weaker and weaker.

Having said that, I removed temporal, since the small benefit isn’t worth any contention, however minor and good-natured.

Alright, fair enough; thanks! I do a lot of editing on Stack Overflow, and I wonder if I might be applying their guidelines here inadvertently :crazy_face:
