Does the Fly load balancer also route traffic to stopped machines?

Is the load balancer treating stopped machines the same as started machines?

I’d like it to, but I’m not 100% sure it does. Trying to confirm this to ease the debugging process.

Scenario: I’ve got two machines, A and B.

  • My queue tries to run a job on machine A
  • Machine A is busy running a heavy process and returns a 503
  • Because of the non-2XX status, the queue retries, and I’d like it to retry immediately
  • Ideally it would route the retry request to machine B, even if it’s turned off

Can I be guaranteed the load balancer is selecting a different machine for the retry request, even if all the other machines are turned off?

Can you return a response with a Fly-Replay header instead?
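A minimal sketch of what that could look like, assuming the busy machine answers with a `fly-replay` response header instead of a 503. The names `BusyReply` and `replyWhenBusy` are hypothetical, the 409 status is an arbitrary choice, and the `instance=` and `elsewhere=true` fields should be checked against the current Fly-Replay docs, as should whether a replay wakes a stopped machine.

```typescript
// Sketch only: decide how a busy machine should answer a job request.
// BusyReply and replyWhenBusy are hypothetical names, not part of any Fly SDK.

interface BusyReply {
  status: number;
  headers: Record<string, string>;
}

// When the machine is busy, ask the Fly proxy to replay the request on
// another machine rather than returning a 503 to the queue.
// - With targetMachineId, the replay is pinned to that machine.
// - Without it, elsewhere=true asks the proxy to exclude the current machine
//   (assumed field, verify against the Fly-Replay documentation).
function replyWhenBusy(busy: boolean, targetMachineId?: string): BusyReply | null {
  if (!busy) return null; // not busy: handle the request locally
  const replay = targetMachineId
    ? `instance=${targetMachineId}`
    : "elsewhere=true";
  // 409 is an arbitrary status chosen for this sketch.
  return { status: 409, headers: { "fly-replay": replay } };
}
```

In a request handler, a non-null result would be written out as the response status and headers; a null result means the job runs locally.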


Yup, probably. I just found out about the replay headers. I think I need to handle the load balancing on my own; the normal load balancer was not meant for this kind of thing.

I need to look up the machine IDs beforehand from the API and pick a machine whose state is stopped, since those are idle; if none is available, pick the oldest machine. Then force the request to that machine with fly-force-instance-id: <machine-id>, I think, and let the queue always retry the same machine.
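The selection rule described above could be sketched roughly as follows. It assumes the machine list has already been fetched from the Machines API and that each entry carries `id`, `state`, and `created_at` fields. `MachineInfo`, `pickMachine`, and `forceInstanceHeaders` are hypothetical names, and preferring the oldest among several stopped machines is an added tie-breaker, not something stated in the thread.

```typescript
// Sketch only: choose a target machine from an already-fetched machine list.
// MachineInfo mirrors the fields assumed to come back from the Machines API.

interface MachineInfo {
  id: string;
  state: string;      // e.g. "started" or "stopped"
  created_at: string; // ISO 8601 timestamp
}

// Prefer stopped (idle) machines; if none are stopped, fall back to all
// machines. In either case, return the oldest one by created_at.
function pickMachine(machines: MachineInfo[]): MachineInfo | undefined {
  if (machines.length === 0) return undefined;
  const byAge = (a: MachineInfo, b: MachineInfo): number =>
    Date.parse(a.created_at) - Date.parse(b.created_at);
  const stopped = machines.filter((m) => m.state === "stopped");
  const pool = stopped.length > 0 ? stopped : machines;
  // Copy before sorting so the caller's array is left untouched.
  return [...pool].sort(byAge)[0];
}

// Request headers that pin a request to the chosen machine.
function forceInstanceHeaders(machine: MachineInfo): Record<string, string> {
  return { "fly-force-instance-id": machine.id };
}
```

The queue would then attach the returned headers to the job request and reuse the same machine ID on retries.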

It’s also possible, and recommended, to offload long-running requests to a worker. Here’s an example with Django and Celery, but there are worker/queue components for every framework (Celery for Python, Sidekiq for Ruby, Oban for Elixir/Phoenix, BullMQ for Node, Laravel queues, etc.). That way your web server doesn’t get held up processing the slow job; it hands the job to the worker and is immediately available to serve requests again.

  • Daniel

Yeah, I’ve got it separated; it’s a different app from the application server. It’s running Node, and I’m using tinypool to run two Chrome instances in worker threads.
