Fly Proxy routing most traffic to a single machine with low concurrency CPU-bound requests

I think that would be the best way, but probably not the only one. These huge batch requests sound like an awkward fit for the shared-CPU class overall, to be honest…

Since you know in advance which endpoints are CPU-intensive, you could use @lillian’s idea from someone else’s thread and create two performance-class Machines in a distinct process group. Then Nginx could unconditionally proxy the CPU-intensive requests to those heftier Machines, which would otherwise be sleeping the rest of the day.

(You would want two of them, since one will eventually disappear during a hardware failure.)
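On the Nginx side, that could be as little as one extra location block. A minimal sketch, assuming the CPU-heavy endpoints live under /batch/ and that the performance group gets exposed on port 8081 of the app's Flycast address (both made-up names; see the fly.toml sketch below for where 8081 comes from):

# Hypothetical routing: everything under /batch/ goes to the
# performance-class group, over the app's private Flycast address.
location /batch/ {
    proxy_pass http://app-name.flycast:8081;
    proxy_set_header Host $host;
    proxy_read_timeout 300s;  # batch requests are slow by definition
}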

The process group would need its own [[services]] section, with its own port, distinct from port 80.
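Concretely, the fly.toml half of that might look something like the following sketch; the batch process name, the server command, and port 8081 are all placeholders, but the shape is the standard process-groups one:

[processes]
  app = "/app/bin/server"     # your existing command (placeholder)
  batch = "/app/bin/server"   # same server, heftier Machines

[[services]]
  processes = ["app"]
  internal_port = 8080
  protocol = "tcp"

  [[services.ports]]
    port = 80
    handlers = ["http"]

[[services]]
  processes = ["batch"]
  internal_port = 8080
  protocol = "tcp"
  auto_stop_machines = true    # let them sleep between batches…
  auto_start_machines = true   # …and wake on the next proxied request
  min_machines_running = 0

  [[services.ports]]
    port = 8081                # the port Nginx proxies to
    handlers = ["http"]

[[vm]]
  processes = ["batch"]
  size = "performance-2x"      # whichever performance size fits

The auto-stop/auto-start pair is what gets you the "sleeping the rest of the day" behavior: the Fly Proxy stops idle Machines in the batch group and starts one back up when the next request for port 8081 arrives.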

Yes, as @PeterCxy said last time, that is what the Fly Proxy should be doing on its own. I do see a roughly 50/50 split in my own experiments with two Machines over Flycast, although it's not a strict A-B-A-B-A-B alternation… :proxy_robin:

Here’s what I used for testing:

$ fly ssh console
# curl -is -H 'flyio-debug: doit' 'http://app-name.flycast/metrics?q=[0-19]'

The significance of /metrics is that it's harmless on the server side and has a small response; the ?q=[0-19] part is ignored by the server but causes curl to make twenty consecutive requests (q=0 through q=19), via its little-known "globbing" feature. The Machine that produced each response shows up in the sid field of the flyio-debug response header.
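If you'd rather tally than eyeball twenty header blocks, a pipeline like this works; the sid=… grep pattern is a guess at how that field is serialized, so adjust it to match what the flyio-debug header actually looks like on your app:

# curl -is -H 'flyio-debug: doit' 'http://app-name.flycast/metrics?q=[0-19]' |
    grep -io 'sid=[a-z0-9]*' | sort | uniq -c   # sid pattern is a guess

A healthy split would show two sid values at roughly ten apiece.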

I’d suggest giving that a try yourself, to rule out the possibility that it’s something about your Nginx apparatus confounding things…