As @petercxy covered in his recent post, one of our focus areas right now is fairness through the Fly Proxy. Under this umbrella we have two main goals: scheduling fairness and bandwidth fairness.
Both of these are things we have avoided building until they were required. Given a wide distribution of applications, and large enough edge servers, you can go a long time before fairness becomes a problem. Even today this is rarely a problem, except for when it is.
We have been applying bandwidth fairness through the proxy for a couple of weeks now. It lives in an internal proxy crate we call Airtime, which is best summed up by its desired behavior:
“When an edge server is nearing the limits of what it can process and remain healthy, the apps responsible for the most throughput are limited enough that all other apps don’t see an impact.”
The other way of framing this is that we have a certain amount of “headroom” up for grabs at any given time. Airtime ensures that an app using that headroom has to slow down when our baseline traffic increases.
How it works
Airtime lives in the Fly Proxy, and reacts when the proxy is pushing more throughput than we’d like. This throughput target isn’t a fixed number – it’s one of the knobs we turn to keep things humming.
The behavior is (intentionally) very easy to conceptualize: when the proxy exceeds its throughput threshold, it sets a single per-app ceiling (the maximum any one app can use) and progressively lowers this ceiling until total throughput at the edge is back under the threshold. Within an app, the exact same logic repeats for connections: a per-connection ceiling is found that keeps the app as a whole within bounds. This is all a big feedback loop, so the specific limits oscillate a bit before converging on stable-enough values that keep everyone happy.
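To make the ceiling idea concrete, here is a minimal sketch in Rust. It is not Airtime's code: the names `clamped_total` and `find_ceiling`, the 10% step size, and the use of a static snapshot of per-app throughput are all assumptions made for illustration. The real thing is a feedback loop that runs over time, whereas this collapses it into a single pass.

```rust
/// Total throughput if every entry is clamped to `ceiling`.
/// Units are arbitrary (say, bytes per second). Hypothetical helper.
fn clamped_total(demands: &[u64], ceiling: u64) -> u64 {
    demands.iter().map(|&d| d.min(ceiling)).sum()
}

/// Lower one shared ceiling in 10% steps until the clamped total fits
/// under `threshold`. Returns `None` when nothing needs limiting.
/// The step size is an assumption, not a value from the proxy.
fn find_ceiling(demands: &[u64], threshold: u64) -> Option<u64> {
    let uncapped: u64 = demands.iter().sum();
    if uncapped <= threshold {
        return None; // under the threshold: limiting stays a no-op
    }
    // Start at the heaviest entry and ratchet down from there.
    let mut ceiling = demands.iter().copied().max().unwrap_or(0);
    while ceiling > 0 && clamped_total(demands, ceiling) > threshold {
        ceiling = ceiling * 9 / 10;
    }
    Some(ceiling)
}
```

With apps pushing 100, 20, 10, and 900 units against a threshold of 400, the ceiling settles between 100 and 900, so only the heaviest app is clamped. Calling `find_ceiling` again on that app's connections, with the app ceiling as the threshold, gives the per-connection ceiling in the same way.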
To achieve this, each connection through the proxy now has a GCRA (generic cell rate algorithm) limiter attached. (The Wikipedia page for it is a lot; just think: leaky bucket.) When Airtime is enabled, connections have to start obeying this rate limit, which is still a no-op in the majority of cases. Since we ratchet down maximums until the edge is happy, only apps above that ceiling are affected. Within those apps, since the same logic applies, only the connections pushing the most throughput are slowed down.
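For readers who have not met GCRA before, the sketch below shows the textbook "virtual scheduling" form in Rust. It is an illustration only, not the limiter inside the proxy: the `Gcra` type, its field names, the `set_interval` method, and the use of plain nanosecond integers instead of a real clock are all assumptions.

```rust
/// GCRA in its "virtual scheduling" form. Time is passed in as plain
/// nanoseconds so the sketch stays deterministic; a real limiter would
/// read a monotonic clock. All names here are hypothetical.
struct Gcra {
    /// Theoretical arrival time of the next conforming unit.
    tat_ns: u64,
    /// Emission interval: nanoseconds "charged" per unit (1 / rate).
    interval_ns: u64,
    /// Burst tolerance: how far ahead of schedule a sender may run.
    tolerance_ns: u64,
}

impl Gcra {
    fn new(interval_ns: u64, tolerance_ns: u64) -> Self {
        Gcra { tat_ns: 0, interval_ns, tolerance_ns }
    }

    /// Retune the rate, for example when a ceiling is raised or lowered.
    fn set_interval(&mut self, interval_ns: u64) {
        self.interval_ns = interval_ns;
    }

    /// Returns Ok(()) if `units` may pass at `now_ns`, or Err(wait_ns)
    /// with how long the caller should wait before trying again.
    fn check(&mut self, now_ns: u64, units: u64) -> Result<(), u64> {
        let earliest = self.tat_ns.saturating_sub(self.tolerance_ns);
        if now_ns < earliest {
            return Err(earliest - now_ns); // too far ahead of schedule
        }
        // Conforming: push the schedule forward by the cost of this send.
        self.tat_ns = self.tat_ns.max(now_ns) + units * self.interval_ns;
        Ok(())
    }
}
```

A limiter built this way needs only one timestamp of state per connection, and a connection sending below its rate never sees an `Err`, which matches the "no-op in the majority of cases" behavior described above. Lowering the ceiling corresponds to raising `interval_ns`.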
All of that to say, apps will experience more stable networking when the proxy is nearing its limits. Smaller apps won’t get swamped, and larger apps will get steadier connections during heavy traffic. It’s also worth mentioning: this isn’t a normal mode of operation. Airtime is here to deal with bursts of traffic (like a large download). If a region is hitting this with any frequency, it’s a signal for us to resolve that by scaling up our capacity.