Our fly-replay feature has two new fields to help with cases where a replay cannot succeed. This might happen if, say, the app you’re replaying to is unhealthy or unavailable.
By default, like most requests, a replay will try for a while before throwing a generic 5xx error back to the client. This isn’t always the best course of action, so we now have a couple of new replay fields you can use to tweak this behavior: replay timeout and fallback.
Replay Timeout
The new timeout field sets how long the proxy should try to reach the replay target before giving up. It accepts duration strings like 10s or 800ms:
fly-replay: app=my-worker;timeout=2s
On its own, setting a timeout just makes the replay fail faster.
Replay Fallback
The new fallback field tells our proxy to route the request back to the original Machine that issued the replay, instead of erroring. There are two modes:
- force_self: route back to the exact Machine that issued the replay, otherwise error.
- prefer_self: try the original Machine first, but fall back to any Machine in the original app.
fly-replay: app=my-worker;timeout=5s;fallback=force_self
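If your app assembles this header in code, a small helper can keep the field syntax in one place. The following is a minimal sketch, not an official client: the function name build_replay_header and the FALLBACK_MODES constant are hypothetical, and the timeout string is passed through unvalidated because the full set of accepted duration units is not spelled out here.

```python
from typing import Optional

# The two fallback modes described above.
FALLBACK_MODES = {"force_self", "prefer_self"}


def build_replay_header(app: str,
                        timeout: Optional[str] = None,
                        fallback: Optional[str] = None) -> str:
    """Return a fly-replay header value, e.g. 'app=my-worker;timeout=2s'.

    Hypothetical helper: it only joins the fields shown in this post.
    """
    if fallback is not None and fallback not in FALLBACK_MODES:
        raise ValueError(f"unknown fallback mode: {fallback!r}")

    parts = [f"app={app}"]
    if timeout is not None:
        # Duration strings such as "10s" or "800ms"; not validated here.
        parts.append(f"timeout={timeout}")
    if fallback is not None:
        parts.append(f"fallback={fallback}")
    return ";".join(parts)


print(build_replay_header("my-worker", timeout="5s", fallback="force_self"))
# prints: app=my-worker;timeout=5s;fallback=force_self
```

Rejecting unknown fallback modes up front means a typo fails in your own code rather than being sent to the proxy.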
When a fallback occurs, your app gets the request again with a fly-replay-failed header containing metadata about what went wrong. You can use this to better track the error in your app, and you can also respond with a first-class error to your client, rather than the curt 5xx from our proxy.
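One way to handle the returned request is to check for the header before doing anything else, and answer with your own error instead of attempting another replay. The sketch below is framework-agnostic and makes several assumptions: the function name replay_failure_response, the 503 status, and the JSON body are all illustrative choices, and the header value is treated as an opaque string because its exact format is not described here.

```python
import json
import logging
from typing import Dict, Mapping, Optional, Tuple

log = logging.getLogger("replay")

# (status code, response headers, body)
Response = Tuple[int, Dict[str, str], bytes]


def replay_failure_response(headers: Mapping[str, str]) -> Optional[Response]:
    """Return an error response if this request is a replay fallback.

    Returns None for ordinary requests so the caller can continue as usual.
    Hypothetical helper; the status and body are assumptions, not Fly defaults.
    """
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    failure = lowered.get("fly-replay-failed")
    if failure is None:
        return None

    # Record the proxy's metadata as-is so the failure can be tracked.
    log.warning("replay fell back to origin: %s", failure)

    body = json.dumps({
        "error": "worker_unavailable",
        "message": "The worker could not be reached. Please retry shortly.",
    }).encode("utf-8")
    return 503, {"content-type": "application/json"}, body
```

A request handler would call this first and return the result when it is not None; only requests without the header would go on to issue a replay.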
Extra notes
- Once a replayed request has hit its fallback, it can’t be replayed again.
- “Failed” here means that the Fly Proxy didn’t manage to make any connection with one of your Machines. As soon as a connection to your app is opened, the replay counts as a success, so your app returning an error will not trigger a fallback.
- The timeout is best-effort and acts as a minimum: we will not cancel requests mid-flight, so a replay may take longer than the configured value before it fails.
- These fields can be defined in the replay header, or in the JSON replay format.