fly-replay header does not work with instance ID

I am trying to route a request to a specific instance, as described in the documentation, by returning the fly-replay header with a value of instance=00bb33ff, but it does not work.

I created two apps, cant-replay-instance-host and cant-replay-instance-child, where the first should route requests to the second. But when I make a request to a path that should replay the request to a specific instance (using an instance ID found in the portal), it gives a 503 Service Unavailable, and the logs for the host show: [error] [PR03] could not find a good candidate within 21 attempts at load balancing. last error: [PR01] no known healthy instances found for route tcp/443. (hint: is your app shut down? is there an ongoing deployment with a volume or are you using the 'immediate' strategy? have your app's instances all reached their hard limit?)

Both apps are reachable directly, just not with the replay header. If I use fly-replay from the router app with app= instead of instance=, it does work (i.e. you can route to apps but not instances, despite what the docs say).

Repro

The apps have identical setups, but different HTTP handlers:

cant-replay-instance-host:

fastify.get("/", async function handler() {
	return "ok";
});

fastify.post("/", async function handler(request, reply) {
	const { instanceId } = request.body;
	console.log(instanceId);
	reply
		.code(204)
		.header("fly-replay", "instance=" + instanceId)
		.send("");
});

fastify.post("/app", async function handler(request, reply) {
	const { instanceId } = request.body;
	console.log(instanceId);
	reply.code(204).header("fly-replay", "app=cant-replay-instance-child").send("");
});

cant-replay-instance-child:

fastify.get("/", async function handler() {
	return "ok";
});

fastify.post("/", async function handler() {
	return "hello from child";
});

fastify.post("/app", async function handler() {
	return "hello from child";
});

I have shortened the code for brevity. A minimal repro with all the files needed (Dockerfile, fly.toml, package.json, etc.) is available on GitHub: bunchsoft/fly-cant-replay-instance-repro

I deployed both apps and get these results:

GET https://cant-replay-instance-child.fly.dev/ gives 200 OK (can reach child)
GET https://cant-replay-instance-host.fly.dev/ gives 200 OK (can reach router)
POST https://cant-replay-instance-host.fly.dev/app gives 200 OK (can use replay to app)
POST https://cant-replay-instance-child.fly.dev/ gives 200 OK (can POST to child directly)
POST https://cant-replay-instance-host.fly.dev/ gives 503 Service Unavailable (cannot use replay with instance ID)

The POST requests are sent with this request body:

{
  "instanceId": "48e2d10b76e028"
}

where the instance ID is the one shown in the portal for cant-replay-instance-child (it does not work with either instance ID).

Hi… It looks like an explicit app= is needed there, since you’re crossing app boundaries.

fly-replay: instance=48e2d10b76e028;app=cant-replay-instance-child
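As a rough sketch of how the host could produce that value, the snippet below joins the two fields with a semicolon. Note that buildReplayHeader is a hypothetical helper made up for illustration, not part of any Fly.io library, and the handler shown in the comment is just the host's POST "/" handler from the repro with the app name filled in.

```javascript
// Sketch: build a fly-replay header value that names both the target app and
// the instance ID. buildReplayHeader is a hypothetical helper, not a Fly.io API.
function buildReplayHeader({ instance, app }) {
	const parts = [];
	if (instance) parts.push("instance=" + instance);
	if (app) parts.push("app=" + app);
	if (parts.length === 0) {
		throw new Error("fly-replay needs at least one of instance or app");
	}
	// Fields are separated by ";" as in the header shown above.
	return parts.join(";");
}

// Possible use in the host's POST "/" handler (fastify, as in the repro):
//
// fastify.post("/", async function handler(request, reply) {
// 	const { instanceId } = request.body;
// 	reply
// 		.code(204)
// 		.header("fly-replay", buildReplayHeader({
// 			instance: instanceId,
// 			app: "cant-replay-instance-child",
// 		}))
// 		.send("");
// });

console.log(
	buildReplayHeader({
		instance: "48e2d10b76e028",
		app: "cant-replay-instance-child",
	})
);
// prints "instance=48e2d10b76e028;app=cant-replay-instance-child"
```

Keeping the app name next to the instance ID in one place makes it harder to send an instance-only header across app boundaries by accident.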

It’s not clear that Machine IDs are considered independently citable, without an enveloping app to go on…

(Probably the docs could be extended a bit, in this area. The new Support supremo that they’re hiring for will have docs as part of his purview, so hopefully that will usher in an unfreezing of that side of things.)

Hope this helps!


Thank you, that worked for me!

One of the examples in the docs showed routing using just the instance ID but it sounds like that assumes routing within the same app. Anyways, this got me unblocked, I appreciate the help!

