MCP on high availability

siim · August 5, 2025, 5:28pm

I've been playing around with setting up an MCP server on Fly. From what I've seen so far, it's a lot simpler than setting up something similar on AWS.

The streaming docs say that the --ha=false flag needs to be used. I tried running multiple instances both with and without --ha=false, and also tried adding the fly-force-instance-id header, but couldn’t get it to work (I was intermittently getting 404 Session not found errors). Is only a single-instance streamable MCP server app supported at this point, or have I missed something?

rubys · August 5, 2025, 5:54pm

It is indeed quite possible to write a high availability MCP server. It is also quite possible to write an MCP server that can’t be run in a high availability mode. It all comes down to the MCP server.

The same is true for Next.js, Rails, Phoenix, Django, Laravel, or any other server. MCP HTTP streaming is just a protocol; how your server manages state is what matters.

Most existing MCP servers don’t actually manage state in a way that allows replicas to share the workload. An approach that works with such servers is to set up an MCP server per user, with a router app in front. See: Per-User Dev Environments with Fly Machines · Fly Docs for more information (ignore the “dev environment” parts, the rest is applicable).
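As a rough sketch of the router idea (not the code from the linked docs): the router looks up which Machine belongs to the user and answers with a fly-replay response header, which asks the Fly proxy to replay the request on that Machine. The user table, the Machine IDs, and the function name below are made up for illustration.

```python
from typing import Dict, Optional

# Hypothetical user -> Machine ID table. The IDs are invented; a real router
# would look these up in a database or via the Machines API.
USER_MACHINES: Dict[str, str] = {
    "alice": "3d8d9014b32d98",
    "bob": "6e82d4d1a03e58",
}


def replay_headers(user_id: str) -> Optional[Dict[str, str]]:
    """Return the response headers that ask the Fly proxy to replay the
    request on the user's own Machine, or None if the user is unknown."""
    machine_id = USER_MACHINES.get(user_id)
    if machine_id is None:
        return None
    # fly-replay with instance=<id> targets one specific Machine.
    return {"fly-replay": f"instance={machine_id}"}
```

The router app would attach these headers to an otherwise empty response; how you authenticate the user and what you do for unknown users is up to you.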

siim · August 5, 2025, 6:11pm

Thanks for the quick reply. I see how it can work with a router app that directs traffic where it is needed. Doesn’t the Fly router do this itself? (For example, for streaming requests the fly-force-instance-id header doesn’t seem to have any effect.)

rubys · August 5, 2025, 6:17pm

fly-force-instance-id does indeed work - if you have control over your client. Here’s the diagram for MCP HTTP streaming data flow:

You will need to make sure that your MCP client issues the correct fly-force-instance-id on each request.
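A minimal sketch of what that could look like on the client side, assuming you control the client code. It remembers the Mcp-Session-Id returned by the server and the ID of the Machine that created the session, then sends both back on every later request. The x-machine-id response header is an assumption: your server would have to add it itself (for example from its FLY_MACHINE_ID environment variable). The class name is also hypothetical.

```python
from typing import Dict, Mapping, Optional


class PinnedMcpSession:
    """Tracks the MCP session and the Machine that owns it."""

    def __init__(self, instance_id_header: str = "x-machine-id") -> None:
        # Assumed response header in which the server reports its Machine ID.
        self.instance_id_header = instance_id_header.lower()
        self.session_id: Optional[str] = None
        self.instance_id: Optional[str] = None

    def update(self, response_headers: Mapping[str, str]) -> None:
        """Record session and Machine IDs from a response, if present."""
        lowered = {k.lower(): v for k, v in response_headers.items()}
        self.session_id = lowered.get("mcp-session-id", self.session_id)
        self.instance_id = lowered.get(self.instance_id_header, self.instance_id)

    def request_headers(self) -> Dict[str, str]:
        """Headers to add to every subsequent MCP request."""
        headers: Dict[str, str] = {}
        if self.session_id is not None:
            headers["Mcp-Session-Id"] = self.session_id
        if self.instance_id is not None:
            # Pins the request to the Machine that holds the session state.
            headers["fly-force-instance-id"] = self.instance_id
        return headers
```

Call update() with the headers of the initialize response, then merge request_headers() into each later POST or GET. If any request goes out without the pinning header, it can land on a Machine that has never seen the session.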
