I know there’s fly-prefer-region and fly-replay, and that I can access other regions over internal DNS with region.fly-app-name.internal (all awesome).
But because regions might change (say switching to a backup), is there an easy way to either get a current list of active regions for the app, or a way to send a request to all of the regions at once from inside a vm?
I’m looking to migrate away from a script check that pulls remote data into every region every X minutes, and need something that can reliably ping all of the regions to tell them there’s new data to pull.
So if I sent a request to global..internal/some/path that would be relayed to every region? Only asking because when I tried that before, it didn’t seem to work but I may have been missing something.
I’m not a systems design expert, but periodic pulls are a more resilient design than a notification-based approach. That is, pulling in configuration every 60s, say, regardless of whether it has changed, is more resilient than the more optimized version of pulling on demand: https://archive.fo/2S2ME#selection-14028.0-14028.1
That said, with Fly’s 6PN (an end-to-end encrypted internal IPv6 network among all VMs of a Fly app) it should be more straightforward to build this than it otherwise would have been, without having to worry about the security aspects of the connections (as FrequentFlyer pointed out).
A quick note here: if you have Fly.io apps talking to other Fly.io apps in your organization, you can use fly-app-name.internal to find their 6PN private IPv6 addresses. A query for just fly-app-name.internal will give you the address of every instance in your deployment; you can narrow it down with region.fly-app-name.internal (like nrt.foo.internal), or with top3.nearest.of.foo.internal (or top1, or top10).
This only works for communication inside of Fly.io; you can’t use it to connect to your app from a random host on the Internet.
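As a rough illustration of the lookup described above, here is a minimal Python sketch that resolves every instance address behind an app's .internal name and sends each one a request. It assumes it runs inside a Fly.io VM, that the target app serves plain HTTP on its private address, and that the port (8080) and path (/refresh) exist; those values and the function names are made up for the example.

```python
import socket
import urllib.request


def resolve_instances(hostname, port, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IPv6 addresses behind hostname."""
    infos = resolver(hostname, port, socket.AF_INET6, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string.
    return sorted({info[4][0] for info in infos})


def instance_url(address, port, path):
    """Build an HTTP URL for an IPv6 literal (brackets are required)."""
    return f"http://[{address}]:{port}{path}"


def notify_all(hostname, port, path, timeout=2.0,
               resolver=socket.getaddrinfo, opener=urllib.request.urlopen):
    """POST to every instance; map each address to a status or an error."""
    results = {}
    for address in resolve_instances(hostname, port, resolver=resolver):
        request = urllib.request.Request(
            instance_url(address, port, path), method="POST")
        try:
            with opener(request, timeout=timeout) as response:
                results[address] = response.status
        except OSError as exc:
            # Covers refused connections, timeouts and urllib's URLError.
            results[address] = exc
    return results


# Hypothetical usage from inside a VM of the same organization:
# print(notify_all("fly-app-name.internal", 8080, "/refresh"))
```

Because the address list comes from DNS at call time, only instances that are currently running get a request, so there is no need to guess at regions and let the missing ones fail.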
Thanks @kurt, these are actually customized caddy instances with as minimal an image as possible, so I didn’t have any app code running to do a DNS request like this, but I can build something to handle it.
Is there a GraphQL query I could make to get the regions that are actually active right now instead of the ones that are set? The problem I’m running into is that sometimes it’s on a backup region but my external app doesn’t know that. So when it sends an update/health check to SEA for instance, if it’s running on backup LAX instead, the app thinks something is wrong.
For now I’m just pinging the fly.dev url with a request for all 20ish regions, having caddy proxy it using the internal regions addresses, and letting the nonexistent ones fail. It would be nicer to check the active regions though, run against them only, and then also have a recurring pull from each region (which I still need to find a better solution for than the script checks).
Thanks for the help @ignoramous! I appreciate it. I agree: after my testing this last week, I think the ideal solution is a recurring pull from each region, but less frequent (maybe every 10 minutes), to ensure eventual consistency, and then also a push to each region on every change to get quick, lightweight updates as soon as possible.
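One way to combine the two modes described above (a slow recurring pull plus an immediate pull when a push arrives) is a single loop that sleeps on an event with a timeout. This is only a sketch: the Puller class, the 600-second default and the pull callback are placeholders, and how the notification reaches the process (for example an HTTP handler calling notify) is left out.

```python
import threading


class Puller:
    """Run pull() on a fixed interval, or sooner when notify() is called."""

    def __init__(self, pull, interval_seconds=600.0):
        self._pull = pull
        self._interval = interval_seconds
        self._wake = threading.Event()
        self._stop = threading.Event()

    def notify(self):
        """Called when a push arrives: wake the loop for an early pull."""
        self._wake.set()

    def stop(self):
        """Ask the loop to exit after its current iteration."""
        self._stop.set()
        self._wake.set()

    def run(self):
        while not self._stop.is_set():
            self._pull()
            # Sleep until the interval elapses or a notification arrives.
            self._wake.wait(timeout=self._interval)
            self._wake.clear()


# Hypothetical usage:
# puller = Puller(lambda: print("pulling remote data"))
# threading.Thread(target=puller.run, daemon=True).start()
# puller.notify()  # e.g. from the handler that receives the push
```

If a push is lost, the instance still catches up at the next interval, which is where the eventual consistency comes from.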
We use a count-min sketch that (data plane) servers pull in every 1m in all regions. This sketch matrix contains the current version of various configurations. These sketches are hosted on Durable Objects. They are updated/created by the control plane as and when it updates/creates new configuration (typically in servicing a user request).
The (data plane) servers compare (intersect/xor) the incoming sketch with the ones they already have and pull in only those configurations that have newer versions (higher count).
Apart from this, (data plane) servers also pull in the full configuration every 30d, discarding / resetting sketches.
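To make the comparison step a little more concrete, here is a toy count-min sketch in Python. It is not the implementation described above, only an illustration under assumptions: the width, depth, SHA-256 based hashing and the keys_to_pull helper are all invented for the example, and each configuration update is modelled as one bump of that configuration's key.

```python
import hashlib


class CountMinSketch:
    """Fixed-size matrix of counters; estimates never undercount."""

    def __init__(self, width=64, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _columns(self, key):
        # One deterministic hash per row, derived from the row index.
        for row in range(self.depth):
            digest = hashlib.sha256(f"{row}:{key}".encode()).digest()
            yield row, int.from_bytes(digest[:8], "big") % self.width

    def bump(self, key):
        """Record one new version of the configuration named key."""
        for row, col in self._columns(key):
            self.rows[row][col] += 1

    def estimate(self, key):
        """Estimated version count: the minimum over the key's cells."""
        return min(self.rows[row][col] for row, col in self._columns(key))


def keys_to_pull(local, incoming, keys):
    """Keys whose estimated version is higher in the incoming sketch."""
    return [k for k in keys if incoming.estimate(k) > local.estimate(k)]


# Hypothetical usage: the control plane bumps "routes" after an update,
# and a server compares the new sketch with the one it already holds.
local, incoming = CountMinSketch(), CountMinSketch()
for sketch in (local, incoming):
    sketch.bump("routes")
    sketch.bump("certs")
incoming.bump("routes")
stale = keys_to_pull(local, incoming, ["routes", "certs"])
```

As long as the local sketch is an earlier state of the incoming one, an updated key always shows a higher estimate, so updates are not missed; hash collisions can only cause an occasional unnecessary pull.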
I don’t like this design because of its many operating modes, but it is the cheapest I could think of.
Btw, this falls under the topic of “set reconciliation”, a pretty popular problem in the blockchain space (you’ve been warned).
I am trying to get this to work but no success yet.
One organization, multiple apps.
Resolving the .internal address of app A in DNS works, but when another app sends a request to the target app using the resolved IPv6 of the target app, the request fails with ECONNREFUSED.
Does this error signify the problem is on my side, in my target app?
That probably means the app isn’t listening on IPv6. Most apps listen on 0.0.0.0 by default, which is IPv4 only. To listen on both IPv4 and IPv6, bind to :: instead. The exact syntax varies a little per runtime, but it almost always includes a ::.
The article shows that using :: as the hostname works for v4 and v6.
The hostname of :: works with IPv4 addresses because it is backwards compatible. Technically, we have only bound to an IPv6 address, but IPv6 can still handle the older style of connections.
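For reference, this is roughly what binding to :: looks like with Python's standard library. The listen_dual_stack name and the port are placeholders, and the IPv4-only fallback is just there so the sketch still runs on hosts without dual-stack support; a listener created by that fallback would not be reachable over 6PN.

```python
import socket


def listen_dual_stack(port, backlog=128):
    """Listen on :: so both IPv6 and IPv4 clients can connect."""
    if socket.has_dualstack_ipv6():
        # dualstack_ipv6=True turns off IPV6_V6ONLY on the socket, so
        # IPv4 clients are accepted as IPv4-mapped IPv6 addresses.
        return socket.create_server(
            ("::", port), family=socket.AF_INET6,
            backlog=backlog, dualstack_ipv6=True)
    # Fallback for hosts without dual-stack support: IPv4 only.
    return socket.create_server(("0.0.0.0", port), backlog=backlog)


# Hypothetical usage:
# server = listen_dual_stack(8080)
# conn, peer = server.accept()
```

Most web frameworks expose the same choice as a host or bind setting, so in practice the change is usually swapping 0.0.0.0 for :: in that setting.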