Best way to internally send a request to ALL running vms?

I know there’s fly-prefer-region and fly-replay, and that I can access other regions over internal DNS with region.fly-app-name.internal (all awesome).

But because regions might change (say switching to a backup), is there an easy way to either get a current list of active regions for the app, or a way to send a request to all of the regions at once from inside a vm?

I’m looking to migrate away from a script-check that pulls remote data in to every region every X minutes, and need something that can reliably ping all of the regions to tell them there’s new data to pull.


Hey Carter,

I believe what you need is option 1 here:

name                         AAAA                           TXT
global.<appname>.internal    app instances in all regions   none
regions.<appname>.internal   none                           region names where app is deployed
<appname>.internal           app instances in any region    none

Ref: https://fly.io/docs/reference/private-networking/#fly-internal-addresses

So if I sent a request to global.<appname>.internal/some/path, would that be relayed to every region? Only asking because when I tried that before it didn’t seem to work, but I may have been missing something.

Ah no, you have to make a DNS lookup against that name, and you’ll get back a list of instance addresses that you can use.

EDIT: If it’s a Node app, it’ll be something like this.
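Rough, untested sketch, assuming FLY_APP_NAME (which Fly sets inside the VM) and an ESM context for top-level await:

import { resolve6 } from "dns/promises";

// Resolve the AAAA records for global.<appname>.internal to get the 6PN
// address of every running instance, in every region.
const appName = process.env.FLY_APP_NAME;
const addresses = await resolve6(`global.${appName}.internal`);
console.log(addresses); // one IPv6 address per instance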

I’m looking to migrate away from a script-check that pulls remote data in to every region every X minutes, and need something that can reliably ping all of the regions to tell them there’s new data to pull.

Not a systems design expert, but periodic pulls are a better design than a notification-based approach. That is, pulling in configuration every 60s, say, regardless of whether it has changed, is more resilient than the more optimized version of pulling it in on demand: https://archive.fo/2S2ME#selection-14028.0-14028.1

That said, using Fly’s 6PN (the encrypted internal IPv6 network shared by all VMs in a Fly organization), it should be more straightforward than it otherwise would be to build this without having to worry about the security aspects of the connections (as FrequentFlyer pointed out).

flyctl ssh console -a <appname>

...

nslookup global.<appname>.internal fdaa::3

Server:		fdaa::3
Address:	[fdaa::3]:53

Non-authoritative answer:
Name:	global.rdns.internal
Address: fdaa:0:<org-id>:<fly-router-id>:<vm-id>:1111:5555:2
Address: fdaa:0:<org-id>:<fly-router-id>:<vm-id>:4444:7777:2
Address: fdaa:0:<org-id>:<fly-router-id>:<vm-id>:2222:3333:2
Address: fdaa:0:<org-id>:<fly-router-id>:<vm-id>:8888:9999:2

# ref: https://community.fly.io/1362/2
# ref: https://archive.is/M5sJe#selection-702.0-702.4

We built a similar broadcast mechanism with WebSockets (ws) connected to Durable Objects instead (i.e. broadcasts to clients that can speak WebSockets).

I am compelled to point out that Deno Deploy has BroadcastChannel handy and ready to go for such a use case.


A quick note here: if what you have is Fly.io apps talking to other Fly.io apps in your organization, you can use fly-app-name.internal to find their 6PN private IPv6 addresses. A query for just fly-app-name.internal will give you the address of every instance in your deployment; you can narrow it down with region.fly-app-name.internal (like nrt.foo.internal), or with top3.nearest.of.foo.internal (or top1, or top10).

This only works for communication inside of Fly.io; you can’t use it to connect to your app from a random host on the Internet.
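For example, from inside a VM (rough sketch; "foo" is a placeholder app name, and these names only resolve over the private network):

import { resolve6 } from "dns/promises";

const every = await resolve6("foo.internal");                   // every instance of the app
const inNrt = await resolve6("nrt.foo.internal");               // only instances in nrt
const nearest = await resolve6("top3.nearest.of.foo.internal"); // the 3 closest instances
console.log({ every, inNrt, nearest });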

Is global.app.internal redundant then?

@carter is this Node.js or another runtime? Here’s some TypeScript from a node project doing this:

import { resolve6 } from "dns/promises";

async function getFlyInstances(): Promise<string[]> {
  let address = `global.${process.env.FLY_APP_NAME}.internal`;
  let ipv6s = await resolve6(address);
  return ipv6s.map((ip) => `http://[${ip}]:8080`);
}
const instances = await getFlyInstances();

for (const instance of instances) {
  const url = new URL(instance);
  url.pathname = "/_refreshlocal";
  url.search = search; // `search` is defined elsewhere in the handler (the query string to forward)

  console.log(`forwarding post to ${url.toString()}`);

  // we purposefully don't await, we're just notifying everybody
  fetch(url.toString(), {
    method: "POST",
    headers: {
      Authorization: process.env.AUTH_TOKEN!,
    },
  });
}

This grabs the internal IPv6 addresses, then sends an HTTP request to each of them in parallel.


Thanks @kurt, these are actually customized caddy instances with as minimal an image as possible, so I didn’t have any app code running to do a DNS request like this, but I can build something to handle it.

Is there a graphql query I could make to get the regions that are actually active right now instead of the ones that are set? The problem I’m running into is that sometimes the app is on a backup region, but my external app doesn’t know that. So when it sends an update/health check to SEA, for instance, and the app is actually running in the backup region LAX, it thinks something is wrong.

For now I’m just pinging the fly.dev URL with a request for all 20ish regions, having caddy proxy it using the internal region addresses, and letting the nonexistent ones fail. It would be nicer to check the active regions though, run against only those, and then also have a recurring pull from each region (which I still need to find a better solution for than the script checks).

Thanks for the help @ignoramous! I appreciate it. I agree: after my testing this last week, I think the ideal solution is a recurring pull from each region, but less frequent (maybe every 10 minutes), to ensure eventual consistency, and then also a push to each region on every change, to get quick, lightweight updates as soon as possible.


Is there a graphql query I could make to get the regions that are actually active right now instead of the ones that are set?

One of the examples here may help - GitHub - fly-apps/hostnamesapi: JavaScript examples for working with the new hostnames API on Fly

Going here, you’ll get a playground with the schema & docs - GraphQL Playground
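Another option that skips GraphQL: the regions.<appname>.internal TXT record from the table earlier returns the region names where the app is deployed. A rough, untested sketch (only resolvable from inside the private network, and I’m not certain how quickly it reflects a fail-over to a backup region):

import { resolveTxt } from "dns/promises";

const records = await resolveTxt(`regions.${process.env.FLY_APP_NAME}.internal`);
// Each TXT record arrives as an array of string chunks; the value is a
// comma-separated list of region codes, e.g. "lax,sea".
const regions = records.map((chunks) => chunks.join("")).join(",").split(",");
console.log(regions);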


We built something very similar.

We use a count-min sketch that (data plane) servers pull in every 1m in all regions. This sketch matrix contains the current versions of various configurations. These sketches are hosted on Durable Objects, and are updated/created by the control plane as and when it updates or creates configuration (typically while servicing a user request).

The (data plane) servers compare (intersect/xor) the incoming sketch with the ones they already have and pull in only those configurations that have newer versions (higher count).

Apart from this, (data plane) servers also pull in the full configuration every 30d, discarding/resetting their sketches.
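To make the compare-and-pull step concrete, here is a stripped-down sketch that swaps the count-min sketch for a plain map of config id to version (losing the space savings, but the reconciliation idea is the same):

type VersionMap = Map<string, number>; // config id -> version counter

// Return the config ids whose incoming version is newer than the local one;
// only these need to be re-pulled from the control plane.
function staleConfigs(local: VersionMap, incoming: VersionMap): string[] {
  const stale: string[] = [];
  for (const [id, version] of incoming) {
    if ((local.get(id) ?? 0) < version) stale.push(id);
  }
  return stale;
}

// Example: only "rate-limits" was bumped, so only it gets re-fetched.
const local = new Map([["rate-limits", 3], ["blocklist", 7]]);
const incoming = new Map([["rate-limits", 4], ["blocklist", 7]]);
console.log(staleConfigs(local, incoming)); // ["rate-limits"]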

I don’t like this design because of its many operating modes, but it is the cheapest I could think of.

Btw, this falls under the topic of “set reconciliation”, a pretty popular problem in the blockchain space (you’ve been warned :wink:).

See: https://github.com/sipa/minisketch

https://martin.kleppmann.com/2020/12/02/bloom-filter-hash-graph-sync.html

I am trying to get this to work, but no success yet.
One organization, multiple apps.
Resolving the .internal address of app A in DNS works, but when another app sends a request to app A using the resolved IPv6 address, the request fails with ECONNREFUSED.
Does this error mean the problem is on my side, in my target app?

That probably means the app isn’t listening on IPv6. Most apps listen on 0.0.0.0 by default, which is IPv4 only. Listening on both IPv4 and IPv6 looks more like ::. The exact syntax varies a little per runtime, but it almost always involves a ::.
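In Node, for example, that would look something like this (assuming port 8080; Caddy and other servers have equivalent bind-address settings):

import { createServer } from "http";

const server = createServer((req, res) => res.end("ok"));

// Binding to "::" accepts connections over IPv6 (including Fly's 6PN
// addresses) as well as IPv4, unlike the IPv4-only 0.0.0.0 default.
server.listen(8080, "::", () => console.log("listening on [::]:8080"));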


Aha right, txs!

https://www.grouparoo.com/blog/node-js-and-ipv6

The article shows that using :: as the hostname works for both v4 and v6.

The hostname of :: works with IPv4 addresses because it is backwards compatible. Technically, we have only bound to an IPv6 address, but IPv6 can still handle the older style of connections.