Cross-region HTTP via 6PN and .internal DNS - Connection Refused

I’m trying to implement cross-region health monitoring where machines in different regions need
to HTTP health-check each other. Neither approach is working.

Setup:

  • App: ezthrottle-staging
  • Machines deployed in: iad, lax, ord (confirmed via fly machines list)
  • All machines: min_machines_running = 1, binding to :::8080

Approach 1: .internal DNS ❌

From iad machine trying to reach ord:

curl http://ord.ezthrottle-staging.internal:8080/ping
→ Connection refused (despite machine being “started”)

DNS resolution WORKS:

dig ord.ezthrottle-staging.internal AAAA
→ Returns IPv6: fdaa:2d:d853:a7b:569:8c3a:bddb:2

Direct IPv6 WORKS:

curl http://[fdaa:2d:d853:a7b:569:8c3a:bddb:2]:8080/ping
→ Success!

Approach 2: fly-prefer-region header ❌

All requests go to lax regardless of header:

curl -s https://ezthrottle-staging.fly.dev/ping \
  -H "fly-prefer-region: iad" | jq .region
→ "lax"

curl -s https://ezthrottle-staging.fly.dev/ping \
  -H "fly-prefer-region: ord" | jq .region
→ "lax"

Questions:

  1. Does 6PN support cross-region HTTP? Or only same-region?
  2. Why does .internal DNS resolve but connections are refused?
  3. Should fly-prefer-region work from public internet or only internally?
  4. What’s the recommended way for machines to communicate cross-region?

Happy to provide more config details if needed!

6PN works across regions; indeed, that was originally its main reason for existing, as I understand the history…

The .internal request really should have worked, particularly since the attempt with the IPv6 numeric literal succeeded. Perhaps the Machines themselves are flaky? I tried connecting to ezthrottle-staging.fly.dev from here and got timeouts…

I’d also suggest using curl -6 -v in your attempts, to garner more details.
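One way to narrow this down is to try every address the .internal name returns, one by one, in case the name resolves to more than one Machine and only some of them answer. The following is a rough sketch, not an official recipe: the helper names urls_for_addresses and check_each_machine are made up for this example, and it assumes dig and curl are installed in the image (the Dockerfile later in this thread installs dnsutils and curl).

```shell
# Hypothetical helper: turn IPv6 addresses (one per line on stdin) into URLs.
# $1: port, $2: path
urls_for_addresses() {
  while IFS= read -r addr; do
    [ -n "$addr" ] || continue
    # IPv6 literals must be wrapped in brackets inside a URL.
    printf 'http://[%s]:%s%s\n' "$addr" "$1" "$2"
  done
}

# Hypothetical helper: resolve a .internal name and probe each address.
# $1: hostname, $2: port, $3: path
check_each_machine() {
  dig +short AAAA "$1" | urls_for_addresses "$2" "$3" |
    while IFS= read -r url; do
      # -g stops curl from treating the brackets as a glob range.
      if curl -g -6 -sf --max-time 5 -o /dev/null "$url"; then
        echo "ok      $url"
      else
        echo "FAILED  $url"
      fi
    done
}

# Usage (run from inside a Machine, e.g. via fly ssh console):
#   check_each_machine ord.ezthrottle-staging.internal 8080 /ping
```

If one address fails while another succeeds, the problem is probably with a specific Machine rather than with 6PN or DNS.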

fly-prefer-region is a feature of the Fly Proxy and hence only works with .fly.dev and .flycast, not with .internal. The .internal addresses are rather low-level and bypass the proxy completely.

You can try fly-prefer-region out via curl -i -H 'fly-prefer-region: ams' -H 'flyio-debug: doit' 'https://debug.fly.dev/'. Look in the sdc field of the returned flyio-debug header.

(In contrast, the Fly-Region header shows where the client is.)

[There’s an unfortunate name collision with the FLY_REGION environment variable.]
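To make that header easier to read, a small filter can pull a single header out of curl -i output. This is only a sketch: header_value is a name invented here, and it makes no assumption about what the flyio-debug value contains beyond it being one header line.

```shell
# Hypothetical helper: print the value of one response header.
# $1: header name (case-insensitive); reads `curl -si` output on stdin.
header_value() {
  awk -v name="$1" '
    BEGIN { name = tolower(name) }
    { sub(/\r$/, "") }          # strip the CR of CRLF line endings
    $0 == "" { exit }           # a blank line ends the header block
    {
      i = index($0, ":")
      if (i > 0 && tolower(substr($0, 1, i - 1)) == name) {
        v = substr($0, i + 1)
        sub(/^[ \t]+/, "", v)   # drop leading whitespace of the value
        print v
      }
    }
  '
}

# Usage:
#   curl -si -H 'fly-prefer-region: ams' -H 'flyio-debug: doit' \
#     'https://debug.fly.dev/' | header_value flyio-debug
```

Piping the result through jq should work if the header value is JSON, which is an assumption worth checking against the real output.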

For Machine-to-Machine communication, it depends! When you want things like auto-start and coarse-grained load balancing, you should use Flycast. But for these health checks, .internal sounds like it would be a better match.
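As a quick reference for the naming involved, here is a sketch that builds the private hostnames for each option. The function name fly_private_host and its "kind" argument are made up for illustration. Note that the .flycast name only resolves once the app has a private (Flycast) IPv6 address allocated.

```shell
# Hypothetical helper: build a private hostname for a Fly app.
# $1: kind (flycast | internal | region), $2: app name,
# $3: region code (only used for "region")
fly_private_host() {
  case "$1" in
    flycast)  printf '%s.flycast\n' "$2" ;;          # via Fly Proxy (auto-start, load balancing)
    internal) printf '%s.internal\n' "$2" ;;         # direct 6PN, all started Machines
    region)   printf '%s.%s.internal\n' "$3" "$2" ;; # direct 6PN, Machines in one region
    *)        echo "unknown kind: $1" >&2; return 1 ;;
  esac
}

# Usage:
#   curl -6 "http://$(fly_private_host region ezthrottle-staging ord):8080/ping"
```

For per-region health checks, the "region" form is the one that matches what was attempted above.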

It might help to show the full output of fly m list, along with the entirety of fly.toml. It’s also generally wise to do a fly config validate --strict whenever you see puzzling behavior on the Fly.io platform…

Other than what @mayailurus mentioned above: a .internal name that resolves while connections are refused shouldn’t really be the case, and we can’t reject your connections over 6PN (or .internal) addresses on behalf of your machines. 6PN addresses are not served through Fly Proxy and go directly to target machines. The message Connection refused can really only happen if your machine isn’t listening on the requested port (yet).
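A quick way to test that from inside the target Machine is to look at the listening sockets. 6PN is IPv6-only, so a server bound to 0.0.0.0 alone would refuse these connections. The sketch below assumes ss from iproute2 is available (the Dockerfile in this thread installs it); listens_on_ipv6_wildcard is a made-up helper name, and it only recognizes wildcard binds, not a bind to one specific IPv6 address.

```shell
# Hypothetical helper: succeed if some socket listens on the IPv6 wildcard
# address for the given port. Reads `ss -H -ltn` output on stdin.
# $1: port
listens_on_ipv6_wildcard() {
  awk -v port="$1" '
    {
      # ss prints [::]:PORT for IPv6 wildcard sockets and *:PORT for
      # dual-stack ones; the peer column ends in :* so it never matches.
      for (i = 1; i <= NF; i++) {
        if ($i == "[::]:" port || $i == "*:" port) found = 1
      }
    }
    END { exit (found ? 0 : 1) }
  '
}

# Usage (inside the Machine, e.g. via fly ssh console --region ord):
#   ss -H -ltn | listens_on_ipv6_wildcard 8080 && echo "listening" || echo "NOT listening on IPv6"
```

If this reports nothing on the expected port, check which port and bind address the server logged at startup.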

fly.toml

app = 'ezthrottle-staging'
primary_region = 'iad'

[build]
dockerfile = 'Dockerfile'

[http_service]
internal_port = 8080
force_https = true
auto_stop_machines = false  # Don't auto-stop on inactivity
auto_start_machines = true  # Still wake up on traffic
min_machines_running = 1    # Keep at least 1 machine running per region

[[vm]]
cpu_kind = 'shared'
cpus = 1
memory_mb = 256

[env]

dockerfile

FROM hexpm/elixir:1.17.3-erlang-27.2-debian-bullseye-20241202 AS builder

# Install build dependencies
RUN apt-get update && \
    apt-get install -y build-essential git curl cargo rustc wget && \
    rm -rf /var/lib/apt/lists/*

# Install Rust
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
ENV PATH="/root/.cargo/bin:${PATH}"

# Build Gleam from source
RUN git clone --depth 1 --branch v1.13.0 https://github.com/gleam-lang/gleam.git /tmp/gleam && \
    cd /tmp/gleam && \
    cargo build --release && \
    cp target/release/gleam /usr/local/bin/gleam && \
    rm -rf /tmp/gleam

# Install rebar3
RUN wget https://github.com/erlang/rebar3/releases/download/3.24.0/rebar3 && \
    chmod +x rebar3 && \
    mv rebar3 /usr/local/bin/

# Setup Elixir
RUN mix local.hex --force && mix local.rebar --force

WORKDIR /app

# Copy and build
COPY gleam.toml manifest.toml ./
RUN gleam deps download
COPY . .
RUN rm -rf build _build && gleam build --target erlang

# Runtime
FROM hexpm/elixir:1.17.3-erlang-27.2-debian-bullseye-20241202

RUN apt-get update && \
    apt-get install -y libstdc++6 openssl libncurses5 locales ca-certificates \
    curl dnsutils iputils-ping net-tools iproute2 && \
    rm -rf /var/lib/apt/lists/*

RUN sed -i '/en_US.UTF-8/s/^# //g' /etc/locale.gen && locale-gen
ENV LANG=en_US.UTF-8

WORKDIR /app

# Copy app and gleam binary
COPY --from=builder /app /app
COPY --from=builder /usr/local/bin/gleam /usr/local/bin/gleam
COPY --from=builder /usr/local/bin/rebar3 /usr/local/bin/rebar3

# Ensure start.sh is executable
RUN chmod +x /app/start.sh

# Setup Elixir in runtime too
RUN mix local.hex --force && mix local.rebar --force

ENV BIND_ADDRESS=0.0.0.0
EXPOSE 8080

CMD ["/app/start.sh"]

start.sh

#!/bin/bash
APP_NAME=${FLY_APP_NAME:-ezthrottle}

# Configure Erlang to use native DNS resolver (respects /etc/resolv.conf)
# inet_lookup affects ALL DNS lookups (not just distributed Erlang)
export ERL_FLAGS="-name ${APP_NAME}@${FLY_PRIVATE_IP:-127.0.0.1} -setcookie ${RELEASE_COOKIE:-local-dev-cookie} -proto_dist inet6_tcp -kernel inet_lookup [native]"
exec gleam run

main function


import actors/job_dispatcher
import actors/machine_actor
import actors/pulsekeeper_actor
import envoy
import gleam/erlang/process
import gleam/int
import gleam/result
import gleam/string
import glixir/libcluster
import glixir/syn
import logging
import mist
import utils/utils
import web/router
import wisp
import wisp/wisp_mist

pub fn main() {
  // Set up node name if on Fly
  case utils.get_env_or("FLY_PRIVATE_IP", "") {
    "" -> Nil
    // Local dev, no node setup needed
    ip -> {
      let app_name = utils.get_env_or("FLY_APP_NAME", "ezthrottle")
      envoy.set("RELEASE_NODE", app_name <> "@" <> ip)
    }
  }

  case start_link() {
    Ok(_pid) -> process.sleep_forever()
    Error(e) -> {
      logging.log(logging.Error, "EZThrottle failed to boot: " <> e)
      panic as "EZThrottle failed to boot"
    }
  }
}

pub fn start_link() -> Result(process.Pid, String) {
  logging.configure()

  // ✅ Validate TracktTags environment variables at startup
  validate_tracktags_config()

  // Start clustering based on environment
  start_clustering()

  // Initialize syn scopes for actor coordination
  // Syn will automatically sync across all discovered nodes
  syn.init_scopes(utils.all_scopes())

  // Start the actor system
  let _machine = start_actor_system()

  // Configure HTTP server port
  let port_str =
    utils.get_env_or("EZTHROTTLE_PORT", utils.get_env_or("PORT", "8080"))
  let port = case int.parse(port_str) {
    Ok(p) -> p
    Error(_) -> {
      logging.log(logging.Warning, "Invalid port value, using default 8080")
      8080
    }
  }

  logging.log(
    logging.Info,
    "[EZThrottle] Starting HTTP server on port " <> int.to_string(port),
  )

  // Boot HTTP server
  wisp.configure_logger()
  let secret_key_base = wisp.random_string(64)
  let handler = wisp_mist.handler(router.handle_request, secret_key_base)

  // Bind to :: (all IPv6 interfaces) on Fly for 6PN access, BIND_ADDRESS for local dev
  // On Fly: Always use :: which includes 6PN interface (ignore BIND_ADDRESS env var)
  // Local: Use BIND_ADDRESS env var (set in Dockerfile to 0.0.0.0)
  let bind_address = case utils.get_env_or("FLY_APP_NAME", "") {
    "" -> utils.get_env_or("BIND_ADDRESS", "0.0.0.0")
    // Local development
    _ -> "::"
    // Fly production/staging - :: for all IPv6 interfaces including 6PN
  }

  logging.log(logging.Info, "Bind address: " <> bind_address)
  handler
  |> mist.new
  |> mist.bind(bind_address)
  |> mist.port(port)
  |> mist.start()
  |> result.map(fn(started) { started.pid })
  |> result.map_error(fn(e) { "Failed to start Mist: " <> string.inspect(e) })
}

fn start_clustering() {
  let fly_app = utils.get_env_or("FLY_APP_NAME", "")

  case fly_app {
    "" -> {
      // No FLY_APP_NAME = local development
      logging.log(
        logging.Info,
        "[EZThrottle] Running in local/single-node mode",
      )
      case libcluster.start_clustering_local("ezthrottle") {
        Ok(_) ->
          logging.log(logging.Info, "[EZThrottle] Local clustering started")
        Error(_) ->
          logging.log(logging.Info, "[EZThrottle] Running without clustering")
      }
    }
    app_name -> {
      // FLY_APP_NAME is set = running on Fly
      logging.log(
        logging.Info,
        "[EZThrottle] Starting Fly.io clustering for: " <> app_name,
      )
      case libcluster.start_clustering_fly(app_name) {
        Ok(_) -> {
          logging.log(
            logging.Info,
            "[EZThrottle] Clustering started successfully",
          )
          logging.log(
            logging.Info,
            "[EZThrottle] Node: " <> libcluster.current_node_name(),
          )
        }
        Error(e) -> {
          logging.log(
            logging.Warning,
            "[EZThrottle] Clustering failed: " <> string.inspect(e),
          )
        }
      }
    }
  }
  // Log cluster status
  case libcluster.is_clustered() {
    True -> {
      let nodes = libcluster.connected_node_names()
      logging.log(
        logging.Info,
        "[EZThrottle] Connected to nodes: " <> string.inspect(nodes),
      )
    }
    False -> {
      logging.log(logging.Info, "[EZThrottle] Running in standalone mode")
    }
  }
}

fn start_actor_system() {
  logging.log(logging.Info, "[EZThrottle] Starting actor system")

  // Start Pulsekeeper first for region health monitoring
  logging.log(logging.Info, "[EZThrottle] Starting Pulsekeeper")
  // Read FLY_APP_NAME (e.g., "ezthrottle-staging") to construct .internal DNS
  let app_name = utils.get_env_or("FLY_APP_NAME", "localhost")
  let assert Ok(pulsekeeper) = pulsekeeper_actor.start(app_name)

  // Start JobDispatcher
  logging.log(logging.Info, "[EZThrottle] Starting JobDispatcher")
  let assert Ok(_dispatcher) = job_dispatcher.start()

  // Get machine ID from Fly environment or generate one
  let machine_id = case utils.get_env_or("FLY_MACHINE_ID", "") {
    "" -> "localmachine-1"
    // fallback for local dev
    id -> id
  }

  // Then start MachineActor with actual ID and Pulsekeeper reference
  logging.log(
    logging.Info,
    "[EZThrottle] Starting MachineActor: " <> machine_id,
  )
  let assert Ok(machine) = machine_actor.start(machine_id, pulsekeeper)

  machine
}

fn validate_tracktags_config() {
  // Check if TracktTags is configured
  let api_key = utils.get_env_or("TRACKTAGS_API_KEY", "")
  let business_id = utils.get_env_or("TRACKTAGS_BUSINESS_ID", "")

  case api_key, business_id {
    "", "" -> {
      logging.log(
        logging.Warning,
        "[EZThrottle] TracktTags not configured - metrics will be disabled",
      )
    }
    "", _ -> {
      logging.log(
        logging.Error,
        "[EZThrottle] TRACKTAGS_BUSINESS_ID set but TRACKTAGS_API_KEY missing!",
      )
      panic as "TracktTags misconfigured: missing TRACKTAGS_API_KEY"
    }
    _, "" -> {
      logging.log(
        logging.Error,
        "[EZThrottle] TRACKTAGS_API_KEY set but TRACKTAGS_BUSINESS_ID missing!",
      )
      panic as "TracktTags misconfigured: missing TRACKTAGS_BUSINESS_ID"
    }
    _, _ -> {
      logging.log(
        logging.Info,
        "[EZThrottle] TracktTags configured - metrics enabled",
      )
    }
  }
}

Hm… I would try the following simpler TCP listener, to rule out a (rare) actual network glitch of some kind:

$ fly m start  # start all Machines.
$ fly ssh console --region ord
# apt-get update
# apt-get install --no-install-recommends socat
# echo "hello from $FLY_REGION." > greeting
# socat TCP6-LISTEN:8089 OPEN:greeting,rdonly

And then, in a separate terminal window, while the above is still running…

$ fly ssh console --region iad
# apt-get update
# apt-get install --no-install-recommends socat
# socat STDIO TCP6:ord.ezthrottle-staging.internal:8089
hello from ord.

(This assumes that you have only that one Machine in ord.)

Thanks for the help. I figured it out. I think it was a bunch of misconfigurations, from the port to syn to old web servers. If you don’t mind me asking: I am starting a startup on Fly. Is there any way I can get in contact with Annie or the developer advocate? Are there any resources I can utilize?


The upper two Support plans both include “architecture sessions”, which I hear good things about. Also, the relatively new Guides section of the official docs coalesces a lot of recurring themes from those, from what they’ve said in the past. Those would be the best places to look next, I’d guess.

Best of luck!
