DNS Resolution Failure for .internal Hostname Between Apps

I’m experiencing DNS resolution failures between my Fly.io applications, preventing NGINX from connecting to my backend service. Here are the key details:

Apps Involved

  1. NGINX Proxy
  • App: x-realestate
  • Region: ams
  • Configuration: [nginx.prod.cd1.conf](pastebin link if available)
  1. Express Backend
  • App: x-realestate-server
  • Region: ams
  • Port: 8000 (confirmed listening on 0.0.0.0)

Issue

  • x-realestate-server.internal fails to resolve from the NGINX app:
dig x-realestate-server.internal
;; status: NXDOMAIN  # No DNS record found
  • Fly.io’s DNS (fdaa::3) responds but returns no answer.
  • Symptoms:
    • NGINX cannot proxy requests to the backend.
    • Fly.io health checks fall back to port 8080 (undesired).

Debugging Steps Taken

  1. Verified both apps are in the same org:
fly status -a x-realestate-server
fly status -a x-realestate
  1. Confirmed backend is running:
fly logs -a x-realestate-server | grep "Listening on"
# Output: Listening on 0.0.0.0:8000
  1. Tested DNS from NGINX container:
nslookup x-realestate-server.internal
# Error: "Can't find x-realestate-server.internal: No answer"
  1. Checked Fly.io network:
fly ips list -a x-realestate-server
# IPv6: fdaa:0:ac95:a7b:3f8:ad38:f036:2

Request

  1. Is there a known issue with .internal DNS resolution in ams?
  2. How can I force Fly.io to register the DNS record for x-realestate-server.internal?
  3. Fallback options? (e.g., hardcoding IPs in NGINX).

Logs & Configs:

Hi @yaniv :waving_hand:

Your app x-realestate-server does not seem to have a running machine. Machines that are not currently running will not be returned as part of responses to .internal queries.

If what you are looking for is to have x-realestate-server automatically started when x-realestate requests it, please note that this is only possible if you send requests through our proxy, and .internal addresses are explicitly not routed through our proxy. To have internally-routable proxied addresses, see Flycast.

Also, your fly.toml contains depends_on with .internal addresses. That is not a supported configuration.

Thank you for your last response.
After implemented the above, I have concerns regarding:

1. dual-stack networking conflict in Fly.io’s environment:

curl -g -6 "http://[fdaa:a:ac95:0:1::2]:8000/health" → Connection reset
curl http://127.0.0.1:8000/health → Works
  • Express is properly bound to :: (IPv6 wildcard)
  • Fly.io’s network uses IPv6 primarily
  • Connection resets indicate a network-layer handshake failure

2. Key Findings:

a) Firewall/Network Policy Conflict:

2025-06-21T18:59:09Z [error] 660#660: *1 x-realestate-client.flycast could not be resolved (3: Host not found)
  • Nginx can’t resolve .flycast domains despite correct resolver config
  • Fly.io’s internal DNS (fdaa::3) is intermittently failing

b) Protocol Handling Issue:

Machine started in 3.072s → Health check fails
  • HTTP version mismatch between services
  • Express sends HTTP/1.1 while Nginx expects HTTP/1.0

c) Cold Start Timing:

Machine started in 3.072s → Health check fails
  • Services start before network stack is fully ready
  • DNS resolution fails during initial 5-10 seconds

Files Link

I wonder if this is an generated text placeholder?

if you looking for the link, then its on the Logs & Configs : section,
Otherwise can you be more specific.

Ah, so it is. It’s sometimes worth tidying AI output for this reason.

thank you for that.
but if you looking for the current file links so this hereLink

a bit more info:

1. IPv6 DNS Resolution Issue

Evidence:

2025-06-23T04:50:41Z [error] x-realestate-client.flycast could not be resolved (3: Host not found)
2025-06-23T07:05:23Z [error] nslookup: couldn't get address for '[fdaa::3]': not found

Odd Behavior:

  • .flycast hostnames fail resolution despite being Fly.io’s official internal DNS
  • Their own resolver [fdaa::3] sometimes fails lookup commands

Forum Post Suggestion:
“Internal DNS resolution intermittently fails for .flycast domains and [fdaa::3] resolver in AMS region”

2. Connection Reset Quirk

Evidence:

2025-06-23T07:05:23Z [error] Recv failure: Connection reset by peer
curl: (56) Recv failure: Connection reset by peer

Pattern:

  • Occurs when curling between services via IPv6
  • Despite services listening on correct ports (netstat confirms)

3. Health Check False Positives

Evidence:

2025-06-23T07:05:22Z [error] Health check failed (port 8080)
2025-06-23T07:05:22Z [notice] Nginx started successfully

Inconsistency:

  • Health checks fail despite Nginx being fully operational
  • Occurs during VM startup sequence

Broken Internal DNS

x-realestate-client.flycast could not be resolved (3: Host not found)
nslookup: couldn't get address for '[fdaa::3]'

Fly’s internal DNS system is failing in AMS region.

TCP Connection Resets:

Recv failure: Connection reset by peer (56)

Occurs on all IPv6 connections between services.
Health Check Flapping:

Health check failed → Nginx started successfully → Health check passed
  1. Monitoring system isn’t synced with service readiness.

Application Insights

  1. Services Are Healthy:
  • Both client and server respond via public URLs
  • Issues are purely network-layer
  1. Cold Start Behavior:
Machine started in 1.119s → Health check failed → Became reachable in 1.235s

Any Helper response will be appreciated. its not working.

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.