Intentional? Private networking DNS lookups fail when target is scaled to 0

So I have two backend services that I want to have communicate over the private network (DNS name <appname>.internal).

Everything works fine when the target is online, but when it is offline (scaled to zero), the DNS lookup fails and I am forced to use the public domain name instead (<appname>.fly.dev).

Is this intentional? And if so: does scaling also only work for public requests?

Since the concurrency config is on the internal port in the service configuration, I would assume scaling should work for internal requests as well… but it doesn’t seem to work correctly, at least not the DNS lookup :S


That is a good find; I was able to reproduce it just now.

[I] ➜ fly m stop 17811437b99628 -a APP_NAME
Sending kill signal to machine 17811437b99628...
17811437b99628 has been successfully stopped

[I] ➜ curl http://APP_NAME.internal:8080   
curl: (6) Could not resolve host: APP_NAME.internal

[I] ➜ dig AAAA APP_NAME.internal

; <<>> DiG 9.10.6 <<>> AAAA APP_NAME.internal
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 13959
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;APP_NAME.internal.        IN      AAAA

;; Query time: 130 msec
;; SERVER: fdaa:0:5463::3#53(fdaa:0:5463::3)
;; WHEN: Sun Jul 23 10:09:31 -03 2023
;; MSG SIZE  rcvd: 48


[I] ➜ fly m start 17811437b99628 -a APP_NAME
17811437b99628 has been started

[I] ➜ dig AAAA APP_NAME.internal            

; <<>> DiG 9.10.6 <<>> AAAA APP_NAME.internal
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 39344
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;APP_NAME.internal.        IN      AAAA

;; ANSWER SECTION:
APP_NAME.internal. 5 IN    AAAA    fdaa:0:5463:a7b:136:2300:520d:2

;; Query time: 109 msec
;; SERVER: fdaa:0:5463::3#53(fdaa:0:5463::3)
;; WHEN: Sun Jul 23 10:09:48 -03 2023
;; MSG SIZE  rcvd: 106

It even takes some time before I can curl APP_NAME.internal:8080.

I’ll make sure to bubble up this issue, but for now my recommendations are:

  • If you need those backend services to be private, always keep one machine running by setting min_machines_running = 1. See Fly Launch configuration (fly.toml) · Fly Docs.
  • If you can afford for those machines to be public, use the .fly.dev domains and the proxy will wake them up as needed.
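As a sketch, the first recommendation maps onto a fly.toml roughly like this (the port and other values here are placeholders, not taken from the original app):

```toml
# fly.toml (sketch) — keep at least one machine running so
# <appname>.internal always resolves
[[services]]
  protocol = "tcp"
  internal_port = 8080
  auto_stop_machines = true
  auto_start_machines = true
  min_machines_running = 1  # never scale all the way to zero
```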

Thanks for confirming. Yep, I’m just routing the traffic through the public proxy for now. Hopefully it won’t count towards egress when my Fly.io services reach out to the public proxy :smiley:

Here’s an update on this.

IPs are allocated to running machines, so DNS lookups for stopped machines won’t return results; this is by design.

That being said, you can get a static private IP for your app using Flycast.

Hope this helps with your use case!
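For reference, allocating a Flycast (private) IPv6 looks roughly like this; <app-name> is a placeholder:

```shell
# Allocate a static private IPv6 (Flycast) for the app
fly ips allocate-v6 --private -a <app-name>

# Verify it shows up with TYPE "private"
fly ips list -a <app-name>
```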

Ok, thanks. I was able to add the private IP, and the DNS lookup now succeeds.
I am also able to open a TCP connection to that private IP (on any port)…

but the connection doesn’t get passed on to the target service. Not quite sure what I’m doing wrong here.

The .internal address works:

root@......:/flycd# nc -v <app-name>.internal 3000 
Connection to <app-name>.internal (fdaa:......:4326:2) 3000 port [tcp/*] succeeded!
hej
HTTP/1.1 400 Bad Request
Connection: close

But .flycast doesn’t seem to forward the TCP connection:

root@......:/flycd# nc -v <app-name>.flycast 3000
Connection to <app-name>.flycast (....:1::2) 3000 port [tcp/*] succeeded!
hej

The IP returned by the <app-name>.flycast DNS lookup above matches the one created in fly ips list:

~> fly ips list -a <app-name>
VERSION   	IP                  	TYPE           	REGION	CREATED AT           
v6        	...........::69:35df	public         	global	2023-07-02T18:38:08Z	
private_v6	............:0:1::2  	private        	global	10m53s ago          	
v4        	............12       	public (shared)

The target service has a single services section (not publicly exposed):

~> fly config show -a <app-name>
{
  "app": <app-name>,
  "primary_region": <region>,
  "env": {
    ....
  },
  "services": [
    {
      "protocol": "tcp",
      "internal_port": 3000,
      "auto_stop_machines": true,
      "auto_start_machines": true,
      "min_machines_running": 1
    }
  ]
}

But no data seems to be sent through the internal/private proxy to the target service :S

Maybe I need to use --no-public-ips and then add a ports section to the config?

Turns out that this solves it:

  • delete all public IPs (since the service was initially created without --no-public-ips)
  • add a ports section or http_service to expose the internal port on the Flycast IP
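As a sketch, the working configuration would look something like this in fly.toml, assuming the same internal port 3000 as above and after releasing the public IPs (e.g. with fly ips release); the exact section layout is my reconstruction, not from the thread:

```toml
# fly.toml (sketch) — expose internal_port 3000 on the Flycast IP
[[services]]
  protocol = "tcp"
  internal_port = 3000
  auto_stop_machines = true
  auto_start_machines = true
  min_machines_running = 1

  # Without a ports section the proxy has nothing to listen on,
  # which is why the TCP connection was accepted but never forwarded.
  [[services.ports]]
    port = 3000
```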

I’m glad it’s working, thanks for sharing your use case with our community too!


This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.