Configure pihole to listen only on the Tailscale network

I used this guide to set up a Tailscale node:

It works well, but my primary goal isn’t to use Fly as an exit node; rather, I want to host a pihole that listens only for requests from my Tailscale network.

Before deploying the pihole image, my fly.toml looked like this:

# fly.toml file generated for sparkling-snow-565 on 2021-11-18T20:15:20+05:30

app = "sparkling-snow-565"

kill_signal = "SIGINT"
kill_timeout = 5

[env]
  PORT = "41641"

[experimental]
  auto_rollback = false
  private_network = true

[[services]]
  internal_port = 41641
  protocol = "udp"

  [[services.ports]]
    port = "41641"

And the Dockerfile looked like this:

ARG TSFILE=tailscale_1.16.2_amd64.tgz

FROM alpine:latest as tailscale
ARG TSFILE
WORKDIR /app
RUN wget https://pkgs.tailscale.com/stable/${TSFILE} && \
  tar xzf ${TSFILE} --strip-components=1
COPY . ./


FROM alpine:latest
RUN apk update && apk add ca-certificates iptables ip6tables iproute2 && rm -rf /var/cache/apk/*

# Copy binary to production image
COPY --from=tailscale /app/start.sh /app/start.sh
COPY --from=tailscale /app/tailscaled /app/tailscaled
COPY --from=tailscale /app/tailscale /app/tailscale
RUN mkdir -p /var/run/tailscale
RUN mkdir -p /var/cache/tailscale
RUN mkdir -p /var/lib/tailscale

# Run on container startup.
USER root
CMD ["/app/start.sh"]

Deploying this worked well, and a Tailscale node was created.

Now, I am trying to integrate Pihole into this existing app:

So, I updated the fly.toml file to this:

# fly.toml file generated for sparkling-snow-565 on 2021-11-18T20:15:20+05:30

app = "sparkling-snow-565"

kill_signal = "SIGINT"
kill_timeout = 5

[env]
  PORT = "41641"

[experimental]
  auto_rollback = false
  private_network = true

[[services]]
  internal_port = 41641
  protocol = "udp"

  [[services.ports]]
    port = "41641"

[[services]]
  internal_port = 53
  protocol = "udp"

  [[services.ports]]
    port = "53"

[[services]]
  internal_port = 80
  protocol = "tcp"

  [services.concurrency]
    hard_limit = 25
    soft_limit = 20

  [[services.ports]]
    handlers = []
    port = "80"

  [[services.ports]]
    handlers = ["tls"]
    port = "443"

  [[services.tcp_checks]]
    interval = 10000
    timeout = 2000

And the Dockerfile to this:

ARG TSFILE=tailscale_1.16.2_amd64.tgz

FROM pihole/pihole:latest

ENV INTERFACE tailscale0
ENV DNSMASQ_LISTENING ALL

FROM alpine:latest as tailscale
ARG TSFILE
WORKDIR /app
RUN wget https://pkgs.tailscale.com/stable/${TSFILE} && \
  tar xzf ${TSFILE} --strip-components=1
COPY . ./


FROM alpine:latest
RUN apk update && apk add ca-certificates iptables ip6tables iproute2 && rm -rf /var/cache/apk/*

# Copy binary to production image
COPY --from=tailscale /app/start.sh /app/start.sh
COPY --from=tailscale /app/tailscaled /app/tailscaled
COPY --from=tailscale /app/tailscale /app/tailscale
RUN mkdir -p /var/run/tailscale
RUN mkdir -p /var/cache/tailscale
RUN mkdir -p /var/lib/tailscale

# Run on container startup.
USER root
CMD ["/app/start.sh"]

Running fly deploy gets through the Docker build, but the deployment eventually hits a critical health check and the app fails:

➜  fly-tailscale-exit git:(main) ✗ fly deploy
Deploying sparkling-snow-565
==> Validating app configuration
--> Validating app configuration done
Services
UDP 41641 ⇢ 41641
UDP 53 ⇢ 53
TCP 80/443 ⇢ 80
Remote builder fly-builder-bold-tree-8037 ready
==> Creating build context
--> Creating build context done
==> Building image with Docker
--> docker host: 20.10.10 linux x86_64
Sending build context to Docker daemon  33.39kB
Step 1/19 : ARG TSFILE=tailscale_1.16.2_amd64.tgz
Step 2/19 : FROM pihole/pihole:latest
 ---> 38a1df57a04f
Step 3/19 : ENV INTERFACE tailscale0
 ---> Using cache
 ---> 87105740cad2
Step 4/19 : ENV DNSMASQ_LISTENING ALL
 ---> Using cache
 ---> 52980a20c17e
Step 5/19 : FROM alpine:latest as tailscale
 ---> 0a97eee8041e
Step 6/19 : ARG TSFILE
 ---> Using cache
 ---> 7b8058ccc7c3
Step 7/19 : WORKDIR /app
 ---> Using cache
 ---> ba46e5420393
Step 8/19 : RUN wget https://pkgs.tailscale.com/stable/${TSFILE} &&   tar xzf ${TSFILE} --strip-components=1
 ---> Using cache
 ---> 941e787775d1
Step 9/19 : COPY . ./
 ---> fb7cce6a7310
Step 10/19 : FROM alpine:latest
 ---> 0a97eee8041e
Step 11/19 : RUN apk update && apk add ca-certificates iptables ip6tables iproute2 && rm -rf /var/cache/apk/*
 ---> Using cache
 ---> c5d75753fe14
Step 12/19 : COPY --from=tailscale /app/start.sh /app/start.sh
 ---> Using cache
 ---> 9f9fc87b2e51
Step 13/19 : COPY --from=tailscale /app/tailscaled /app/tailscaled
 ---> Using cache
 ---> 719067f99154
Step 14/19 : COPY --from=tailscale /app/tailscale /app/tailscale
 ---> Using cache
 ---> af3e0990642f
Step 15/19 : RUN mkdir -p /var/run/tailscale
 ---> Using cache
 ---> 530ac0bdd2dc
Step 16/19 : RUN mkdir -p /var/cache/tailscale
 ---> Using cache
 ---> 5be568d0b54d
Step 17/19 : RUN mkdir -p /var/lib/tailscale
 ---> Using cache
 ---> 03d070d963cb
Step 18/19 : USER root
 ---> Using cache
 ---> c505d5e5a92e
Step 19/19 : CMD ["/app/start.sh"]
 ---> Running in 04e645d042be
 ---> 34b5282ba8db
Successfully built 34b5282ba8db
Successfully tagged registry.fly.io/sparkling-snow-565:deployment-1637255004
--> Building image done
==> Pushing image to fly
The push refers to repository [registry.fly.io/sparkling-snow-565]
8b9c1be36dd5: Layer already exists
6222eb63e256: Layer already exists
731bf69a19de: Layer already exists
01543b960e98: Layer already exists
a403cb938d74: Layer already exists
919745deef4d: Layer already exists
a851936b4701: Layer already exists
1a058d5342cc: Layer already exists
deployment-1637255004: digest: sha256:8618b1c4cd26512ca7988b7e96bc6656cd792bc0d4656f9ea0a886ffe5c20d56 size: 1989
--> Pushing image done
Image: registry.fly.io/sparkling-snow-565:deployment-1637255004
Image size: 40 MB
==> Creating release
Release v4 created

You can detach the terminal anytime without stopping the deployment
Monitoring Deployment

1 desired, 1 placed, 0 healthy, 1 unhealthy [health checks: 1 total, 1 critical]
v4 failed - Failed due to unhealthy allocations
Failed Instances

==> Failure #1

Instance
  ID            = e39871e1
  Process       =
  Version       = 4
  Region        = sin
  Desired       = run
  Status        = running
  Health Checks = 1 total, 1 critical
  Restarts      = 0
  Created       = 4m57s ago

Recent Events
TIMESTAMP            TYPE       MESSAGE
2021-11-18T17:03:55Z Received   Task received by client
2021-11-18T17:03:55Z Task Setup Building Task Directory
2021-11-18T17:04:04Z Started    Task started by client

Recent Logs
2021-11-18T17:04:10.000 [info] Tailscale started. Lets go!
2021-11-18T17:04:10.000 [info] 2021/11/18 17:04:10 control: NetInfo: NetInfo{varies=false hairpin=false ipv6=false udp=true derp=#3 portmap= link=""}
2021-11-18T17:04:10.000 [info] 2021/11/18 17:04:10 control: cancelMapSafely: synced=false
2021-11-18T17:04:10.000 [info] 2021/11/18 17:04:10 control: cancelMapSafely: channel was full
2021-11-18T17:04:10.000 [info] 2021/11/18 17:04:10 derphttp.Client.Connect: connecting to derp-3 (sin)
2021-11-18T17:04:10.000 [info] 2021/11/18 17:04:10 ipnserver: conn2: ReadMsg: EOF
2021-11-18T17:04:10.000 [info] 2021/11/18 17:04:10 LinkChange: minor
2021-11-18T17:04:10.000 [info] 2021/11/18 17:04:10 magicsock: ReSTUN: endpoint update active, need another later ("link-change-minor")
2021-11-18T17:04:10.000 [info] 2021/11/18 17:04:10 magicsock: derp-3 connected; connGen=1
2021-11-18T17:04:10.000 [info] 2021/11/18 17:04:10 health("overall"): ok
2021-11-18T17:04:10.000 [info] 2021/11/18 17:04:10 control: mapRoutine: new map request during PollNetMap. canceling.
2021-11-18T17:04:10.000 [info] 2021/11/18 17:04:10 control: PollNetMap: EOF
2021-11-18T17:04:10.000 [info] 2021/11/18 17:04:10 control: PollNetMap: stream=true :0 ep=[145.40.71.23:41641 172.19.0.122:41641 172.19.0.123:41641 [2604:1380:40e1:1a02:0:e398:71e1:1]:41641]
2021-11-18T17:04:11.000 [info] 2021/11/18 17:04:11 netcheck: report: udp=true v6=false mapvarydest=false hair=false portmap= v4a=145.40.71.23:41641 derp=3 derpdist=3v4:3ms,6v4:41ms,7v4:88ms
2021-11-18T17:04:11.000 [info] 2021/11/18 17:04:11 magicsock: starting endpoint update (link-change-minor)
2021-11-18T17:04:11.000 [info] 2021/11/18 17:04:11 [RATELIMIT] format("magicsock: starting endpoint update (%s)")
2021-11-18T17:04:11.000 [info] 2021/11/18 17:04:11 control: mapRoutine: netmap received: state:synchronized
2021-11-18T17:04:11.000 [info] 2021/11/18 17:04:11 control: sendStatus: mapRoutine-got-netmap: state:synchronized
2021-11-18T17:04:11.000 [info] 2021/11/18 17:04:11 netmap diff: (none)
2021-11-18T17:04:11.000 [info] 2021/11/18 17:04:11 netcheck: report: udp=true v6=false mapvarydest=false hair=false portmap= v4a=145.40.71.23:41641 derp=3 derpdist=3v4:1ms,6v4:41ms,7v4:88ms
2021-11-18T17:04:11.000 [info] 2021/11/18 17:04:11 control: HostInfo: {"IPNVersion":"1.16.2-tf44911179-g3853a3b92","BackendLogID":"89815942651e6726d9c12b276e21f6d89745158b4a28158026874c36600e0b97","OS":"linux","OSVersion":"Alpine Linux v3.14; kernel=5.12.2; env=fly","Hostname":"fly-sin","GoArch":"amd64","RoutableIPs":["0.0.0.0/0","::/0"],"Services":[{"Proto":"tcp","Port":22,"Description":"hallpass"},{"Proto":"tcp","Port":63725,"Description":"tailscaled"},{"Proto":"peerapi4","Port":63725},{"Proto":"peerapi6","Port":63725}],"NetInfo":{"MappingVariesByDestIP":false,"HairPinning":false,"WorkingIPv6":false,"WorkingUDP":true,"UPnP":false,"PMP":false,"PCP":false,"PreferredDERP":3,"DERPLatency":{"1-v4":0.239649821,"10-v4":0.208511041,"12-v4":0.256381204,"2-v4":0.203055821,"3-v4":0.001723592,"4-v4":0.180251099,"5-v4":0.094905633,"6-v4":0.041223971,"7-v4":0.087489326,"8-v4":0.153446486,"9-v4":0.207843529}}}
2021-11-18T17:04:11.000 [info] 2021/11/18 17:04:11 control: PollNetMap: stream=false :0 ep=[145.40.71.23:41641 172.19.0.122:41641 172.19.0.123:41641 [2604:1380:40e1:1a02:0:e398:71e1:1]:41641]
2021-11-18T17:04:11.000 [info] 2021/11/18 17:04:11 [RATELIMIT] format("control: PollNetMap: stream=%v :%v ep=%v")
2021-11-18T17:04:11.000 [info] 2021/11/18 17:04:11 control: successful lite map update in 178ms
2021-11-18T17:04:11.000 [info] 2021/11/18 17:04:11 control: mapRoutine: netmap received: state:synchronized
2021-11-18T17:04:11.000 [info] 2021/11/18 17:04:11 control: sendStatus: mapRoutine-got-netmap: state:synchronized
2021-11-18T17:04:11.000 [info] 2021/11/18 17:04:11 [RATELIMIT] format("control: sendStatus: %s: %v")
2021-11-18T17:04:11.000 [info] 2021/11/18 17:04:11 netmap diff: (none)
2021-11-18T17:04:24.000 [warn] Health check status changed 'passing' => 'warning'
2021-11-18T17:04:40.000 [error] Health check status changed 'warning' => 'critical'
***v4 failed - Failed due to unhealthy allocations and deploying as v5

Troubleshooting guide at https://fly.io/docs/getting-started/troubleshooting/
Error abort

I had a look at the v4 instance’s logs too, but I am not sure what’s happening here:

➜  fly-tailscale-exit git:(main) ✗ fly status -a sparkling-snow-565 --all
App
  Name     = sparkling-snow-565
  Owner    = personal
  Version  = 4
  Status   = running
  Hostname = sparkling-snow-565.fly.dev

Deployment Status
  ID          = 08933acc-56c4-51ec-7b06-26ed7fca9f66
  Version     = v4
  Status      = failed
  Description = Failed due to unhealthy allocations
  Instances   = 1 desired, 1 placed, 0 healthy, 1 unhealthy

Instances
ID       PROCESS VERSION REGION DESIRED STATUS   HEALTH CHECKS       RESTARTS CREATED
e39871e1 app     4 ⇡     sin    stop    complete 1 total, 1 critical 0        6m39s ago
23aeb43d app     3       sin    run     running                      0        9m50s ago
b082b691 app     2       sin    stop    complete                     0        13m0s ago
ea3fcde1 app     1       sin    stop    complete                     0        2h13m ago
59a3e9cb app     0       sin    stop    complete                     0        2h19m ago

➜  fly-tailscale-exit git:(main) ✗ fly logs -a sparkling-snow-565 -i e39871e1
.
.
.
.
2021-11-18T17:04:11.460 app[e39871e1] sin [info] 2021/11/18 17:04:11 control: HostInfo: {"IPNVersion":"1.16.2-tf44911179-g3853a3b92","BackendLogID":"89815942651e6726d9c12b276e21f6d89745158b4a28158026874c36600e0b97","OS":"linux","OSVersion":"Alpine Linux v3.14; kernel=5.12.2; env=fly","Hostname":"fly-sin","GoArch":"amd64","RoutableIPs":["0.0.0.0/0","::/0"],"Services":[{"Proto":"tcp","Port":22,"Description":"hallpass"},{"Proto":"tcp","Port":63725,"Description":"tailscaled"},{"Proto":"peerapi4","Port":63725},{"Proto":"peerapi6","Port":63725}],"NetInfo":{"MappingVariesByDestIP":false,"HairPinning":false,"WorkingIPv6":false,"WorkingUDP":true,"UPnP":false,"PMP":false,"PCP":false,"PreferredDERP":3,"DERPLatency":{"1-v4":0.239649821,"10-v4":0.208511041,"12-v4":0.256381204,"2-v4":0.203055821,"3-v4":0.001723592,"4-v4":0.180251099,"5-v4":0.094905633,"6-v4":0.041223971,"7-v4":0.087489326,"8-v4":0.153446486,"9-v4":0.207843529}}}
2021-11-18T17:04:11.460 app[e39871e1] sin [info] 2021/11/18 17:04:11 control: PollNetMap: stream=false :0 ep=[145.40.71.23:41641 172.19.0.122:41641 172.19.0.123:41641 [2604:1380:40e1:1a02:0:e398:71e1:1]:41641]
2021-11-18T17:04:11.460 app[e39871e1] sin [info] 2021/11/18 17:04:11 [RATELIMIT] format("control: PollNetMap: stream=%v :%v ep=%v")
2021-11-18T17:04:11.638 app[e39871e1] sin [info] 2021/11/18 17:04:11 control: successful lite map update in 178ms
2021-11-18T17:04:11.639 app[e39871e1] sin [info] 2021/11/18 17:04:11 control: mapRoutine: netmap received: state:synchronized
2021-11-18T17:04:11.639 app[e39871e1] sin [info] 2021/11/18 17:04:11 control: sendStatus: mapRoutine-got-netmap: state:synchronized
2021-11-18T17:04:11.640 app[e39871e1] sin [info] 2021/11/18 17:04:11 [RATELIMIT] format("control: sendStatus: %s: %v")
2021-11-18T17:04:11.640 app[e39871e1] sin [info] 2021/11/18 17:04:11 netmap diff: (none)
2021-11-18T17:04:24.300 proxy[e39871e1] sin [warn] Health check status changed 'passing' => 'warning'
2021-11-18T17:04:40.213 proxy[e39871e1] sin [error] Health check status changed 'warning' => 'critical'
2021-11-18T17:09:11.113 runner[e39871e1] sin [info] Shutting down virtual machine

The key difference between the Fli-hole guide and my Dockerfile is the interface change: Fli-hole listens on eth0, but I changed that to tailscale0.

Does anyone have any ideas on why this might be failing? Thanks for taking a look!

Could you try a fly checks list as well? It looks like there’s a health check on the app, although I don’t see one in the fly.toml.

I am seeing this:

➜  fly-tailscale-exit git:(main) ✗ fly checks list
Health Checks for sparkling-snow-565
NAME                             STATUS  ALLOCATION REGION TYPE LAST UPDATED OUTPUT
bed96a2ee34630fd843bdaf1b39b7990 warning 58405d0a   sin    TCP  8s ago

➜  fly-tailscale-exit git:(main) ✗ fly checks list
Health Checks for sparkling-snow-565
NAME                             STATUS   ALLOCATION REGION TYPE LAST UPDATED OUTPUT
bed96a2ee34630fd843bdaf1b39b7990 critical 58405d0a   sin    TCP  5s ago       dial tcp 172.19.2.162:80:
                                                                              connect: connection refused

It eventually fails with an unhealthy allocations message, same as earlier:

.
.
.
==> Pushing image to fly
The push refers to repository [registry.fly.io/sparkling-snow-565]
8b9c1be36dd5: Layer already exists
6222eb63e256: Layer already exists
731bf69a19de: Layer already exists
01543b960e98: Layer already exists
a403cb938d74: Layer already exists
919745deef4d: Layer already exists
a851936b4701: Layer already exists
1a058d5342cc: Layer already exists
deployment-1637259989: digest: sha256:550b1cb26c2e5938a300f96e2dc01908879684ad579a465bb2c7ce4e42d85202 size: 1989
--> Pushing image done
Image: registry.fly.io/sparkling-snow-565:deployment-1637259989
Image size: 40 MB
==> Creating release
Release v7 created

You can detach the terminal anytime without stopping the deployment
Monitoring Deployment

1 desired, 1 placed, 0 healthy, 1 unhealthy
v7 failed - Failed due to unhealthy allocations
***v7 failed - Failed due to unhealthy allocations and deploying as v8

Troubleshooting guide at https://fly.io/docs/getting-started/troubleshooting/
Error abort

➜  fly-tailscale-exit git:(main) ✗ fly checks list
Health Checks for sparkling-snow-565
NAME                             STATUS   ALLOCATION REGION TYPE LAST UPDATED OUTPUT
bed96a2ee34630fd843bdaf1b39b7990 critical 58405d0a   sin    TCP  4m59s ago    dial tcp 172.19.2.162:80:
                                                                              connect: connection refused

It looks like there’s a TCP check registered from an earlier deploy. It should be possible to remove it by adding an empty [[services.tcp_checks]] under [[services]]: if the key is missing entirely, the CLI currently leaves the existing check alone rather than actively zeroing it.

Out of curiosity, though, is it your intention to expose port 41641 publicly? Even without a services block, you’ll still be able to reach it from your internal network; the services section is for making ports publicly accessible.
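(For instance, from another VM on your Fly private network, with the resolver pointed at Fly’s internal DNS, something like the following should show the app’s private IPv6 address, and the Tailscale UDP port would be reachable on that address without any [[services]] entry. Just a sketch, reusing the app name from above.)

# resolve the app's 6PN address via Fly's internal DNS
dig +short aaaa sparkling-snow-565.internal
# 41641/udp should then be reachable on the returned fdaa: address directly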

If you do decide to drop the services, maybe deploy once with the empty tcp_checks array so the existing checks are all cancelled? After that you should be able to drop the section completely from your config.
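Roughly like this, for that one deploy (a sketch based on the port 80 block above; the empty check table is the only change, and I haven’t tested this exact snippet):

[[services]]
  internal_port = 80
  protocol = "tcp"

  [[services.ports]]
    handlers = []
    port = "80"

  # intentionally left empty: deploying once with this should clear the TCP check
  # registered earlier, after which the whole [[services]] block can be removed
  [[services.tcp_checks]]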

Thanks for sharing that! I haven’t tried it yet because I took a different approach: I installed pihole based on the official guide and then SSH’d into the instance to add Tailscale. This works well for me, although pihole is not listening only on the Tailscale network.
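In rough terms it was this (reconstructing from memory, so treat the exact commands as approximate; the app name is a placeholder):

fly ssh console -a <pihole-app>    # root shell inside the running instance
# inside the VM: fetch the same static Tailscale build used in the Dockerfile above
wget https://pkgs.tailscale.com/stable/tailscale_1.16.2_amd64.tgz
tar xzf tailscale_1.16.2_amd64.tgz --strip-components=1 -C /usr/sbin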

Hi Arun,

I followed your method of installing pihole and then adding tailscale to the instance via SSH, but I couldn’t start Tailscale with tailscale up because it requires systemctl and systemd, and AFAIK systemd can’t be installed inside a container.
How did you manage to run tailscale inside the pihole instance?

I had to switch to legacy iptables and then run ./tailscaled from the /usr/sbin folder. From there, I could run sudo tailscale up.

Stopping ./tailscaled stops Tailscale, so as a workaround for now, I just leave the tab where ./tailscaled is running open.

I am pretty sure that’s not how it’s supposed to be done, but it works for now.
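For reference, the sequence was something along these lines (again from memory on the Debian-based pihole image, so the exact commands are approximate; nohup is just one way to keep tailscaled alive without a dedicated terminal session):

# switch to the legacy iptables backend (install iptables first if the image doesn't ship it)
update-alternatives --set iptables /usr/sbin/iptables-legacy
update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy

# no systemd in the container, so start the daemon by hand
mkdir -p /var/lib/tailscale /var/run/tailscale
cd /usr/sbin
nohup ./tailscaled --state=/var/lib/tailscale/tailscaled.state > /var/log/tailscaled.log 2>&1 &

# then authenticate the node
tailscale up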


Thank you very much for the guidance! I finally managed to run them both in one instance. If only fly.io had something like a docker0 interface, connecting the two apps would be easy.

Cheers!
