Postgres app .internal hostname not resolving

One of our postgres DBs went offline today. After some exploration I discovered that I can still connect to it directly using its private IPv6 address, but its .internal hostname is not resolving. I also noticed that it’s down to a single instance (not sure what happened to the replica) and there are a bunch of sentinel warnings in the logs.

Status

App
  Name     = treefort-db-stg
  Owner    = treefort
  Version  = 7
  Status   = running
  Hostname = treefort-db-stg.fly.dev

Instances
ID       PROCESS VERSION REGION DESIRED STATUS           HEALTH CHECKS      RESTARTS CREATED
d2c87e00 app     7       sea    run     running (leader) 3 total, 3 passing 1        1h17m ago

Logs

2022-02-26T00:45:18.397 app[d2c87e00] sea [info] keeper   | 2022-02-26T00:45:18.396Z	INFO	cmd/keeper.go:1505	our db requested role is master
2022-02-26T00:45:18.398 app[d2c87e00] sea [info] keeper   | 2022-02-26T00:45:18.397Z	INFO	cmd/keeper.go:1543	already master
2022-02-26T00:45:18.414 app[d2c87e00] sea [info] keeper   | 2022-02-26T00:45:18.413Z	INFO	cmd/keeper.go:1676	postgres parameters not changed
2022-02-26T00:45:18.414 app[d2c87e00] sea [info] keeper   | 2022-02-26T00:45:18.414Z	INFO	cmd/keeper.go:1703	postgres hba entries not changed
2022-02-26T00:45:19.571 app[d2c87e00] sea [info] sentinel | 2022-02-26T00:45:19.570Z	WARN	cmd/sentinel.go:276	no keeper info available	{"db": "f1dce1cc", "keeper": "12de06e922"}
2022-02-26T00:45:19.571 app[d2c87e00] sea [info] sentinel | 2022-02-26T00:45:19.571Z	WARN	cmd/sentinel.go:276	no keeper info available	{"db": "59f4d4eb", "keeper": "ab806e912"}
2022-02-26T00:45:19.571 app[d2c87e00] sea [info] sentinel | 2022-02-26T00:45:19.571Z	WARN	cmd/sentinel.go:276	no keeper info available	{"db": "98edbff6", "keeper": "ac203d352"}
2022-02-26T00:45:20.667 app[d2c87e00] sea [info] proxy    | 2022-02-26T00:45:20.667Z	INFO	cmd/proxy.go:268	master address	{"address": "[fdaa:0:22f0:a7b:2c60:0:9c2d:2]:5433"}
2022-02-26T00:45:20.859 app[d2c87e00] sea [info] proxy    | 2022-02-26T00:45:20.859Z	INFO	cmd/proxy.go:286	proxying to master address	{"address": "[fdaa:0:22f0:a7b:2c60:0:9c2d:2]:5433"}
2022-02-26T00:45:23.517 app[d2c87e00] sea [info] keeper   | 2022-02-26T00:45:23.516Z	INFO	cmd/keeper.go:1505	our db requested role is master
2022-02-26T00:45:23.518 app[d2c87e00] sea [info] keeper   | 2022-02-26T00:45:23.517Z	INFO	cmd/keeper.go:1543	already master
2022-02-26T00:45:23.532 app[d2c87e00] sea [info] keeper   | 2022-02-26T00:45:23.532Z	INFO	cmd/keeper.go:1676	postgres parameters not changed
2022-02-26T00:45:23.533 app[d2c87e00] sea [info] keeper   | 2022-02-26T00:45:23.532Z	INFO	cmd/keeper.go:1703	postgres hba entries not changed
2022-02-26T00:45:25.126 app[d2c87e00] sea [info] sentinel | 2022-02-26T00:45:25.126Z	WARN	cmd/sentinel.go:276	no keeper info available	{"db": "98edbff6", "keeper": "ac203d352"}
2022-02-26T00:45:25.127 app[d2c87e00] sea [info] sentinel | 2022-02-26T00:45:25.126Z	WARN	cmd/sentinel.go:276	no keeper info available	{"db": "f1dce1cc", "keeper": "12de06e922"}
2022-02-26T00:45:25.127 app[d2c87e00] sea [info] sentinel | 2022-02-26T00:45:25.126Z	WARN	cmd/sentinel.go:276	no keeper info available	{"db": "59f4d4eb", "keeper": "ab806e912"}
2022-02-26T00:45:25.930 app[d2c87e00] sea [info] proxy    | 2022-02-26T00:45:25.930Z	INFO	cmd/proxy.go:268	master address	{"address": "[fdaa:0:22f0:a7b:2c60:0:9c2d:2]:5433"}
2022-02-26T00:45:26.127 app[d2c87e00] sea [info] proxy    | 2022-02-26T00:45:26.126Z	INFO	cmd/proxy.go:286	proxying to master address	{"address": "[fdaa:0:22f0:a7b:2c60:0:9c2d:2]:5433"}
2022-02-26T00:45:30.671 app[d2c87e00] sea [info] sentinel | 2022-02-26T00:45:30.670Z	WARN	cmd/sentinel.go:276	no keeper info available	{"db": "98edbff6", "keeper": "ac203d352"}
2022-02-26T00:45:30.671 app[d2c87e00] sea [info] sentinel | 2022-02-26T00:45:30.670Z	WARN	cmd/sentinel.go:276	no keeper info available	{"db": "98edbff6", "keeper": "ac203d352"}
2022-02-26T00:45:30.671 app[d2c87e00] sea [info] sentinel | 2022-02-26T00:45:30.671Z	WARN	cmd/sentinel.go:276	no keeper info available	{"db": "f1dce1cc", "keeper": "12de06e922"}
2022-02-26T00:45:31.197 app[d2c87e00] sea [info] proxy    | 2022-02-26T00:45:31.196Z	INFO	cmd/proxy.go:268	master address	{"address": "[fdaa:0:22f0:a7b:2c60:0:9c2d:2]:5433"}
2022-02-26T00:45:31.388 app[d2c87e00] sea [info] proxy    | 2022-02-26T00:45:31.388Z	INFO	cmd/proxy.go:286	proxying to master address	{"address": "[fdaa:0:22f0:a7b:2c60:0:9c2d:2]:5433"}

We had to migrate hardware in Seattle yesterday. For some reason one of your VMs didn’t start after your volumes migrated. I just brought it back, so you should have two now. The warnings just say that a node that was part of the cluster is no longer available, which is normal when hosts change. They will go away after ~24 hours, when the old node is garbage collected in Stolon.

Are you still having trouble resolving the internal address?

Address is resolving now, thanks!

@elliotdickison We think there was a stale IP address in DNS for your Postgres app. If you run into this again, will you run fly dig aaaa <db>.internal and see if the result matches the output of fly ips private?
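
If it helps, the comparison itself can be scripted. This is only a rough sketch, not anything flyctl provides: it assumes you paste the addresses from the two commands above into lists by hand, and the helper names (normalize, find_stale, resolve_aaaa) are made up for illustration. The second address in the demo is hypothetical; only the first one comes from the logs in this thread.

```python
import ipaddress
import socket


def normalize(addresses):
    """Parse address strings into IPv6Address objects so that different
    spellings of the same address (e.g. ':0:' vs ':0000:') compare equal."""
    return {ipaddress.IPv6Address(a.strip()) for a in addresses}


def find_stale(dns_addresses, private_addresses):
    """Return addresses that DNS hands out but that no instance owns."""
    return normalize(dns_addresses) - normalize(private_addresses)


def resolve_aaaa(hostname, port=5432):
    """Resolve IPv6 addresses with the system resolver.
    Assumption: this runs somewhere that can resolve .internal names,
    i.e. inside the private network. It is not called in the demo below."""
    infos = socket.getaddrinfo(
        hostname, port, family=socket.AF_INET6, type=socket.SOCK_STREAM
    )
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # Pasted by hand from the DNS lookup (second entry is a made-up stale one).
    from_dns = [
        "fdaa:0:22f0:a7b:2c60:0:9c2d:2",
        "fdaa:0:22f0:a7b:2c60:0:9c2d:3",
    ]
    # Pasted by hand from the private IP listing.
    from_instances = ["fdaa:0:22f0:a7b:2c60:0:9c2d:2"]

    for address in sorted(find_stale(from_dns, from_instances)):
        print("stale DNS entry:", address)
```

An empty result means every address in DNS belongs to a running instance, so the resolution problem is probably somewhere else.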

Will do!