unspecified DNS error

Hi, I have a Django app that connects to a Postgres database, and the app has stopped working. It raises an OperationalError: could not translate host name “top2.nearest.of.[DATABASE NAME].internal” to address: No address associated with hostname

I looked at the logs of my database and it says the following:
Server bk_db/pg1 is going DOWN for maintenance (unspecified DNS error). 0 active and 1 backup servers left. Running on backup. 0 sessions active, 0 requeued, 0 remaining in queue.

Is there anything I can do, or do I need to wait this ‘maintenance’ out? And why is this happening in the first place?
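The error above means the hostname lookup itself failed, before any connection to Postgres was attempted. Here is a minimal Python sketch for confirming that from inside the app's VM, using only the standard library. `resolve_host` and the hostname in the usage comment are hypothetical names for illustration, not part of Django or flyctl.

```python
import socket


def resolve_host(hostname, port=5432):
    """Return the sorted addresses a hostname resolves to, or [] if lookup fails.

    5432 is the default Postgres port; it is only passed through to
    getaddrinfo and does not cause any connection to be opened.
    """
    try:
        infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # Same class of failure psycopg reports as
        # "could not translate host name ... to address".
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})


# Usage (hypothetical hostname, replace with the one from your DATABASE_URL):
#   addrs = resolve_host("my-db-app.internal")
#   print(addrs or "hostname does not resolve")
```

An empty list points at DNS rather than at Postgres, which matches the "unspecified DNS error" in the proxy log.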

I am having the same issue

2023-02-08T13:13:40Z app[32871edf415485] ams [info]proxy    | [WARNING] 038/131340 (562) : Server bk_db/pg1 is going DOWN for maintenance (unspecified DNS error). 0 active and 1 backup servers left. Running on backup. 0 sessions active, 0 requeued, 0 remaining in queue.

We’re seeing the exact same issue: “top2.nearest.of” doesn’t resolve, and neither does “[region].[app_name]”.

fly pg restart -a <app name> seems to have solved it for me.

Similar issues for us

[WARNING] 038/160203 (577) : Backup Server bk_db/pg is DOWN, reason: Layer7 timeout, check duration: 5000ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.

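My first reading was that a Layer7 timeout means “tried to do a TCP handshake, never got a response”, but in HAProxy-style health checks that would be a Layer4 timeout. Layer7 means the TCP connection was made and the application-level check got no reply within the check duration (5000ms here). So the good news is DNS seems to be resolving now, but the health check still isn’t completing…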

SSHing into the box reveals the database appears to be in good spirits: we can query the DB through psql -U <username> on the CLI. Hopefully this is a network issue Fly will fix on their end?
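To separate "the name resolves" from "the port accepts connections", a small reachability check can help. This is a sketch under assumptions: `tcp_connects` is a hypothetical helper, 5432 is the default Postgres port rather than something confirmed in this thread, and a successful TCP connect says nothing about whether the proxy's application-level health check passes.

```python
import socket


def tcp_connects(hostname, port=5432, timeout=5.0):
    """Return True if a TCP connection to (hostname, port) can be opened.

    Returns False on DNS failure, refusal, or timeout; all of these are
    raised as OSError (or a subclass) by socket.create_connection.
    """
    try:
        # create_connection resolves the name and tries each address in turn.
        with socket.create_connection((hostname, port), timeout=timeout):
            return True
    except OSError:
        return False


# Usage (hypothetical hostname):
#   print(tcp_connects("my-db-app.internal", 5432))
```

If this returns True while the proxy still reports a Layer7 timeout, the problem is more likely in the health check endpoint than in DNS or basic networking.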

Restart worked for one of our apps, but now another is having the same issues and is unable to restart and we’re not able to connect, even using the direct [app_name].internal address.

Doing this gave me an error that there were no VMs available. Do you know if there is something to do about that?

Correction: it says “failed to list VMs”.

Hi @kaiden-exe, it looks like your database is on a host that experienced a hardware failure earlier today. We’ve been working to restore it. Running fly pg restart -a <app_name> should fix your DNS issues. I would try the command again to see if you can now restart the instance.

When it comes back, I would also recommend adding a second Postgres node to your cluster. Since we host each Postgres instance and volume on separate hosts, this is the best way to ensure you’re resilient to hardware issues. You can find instructions on how to do so here: High Availability & Global Replication · Fly Docs

I could not restart it while the issue was there, but I just opened the website and everything works without restarting. Thanks for the reply!

I recently launched this app for testing. I am new to Fly.io.

I am also getting this DNS error, both when visiting the site and when trying to deploy a change.

Restarting Postgres does not work…

mathieu:go_out_nearby :-( 1 (main) $ fly checks list -a go-out-nearby
Health Checks for go-out-nearby
  NAME                             | STATUS  | ALLOCATION | REGION | TYPE | LAST UPDATED         | OUTPUT                                     
-----------------------------------*---------*------------*--------*------*----------------------*--------------------------------------------
  3df2415693844068640885b45074b954 | passing | c1ba19a2   | nrt    | TCP  | 2023-02-03T19:11:51Z | TCP connect 172.19.6.210:8080: Success[✓]  
                                   |         |            |        |      |                      |                                            
                                   |         |            |        |      |                      |                                            
mathieu:go_out_nearby :-) (main) $ fly pg restart -a go-out-nearby
Error app go-out-nearby is not a postgres app


mathieu:go_out_nearby :-( 1 (main) $ fly pg restart -a go-out-nearby-db
Error no active leader found


mathieu:go_out_nearby :-( 1 (main) $ fly checks list -a go-out-nearby-db
Health Checks for go-out-nearby-db
  NAME | STATUS  | MACHINE        | LAST UPDATED | OUTPUT                     
-------*---------*----------------*--------------*----------------------------
  pg   | warning | 1781775c922989 | 0s           | waiting for status update  
-------*---------*----------------*--------------*----------------------------
  role | warning | 1781775c922989 | 0s           | waiting for status update  
-------*---------*----------------*--------------*----------------------------
  vm   | warning | 1781775c922989 | just now     | waiting for status update  
-------*---------*----------------*--------------*----------------------------
mathieu:go_out_nearby :-) (main) $ 

Update: I think I found my answer…

TL;DR: Postgres has crashed and is not managed… so I need to fix it myself… OMG

Hi, same issue for me.
Postgres is happily working on machine (psql -h localhost -U postgres works), but fly proxy -a %app%-db gives

Error %app%-db.internal: host was not found in DNS

In logs the most suspicious line is

proxy | [WARNING] 067/163706 (594) : Server bk_db/pg1 is going DOWN for maintenance (unspecified DNS error). 0 active and 1 backup servers left. Running on backup. 0 sessions active, 0 requeued, 0 remaining in queue.

However, I can not restart it. fly apps restart %app%-db gives

Error postgres apps should use fly pg restart instead

but fly pg restart -a %app%-db returns

Error no active leader found

And finally, fly machine restart -a %app%-db %database-vm-id% returns

Error failed to restart machine %database-vm-id%: could not stop machine %database-vm-id%: failed to restart VM %database-vm-id%: machine ID %database-vm-id% lease currently held by %my email%

I’m kinda out of options: the DB is healthy, but I don’t understand how to connect the app to it. (I get an UnknownHostException.)


UPD: fly machine stop and fly machine start instead of restart seemed to help. But maybe it was something else that fixed it, I don’t know.