Postgres database stopped suddenly (dead) in my Rails app

My Postgres database suddenly changed to dead status. I ran the following command to reactivate it:

fly scale count 1 -a mywarehouse-db

Now the database status is running, but the logs show errors:

2023-02-17T04:10:06.087 app[4d47e1a4] scl [info] sentinel | 2023-02-17T04:10:06.086Z WARN cmd/sentinel.go:276 no keeper info available {"db": "92168189", "keeper": "d33b28ceb2"}

2023-02-17T04:10:06.091 app[4d47e1a4] scl [info] sentinel | 2023-02-17T04:10:06.091Z ERROR cmd/sentinel.go:1018 no eligible masters

2023-02-17T04:10:13.361 app[4d47e1a4] scl [info] sentinel | 2023-02-17T04:10:13.361Z WARN cmd/sentinel.go:276 no keeper info available {"db": "92168189", "keeper": "d33b28ceb2"}

2023-02-17T04:10:13.366 app[4d47e1a4] scl [info] sentinel | 2023-02-17T04:10:13.365Z ERROR cmd/sentinel.go:1018 no eligible masters

2023-02-17T04:10:15.100 app[4d47e1a4] scl [info] exporter | INFO[3446] Established new database connection to "fdaa:0:c785:a7b:d33d:2:8cec:2:5433". source="postgres_exporter.go:970"

2023-02-17T04:10:16.101 app[4d47e1a4] scl [info] exporter | ERRO[3447] Error opening connection to database (postgresql://flypgadmin:PASSWORD_REMOVED@[fdaa:0:c785:a7b:d33d:2:8cec:2]:5433/postgres?sslmode=disable): dial tcp [fdaa:0:c785:a7b:d33d:2:8cec:2]:5433: connect: connection refused source="postgres_exporter.go:1658"

2023-02-17T04:10:20.659 app[4d47e1a4] scl [info] sentinel | 2023-02-17T04:10:20.658Z WARN cmd/sentinel.go:276 no keeper info available {"db": "92168189", "keeper": "d33b28ceb2"}

2023-02-17T04:10:20.663 app[4d47e1a4] scl [info] sentinel | 2023-02-17T04:10:20.663Z ERROR cmd/sentinel.go:1018 no eligible masters

2023-02-17T04:10:28.302 app[4d47e1a4] scl [info] sentinel | 2023-02-17T04:10:28.301Z WARN cmd/sentinel.go:276 no keeper info available {"db": "92168189", "keeper": "d33b28ceb2"}

2023-02-17T04:10:28.306 app[4d47e1a4] scl [info] sentinel | 2023-02-17T04:10:28.305Z ERROR cmd/sentinel.go:1018 no eligible masters

2023-02-17T04:10:30.097 app[4d47e1a4] scl [info] exporter | INFO[3461] Established new database connection to "fdaa:0:c785:a7b:d33d:2:8cec:2:5433". source="postgres_exporter.go:970"

2023-02-17T04:10:31.099 app[4d47e1a4] scl [info] exporter | ERRO[3462] Error opening connection to database (postgresql://flypgadmin:PASSWORD_REMOVED@[fdaa:0:c785:a7b:d33d:2:8cec:2]:5433/postgres?sslmode=disable): dial tcp [fdaa:0:c785:a7b:d33d:2:8cec:2]:5433: connect: connection refused source="postgres_exporter.go:1658"
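The same few messages repeat in the log above, so it can help to collapse them into counts before sharing them. Below is a minimal sketch, assuming the log lines have been saved to a local file in the same format as shown here; the function name `summarize_log_problems` and the file name in the usage comment are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read Fly log lines on stdin and print each distinct
# WARN/ERROR message once, prefixed by how many times it occurred.
# Steps: keep only warning/error lines, drop the Fly prefix up to "[info] ",
# drop the inner timestamp, normalise the exporter's ERRO[n] counter,
# then count identical lines.
summarize_log_problems() {
  grep -E ' (WARN|ERROR) |ERRO\[' |
    sed -E 's/^.*\[info\] //' |
    sed -E 's/[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9:.]+Z //' |
    sed -E 's/ERRO\[[0-9]+\]/ERRO/' |
    sort | uniq -c | sort -rn
}

# Usage (file name is an assumption):
#   summarize_log_problems < saved-logs.txt
```

For the excerpt above this would reduce the output to three distinct messages: the sentinel warning about missing keeper info, the sentinel "no eligible masters" error, and the exporter's refused connection on port 5433.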

I tried to restart the database with this command:

flyctl postgres restart -a mywarehouse-db

But it fails and shows the following message:

Error can't get role for 0ea7d1de-1dd3-1e7b-36eb: 500: context deadline exceeded

I hope someone can help me. This is a production database, and I need to save the data and fix the issue ASAP.

It is probably related to "Postgres app fails on restart/deployment".

Are you able to connect to your DB application console?

fly ssh console -a <some-application-db>

Then you can try to retrieve your data:

You can also try snapshots, if that works for you.

context deadline exceeded could be a problem with your network. Check this out:

If @angordeyev’s suggestion does not work, could you give more information, please?

What is:

  • the status of the db: flyctl status -a <postgres-app>;
  • the state of the health checks: flyctl checks list -a <postgres-app>; and
  • the exact details of any instances that are running: flyctl status instance <id> -a <postgres-app>?
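To gather all of that in one go, the three commands above could be wrapped in a small function. This is only a sketch: the function name `collect_pg_diagnostics` and the `FLYCTL` override variable are assumptions made for illustration, and the flyctl subcommands are exactly the ones listed above, so they may differ in other flyctl versions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run the diagnostic commands listed above for one app.
# Usage: collect_pg_diagnostics <postgres-app> [instance-id ...]
# FLYCTL is an assumed override; set FLYCTL=echo to preview the commands
# without contacting Fly.
collect_pg_diagnostics() {
  local app="$1"
  shift
  local fly="${FLYCTL:-flyctl}"

  echo "== status =="
  "$fly" status -a "$app" 2>&1 || echo "(status command failed)"

  echo "== health checks =="
  "$fly" checks list -a "$app" 2>&1 || echo "(checks command failed)"

  # Any remaining arguments are treated as instance ids.
  local id
  for id in "$@"; do
    echo "== instance $id =="
    "$fly" status instance "$id" -a "$app" 2>&1 || echo "(instance command failed)"
  done
}
```

For example, `collect_pg_diagnostics mywarehouse-db 4d47e1a4 > diagnostics.txt` would write everything to one file that can be pasted into a reply; the instance id here is taken from the log prefix in the question and may not be the right identifier.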