Postgres app looping with error: exit status 1

This week my first db stopped working. When I try to run an “upgrade” or a “restart” on the db, I get this error:

Error no active leader found

Then I tried to create a new db from a snapshot of the first db, but I got this error when I started deploying an image to allow external connections:

2023-02-24T07:21:52.649 app[148ed51b739328] cdg [info] proxy | [WARNING] 054/072152 (2971) : parsing [/fly/haproxy.cfg:38]: Missing LF on last line, file might have been truncated at position 96. This will become a hard error in HAProxy 2.3.

2023-02-24T07:21:52.649 app[148ed51b739328] cdg [info] proxy | [ALERT] 054/072152 (2971) : Error(s) found in configuration file : /fly/haproxy.cfg

2023-02-24T07:21:52.650 app[148ed51b739328] cdg [info] proxy | [ALERT] 054/072152 (2971) : Fatal errors found in configuration.

2023-02-24T07:21:52.651 app[148ed51b739328] cdg [info] proxy | exit status 1

2023-02-24T07:21:52.651 app[148ed51b739328] cdg [info] proxy | restarting in 1s [attempt 379]

2023-02-24T07:21:53.653 app[148ed51b739328] cdg [info] proxy | Running...

2023-02-24T07:21:53.663 app[148ed51b739328] cdg [info] proxy | exit status 1

2023-02-24T07:21:53.663 app[148ed51b739328] cdg [info] proxy | restarting in 1s [attempt 380]

2023-02-24T07:21:54.664 app[148ed51b739328] cdg [info] proxy | Running...

2023-02-24T07:21:54.683 app[148ed51b739328] cdg [info] proxy | [NOTICE] 054/072154 (2991) : haproxy version is 2.2.9-2+deb11u3

2023-02-24T07:21:54.683 app[148ed51b739328] cdg [info] proxy | [NOTICE] 054/072154 (2991) : path to executable is /usr/sbin/haproxy

2023-02-24T07:21:54.683 app[148ed51b739328] cdg [info] proxy | [ALERT] 054/072154 (2991) : parsing [/fly/haproxy.cfg:37] : Can't create DNS resolution for server '(null)'

Does anyone have any information that could help solve the problem with either the first or the second db?
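In a restart loop like the one above, the few lines that matter (the [ALERT] messages and the current attempt count) are easy to lose in the noise. Here is a minimal sketch for pulling them out, assuming the logs were first saved to a local file. The function name summarize_proxy_loop is hypothetical, and the patterns rely only on the line formats visible in the excerpt above:

```shell
#!/bin/sh
# summarize_proxy_loop FILE
# Prints the highest restart attempt number and the distinct [ALERT] messages
# found in a saved log file. Assumes the line format shown in the logs above.
summarize_proxy_loop() {
    logfile="$1"

    # Highest N among lines such as "restarting in 1s [attempt 380]".
    attempts=$(sed -n 's/.*restarting in [0-9]*s \[attempt \([0-9][0-9]*\)\].*/\1/p' "$logfile" \
        | sort -n | tail -n 1)
    echo "last restart attempt: ${attempts:-none}"

    # Distinct alert messages, with the timestamp and pid prefix stripped.
    echo "alerts:"
    sed -n 's/.*\[ALERT\] [0-9/]* ([0-9]*) : \(.*\)/\1/p' "$logfile" | sort -u
}
```

Calling summarize_proxy_loop on the saved file gives a short summary that is easier to paste into a support thread than several hundred repeated lines.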

Is it possible to check whether restoring to the newer flex Postgres Fly app works?

# ref: fly.io/docs/postgres/managing/backup-and-restore
flyctl pg create --snapshot-id <sid> -a <app-name> --flex

Unfortunately, this will not work. For now, you will need to perform a pg_dump and restore to move to flex. We are actively working to make this transition process easier for users.
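A dump and restore could look roughly like the following. This is a minimal sketch, not an official migration procedure. The ports, database name, and dump filename are hypothetical placeholders, and the postgres user is an assumption. It assumes pg_dump and pg_restore are installed locally, that each database is reachable on localhost through a tunnel such as fly proxy, and that the target database already exists. By default the function only prints the commands so they can be reviewed first:

```shell
#!/bin/sh
# Sketch of a dump/restore between two Postgres apps.
# All values below are placeholders (assumptions), not names from this thread.
OLD_PORT=15432        # local port tunnelled to the old db, e.g. fly proxy 15432:5432 -a <old-app>
NEW_PORT=15433        # local port tunnelled to the new flex db
DB_NAME="mydb"        # hypothetical database name
DUMP_FILE="mydb.dump" # hypothetical dump filename

# run CMD...: print the command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

migrate_db() {
    # Custom-format dump (-Fc) so that pg_restore can read it.
    run pg_dump -Fc -h localhost -p "$OLD_PORT" -U postgres -d "$DB_NAME" -f "$DUMP_FILE"
    # --no-owner skips restoring object ownership, which helps when the
    # original roles do not exist on the target cluster.
    run pg_restore --no-owner -h localhost -p "$NEW_PORT" -U postgres -d "$DB_NAME" "$DUMP_FILE"
}
```

Once both tunnels are open and the printed commands look right, set DRY_RUN=0 and call migrate_db again to actually run them.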

  • flyctl status --all -a mydb-app
ID              STATE   ROLE    REGION  HEALTH CHECKS   IMAGE                           CREATED                 UPDATED              
9080291c6d3dd8  started error   cdg     3 total         flyio/postgres:14.6 (v0.0.34)   2023-01-08T12:44:20Z    2023-02-19T12:45:56Z
  • The pg create command with -a doesn’t work (unknown shorthand flag: 'a' in -a).
    But when I first ran pg create on my snapshot with fly postgres create --snapshot-id <snapshot-id>,
    it was successful:
... Choose app name, organization...
Waiting for 3d8d463c765389 to become healthy (started, 3/3)

flyctl scale count 1 -a app_name

Error it looks like your app is running on v2 of our platform, and does not support this legacy command

@Musto

I took a look at your app, and it looks like your standby does not have a volume attached to it. I would remove that machine using:

fly machines stop <machine-id> --app <app-name>

fly machines remove <machine-id> --app <app-name>

Then create a new one using:

fly machines clone <primarys-machine-id>

Let me know how that goes.

Hey, thanks for your help!

fly machines stop <908..(machine-id)> -a <app-name>
//Success
fly machines remove <908..(machine-id)> -a <app-name>
//Success
fly machines clone <908..(machine-id)>
//Could not find app
fly machines clone <908..(machine-id)> -a <app-name>
//Success
/*
Cloning machine <908..(machine-id)> into region cdg
Volume 'pg_data' will start empty
Provisioning a new machine with image registry-1.docker.io/flyio...
Machine <148(new-machine-id)> has been created
Waiting for start and to become healty... (1/3)
Machine has been successfully cloned!
*/

On my app (after restart):

strapi start

2023-02-25T19:54:09.626 app[84..] cdg [info] [2023-02-25 19:54:09.624] debug: ⛔️ Server wasn't able to start properly.

2023-02-25T19:54:09.627 app[84..] cdg [info] [2023-02-25 19:54:09.626] error: Connection terminated unexpectedly

2023-02-25T19:54:09.627 app[84..] cdg [info] Error: Connection terminated unexpectedly

In my db logs:

exporter | INFO[0843] Established new database connection to "fdaa:...". source="postgres_exporter.go:970"

2023-02-25T19:54:57.048 app[148..] cdg [info] exporter | ERRO[0844] Error opening connection to database (postgresql://flypgadmin:PASS@[fdaa:...]:5433/postgres?sslmode=disable): dial tcp [fdaa:...]:5433: connect: connection refused source="postgres_exporter.go:1658"

2023-02-25T19:54:58.151 app[148..] cdg [info] sentinel | 2023-02-25T19:54:58.151Z WARN cmd/sentinel.go:276 no keeper info available {"db": "eb...", "keeper": "5ad..."}

2023-02-25T19:54:58.155 app[148..] cdg [info] sentinel | 2023-02-25T19:54:58.155Z ERROR cmd/sentinel.go:1018 no eligible masters

Strange… Looks like the volume didn’t get created.

I went ahead and created a new volume for you:

fly volumes create pg_data --size 1 --region cdg

You can view your volumes by running:

fly volumes list

Then I performed the clone command as follows:

fly machines clone <primary-machine-id> --attach-volume <new-volume-id>

That seemed to do the trick.

Be careful: I was modifying the first db (portfo…-api-m…-db), not the new one (m…-db).

Is it easier to point my app at the new db, or to fix the problem with the first db?

You should be able to fix your api db by doing something like:

fly volumes list --app <app-name>

Take note of the volume that doesn’t have an attached VM. This holds your primary’s data.

fly machines clone <existing-machine-id> --attach-volume <unallocated-volume-id> --app <app-name>
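The clone step above can be wrapped in a small helper so that the command is reviewed before it runs. This is only a sketch: reattach_volume is a hypothetical function name, the IDs are whatever your own app reports, and it uses only the flags shown above. It prints the command by default and runs it only when DRY_RUN=0:

```shell
#!/bin/sh
# reattach_volume APP MACHINE_ID VOLUME_ID
# Clones an existing machine and attaches the unallocated volume to the clone.
# Hypothetical helper; prints the command unless DRY_RUN=0.
reattach_volume() {
    if [ "$#" -ne 3 ]; then
        echo "usage: reattach_volume <app-name> <existing-machine-id> <unallocated-volume-id>" >&2
        return 2
    fi
    app="$1"; machine="$2"; volume="$3"

    # Build the exact command from the steps above, then print it.
    set -- fly machines clone "$machine" --attach-volume "$volume" --app "$app"
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}
```

Printing first is a deliberate choice here: attaching the wrong volume to a database machine is hard to undo, so it is worth one extra look at the IDs.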

Out of curiosity, which version of flyctl are you running?

If you don’t have any data in these dbs yet, I would update your flyctl version and re-provision them. You’ll then be on the latest implementation of our Postgres offering.

Flyctl version: v0.0.464

I cloned my existing machine with the unallocated volume attached:

Machine has been successfully cloned!

And it works!

Thank you very much!

Do you know what the problem was? It worked perfectly fine for weeks, and then this problem suddenly appeared. Knowing the cause would help many people, I think.

My db has a lot of essential data, but thank you for the second option.

Hard to say. If you ever experience anything weird though, the best first step is to make sure you’re running the latest version of flyctl. If the weirdness still exists after the upgrade, then it’s at least easier for us to troubleshoot.

Happy to hear things are back in order though!
