2 database instances but none showing in fly status

Hi,

My app seems to have 2 database instances running, but neither is showing under fly status.

I think both are fighting to be master.

Could someone take a look?

Thank you!

2022-03-21T10:37:51.573 app[caaa8a5e] lhr [info] keeper | 2022-03-21T10:37:51.573Z INFO cmd/keeper.go:1542 already master
2022-03-21T10:37:51.603 app[caaa8a5e] lhr [info] keeper | 2022-03-21T10:37:51.603Z INFO cmd/keeper.go:1675 postgres parameters not changed
2022-03-21T10:37:51.604 app[caaa8a5e] lhr [info] keeper | 2022-03-21T10:37:51.603Z INFO cmd/keeper.go:1702 postgres hba entries not changed
2022-03-21T10:37:53.672 app[14acb1db] lhr [info] keeper | 2022-03-21T10:37:53.671Z INFO cmd/keeper.go:1556 our db requested role is standby {"followedDB": "dae43dfa"}
2022-03-21T10:37:53.672 app[14acb1db] lhr [info] keeper | 2022-03-21T10:37:53.671Z INFO cmd/keeper.go:1575 already standby
2022-03-21T10:37:53.723 app[14acb1db] lhr [info] keeper | 2022-03-21T10:37:53.723Z INFO cmd/keeper.go:1675 postgres parameters not changed
2022-03-21T10:37:53.723 app[14acb1db] lhr [info] keeper | 2022-03-21T10:37:53.723Z INFO cmd/keeper.go:1702 postgres hba entries not changed
2022-03-21T10:37:56.686 app[caaa8a5e] lhr [info] keeper | 2022-03-21T10:37:56.686Z INFO cmd/keeper.go:1675 postgres parameters not changed
2022-03-21T10:37:56.687 app[caaa8a5e] lhr [info] keeper | 2022-03-21T10:37:56.686Z INFO cmd/keeper.go:1702 postgres hba entries not changed
2022-03-21T10:37:58.881 app[14acb1db] lhr [info] keeper | 2022-03-21T10:37:58.880Z INFO cmd/keeper.go:1575 already standby
2022-03-21T10:37:58.948 app[14acb1db] lhr [info] keeper | 2022-03-21T10:37:58.948Z INFO cmd/keeper.go:1675 postgres parameters not changed

I’ve also somehow got 2 volumes running. How do I know which one I can safely delete?

Hmm, perhaps this is right? Sentry started freaking out and none of this setup makes sense to me :sweat_smile:

Could someone from Fly take a look and let me know? :pray:

Hi @philipbrown, until someone more expert arrives, can I ask: does your database app show up with fly status -a <your-pg-app-name>?

Each VM needs its own volume, so it makes sense to have two.

Hey Chris!

The App table has details, but there are no instances listed.

Whereas if I do fly status -a <app-name> there is a list of instances.

Is that correct?

Ah! I think I get you! Maybe!

I think fly status actually has stopped showing instances! I upgraded my flyctl today and went from seeing them to not. I don’t think this is intentional; I’ll check with people.

You should be able to see them in the web UI at https://fly.io/apps/your-app though.

(The noisy logs I think are OK, though; Stolon is always checking if things are cool, in case it has to do a leadership change, and talking about it in the logs.)
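If you want to double-check that reading, here is a minimal Python sketch that tallies which role each instance reports in keeper log lines like the ones you posted. It assumes the line format shown in your excerpt (an `app[<instance>]` tag plus an "already master" or "already standby" message); the function name `roles_by_instance` and the regex are my own illustration, not part of Stolon or flyctl.

```python
import re
from collections import defaultdict

# Assumed line shape, based only on the log excerpt in this thread:
#   ... app[<instance-id>] ... keeper | ... already master
#   ... app[<instance-id>] ... keeper | ... already standby
ROLE_LINE = re.compile(
    r"app\[(?P<instance>[0-9a-f]+)\].*\balready (?P<role>master|standby)\b"
)


def roles_by_instance(log_text):
    """Return {instance_id: set of roles that instance reported}."""
    roles = defaultdict(set)
    for line in log_text.splitlines():
        match = ROLE_LINE.search(line)
        if match:
            roles[match.group("instance")].add(match.group("role"))
    return dict(roles)


# A few lines taken from the log excerpt earlier in this thread.
SAMPLE = """\
2022-03-21T10:37:51.573 app[caaa8a5e] lhr [info] keeper | 2022-03-21T10:37:51.573Z INFO cmd/keeper.go:1542 already master
2022-03-21T10:37:53.672 app[14acb1db] lhr [info] keeper | 2022-03-21T10:37:53.671Z INFO cmd/keeper.go:1575 already standby
2022-03-21T10:37:58.881 app[14acb1db] lhr [info] keeper | 2022-03-21T10:37:58.880Z INFO cmd/keeper.go:1575 already standby
"""

if __name__ == "__main__":
    for instance, roles in roles_by_instance(SAMPLE).items():
        print(instance, sorted(roles))
```

If one instance only ever reports master and the other only ever reports standby, as in your excerpt, they are not fighting over leadership. An instance showing both roles over a short window would be the thing to look into.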

Yeah, I can see them in the web UI.

Ok, that’s good! So everything is set up correctly?

Are you able to check what caused the flood of database errors that set off my initial panic? :sweat_smile:

I’m not sure which errors you mean. The logs you posted above (with [info] keeper) look to me like Stolon doing its thing. Do you have other reason to think things aren’t as they should be?

BTW fly status should be showing instances again.

Here are the errors from Sentry:

(Sentry.CrashError ** (exit) exited in: GenServer.call(#PID<0.24656.0>, {:listen, [:leader]}, 5000)
    ** (EXIT) time out)
(DBConnection.ConnectionError tcp connect (top2.nearest.of.my-app-db.internal:5432): non-existing domain - :nxdomain)

That :nxdomain error means the DNS lookup for that database wasn’t working when it tried to connect. Can you explain the timeline here? Your app was running, using the DB, then you got a flood of Sentry errors yesterday?
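If it happens again, one quick check is whether the hostname resolves at all from inside the app's VM. Here is a minimal Python sketch using only the standard library; the hostname is the one from your Sentry error, the `resolve` helper is my own illustration, and I'm assuming it runs somewhere that can see the app's private network, since `.internal` names should not resolve from outside it.

```python
import socket


def resolve(hostname, port=5432):
    """Return the sorted unique IP addresses for hostname, or [] if the lookup fails."""
    try:
        infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # The resolver could not find the name; this is the same class of
        # failure the driver reported as :nxdomain.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # Hostname copied from the Sentry error above.
    host = "top2.nearest.of.my-app-db.internal"
    addresses = resolve(host)
    if addresses:
        print(f"{host} resolves to: {', '.join(addresses)}")
    else:
        print(f"{host} did not resolve")
```

An empty result while the errors are occurring would point at DNS rather than at Postgres itself.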

Yeah, I re-created the database for the app on the 18th. Yesterday I started inviting users to the app, and around the same time I got the errors through from Sentry and started to panic that I’d set up the database incorrectly.

So the database had been happily running from the 18th to the 21st.