2 database instances but none showing in fly status

philipbrown · March 21, 2022, 10:33am

Hi,

My app seems to have 2 database instances running, but neither is showing under fly status.

I think both are fighting to be master.

Could someone take a look?

Thank you!

philipbrown · March 21, 2022, 10:38am

2022-03-21T10:37:51.573 app[caaa8a5e] lhr [info] keeper | 2022-03-21T10:37:51.573Z INFO cmd/keeper.go:1542 already master
2022-03-21T10:37:51.603 app[caaa8a5e] lhr [info] keeper | 2022-03-21T10:37:51.603Z INFO cmd/keeper.go:1675 postgres parameters not changed
2022-03-21T10:37:51.604 app[caaa8a5e] lhr [info] keeper | 2022-03-21T10:37:51.603Z INFO cmd/keeper.go:1702 postgres hba entries not changed
2022-03-21T10:37:53.672 app[14acb1db] lhr [info] keeper | 2022-03-21T10:37:53.671Z INFO cmd/keeper.go:1556 our db requested role is standby {"followedDB": "dae43dfa"}
2022-03-21T10:37:53.672 app[14acb1db] lhr [info] keeper | 2022-03-21T10:37:53.671Z INFO cmd/keeper.go:1575 already standby
2022-03-21T10:37:53.723 app[14acb1db] lhr [info] keeper | 2022-03-21T10:37:53.723Z INFO cmd/keeper.go:1675 postgres parameters not changed
2022-03-21T10:37:53.723 app[14acb1db] lhr [info] keeper | 2022-03-21T10:37:53.723Z INFO cmd/keeper.go:1702 postgres hba entries not changed
2022-03-21T10:37:56.686 app[caaa8a5e] lhr [info] keeper | 2022-03-21T10:37:56.686Z INFO cmd/keeper.go:1675 postgres parameters not changed
2022-03-21T10:37:56.687 app[caaa8a5e] lhr [info] keeper | 2022-03-21T10:37:56.686Z INFO cmd/keeper.go:1702 postgres hba entries not changed
2022-03-21T10:37:58.881 app[14acb1db] lhr [info] keeper | 2022-03-21T10:37:58.880Z INFO cmd/keeper.go:1575 already standby
2022-03-21T10:37:58.948 app[14acb1db] lhr [info] keeper | 2022-03-21T10:37:58.948Z INFO cmd/keeper.go:1675 postgres parameters not changed

philipbrown · March 21, 2022, 10:47am

I’ve also somehow got 2 volumes running. How do I know which one I can safely delete?

philipbrown · March 21, 2022, 10:53am

Hmm, perhaps this is right? Sentry started freaking out and none of this setup makes sense to me.

Could someone from Fly take a look and let me know?

catflydotio · March 21, 2022, 12:54pm

Hi @philipbrown, until someone more expert arrives, can I ask: does your database app show up with fly status -a <your-pg-app-name>?

Each VM needs its own volume, so it makes sense to have two.

philipbrown · March 21, 2022, 12:57pm

Hey Chris!

The App table has details, but there are no instances listed.

Whereas if I do fly status -a <app-name> there is a list of instances.

Is that correct?

catflydotio · March 21, 2022, 2:48pm

Ah! I think I get you! Maybe!

I think fly status actually has stopped showing instances! I upgraded my flyctl today and went from seeing them to not. I don’t think this is intentional; I’ll check with people.

You should be able to see them in the web UI at https://fly.io/apps/your-app though.

(The noisy logs I think are OK, though; Stolon is always checking if things are cool, in case it has to do a leadership change, and talking about it in the logs.)
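For anyone who wants to sanity-check this from a log dump rather than by eye, here is a minimal sketch in Python. It assumes the log format shown earlier in this thread (an app[<instance-id>] prefix and keeper lines ending in "already master" or "already standby"). The helper name roles_from_logs is made up for illustration and is not part of flyctl or Stolon. It records the most recent role each instance reported, so two instances both claiming master would stand out.

```python
import re

# Matches lines such as:
#   ... app[caaa8a5e] lhr [info] keeper | ... already master
# Assumption: instance IDs are lowercase hex, as in the logs posted above.
ROLE_LINE = re.compile(
    r"app\[(?P<instance>[0-9a-f]+)\].*\balready (?P<role>master|standby)\b"
)


def roles_from_logs(lines):
    """Return {instance_id: last role reported} for the given log lines."""
    roles = {}
    for line in lines:
        match = ROLE_LINE.search(line)
        if match:
            # Later lines overwrite earlier ones, so the latest report wins.
            roles[match.group("instance")] = match.group("role")
    return roles


sample = [
    "2022-03-21T10:37:51.573 app[caaa8a5e] lhr [info] keeper | "
    "2022-03-21T10:37:51.573Z INFO cmd/keeper.go:1542 already master",
    "2022-03-21T10:37:53.672 app[14acb1db] lhr [info] keeper | "
    "2022-03-21T10:37:53.671Z INFO cmd/keeper.go:1575 already standby",
    "2022-03-21T10:37:53.723 app[14acb1db] lhr [info] keeper | "
    "2022-03-21T10:37:53.723Z INFO cmd/keeper.go:1675 postgres parameters not changed",
]

print(roles_from_logs(sample))
# {'caaa8a5e': 'master', '14acb1db': 'standby'}
```

One master and one standby, as in the sample, is the healthy picture; more than one instance mapped to master would be worth a closer look.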

philipbrown · March 21, 2022, 3:11pm

Yeah, I can see them in the web UI.

Ok, that’s good! So everything is set up correctly?

Are you able to check what caused the flood of database errors that triggered my initial panic?

catflydotio · March 21, 2022, 8:38pm

I’m not sure which errors you mean. The logs you posted above (with [info] keeper) look to me like Stolon doing its thing. Do you have other reason to think things aren’t as they should be?

BTW fly status should be showing instances again.

philipbrown · March 22, 2022, 1:43pm

Here are the errors from Sentry:

(Sentry.CrashError ** (exit) exited in: GenServer.call(#PID<0.24656.0>, {:listen, [:leader]}, 5000)
    ** (EXIT) time out)

(DBConnection.ConnectionError tcp connect (top2.nearest.of.my-app-db.internal:5432): non-existing domain - :nxdomain)

kurt · March 22, 2022, 1:48pm

That :nxdomain error means the DNS lookup for that database wasn’t working when it tried to connect. Can you explain the timeline here? Your app was running, using the DB, then you got a flood of sentry errors yesterday?
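If it helps to reproduce the lookup outside the app, here is a rough Python sketch. check_dns is a hypothetical helper written for this thread, not a Fly.io tool. As far as I know, .internal names only resolve from inside the app's private network, so a check like this would need to run on one of the app's own VMs. The resolver argument exists only so the function can be exercised without touching real DNS.

```python
import socket


def check_dns(hostname, port=5432, resolver=socket.getaddrinfo):
    """Try to resolve hostname and report the outcome.

    Returns (True, [addresses]) on success, or (False, error_message) when
    the lookup fails, which is the situation :nxdomain describes.
    """
    try:
        infos = resolver(hostname, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return False, str(exc)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    addresses = sorted({info[4][0] for info in infos})
    return True, addresses
```

Calling check_dns("top2.nearest.of.my-app-db.internal") with the hostname from the Sentry error would show whether the name resolves at that moment. A failure that comes and goes would point at DNS rather than at the database itself.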

philipbrown · March 22, 2022, 3:31pm

Yeah, I re-created the database for the app on the 18th. Yesterday I started inviting users to the app, and around the same time I got the errors through from Sentry and started to panic that I’d set up the database incorrectly.

So the database had been happily running from the 18th to the 21st.
