Postgres troubles

quisprof · June 7, 2024, 11:53am

Hi, I have a problem with my Fly Postgres instance.

500 Internal Server Error failed to connect to local node: failed to connect to host user=repmgr database=repmgr`: server error (FATAL: the database system is in recovery mode (SQLSTATE 57P03)

shaun · June 7, 2024, 2:46pm

Hey there, could you provide more information?

fly status --app <pg-app-name>

quisprof · June 7, 2024, 8:19pm

postgres status

john-fly · June 8, 2024, 12:10am

Hi Quis,

Looking at your app from the backend, it looks like you originally had a cluster of three nodes, right? And then you destroyed two?

It looks like however you did it, the cluster leader did not realize that it now forms a cluster of one. Try adding a new Machine with fly m clone to restore your cluster to health.

quisprof · June 8, 2024, 10:53am

We did delete two instances of the database cluster, but everything was fine for about a week.

fly m clone doesn’t help, and neither does attaching a volume.

john-fly · June 8, 2024, 6:46pm

It’s not immediately clear to me what’s going on but I am not a Fly PG expert. My colleague Shaun above is, but it’s the weekend so I’m not sure when he might get back to you. If you want to get back in service ASAP, I would recommend the following.

Create a new cluster from the existing one.

fly pg create --initial-cluster-size 1 --fork-from quispostgres -n <NEW_PG_APP_NAME>

This will create a new cluster with the data from your current cluster. If by chance that new cluster doesn’t come up, then there’s some sort of data corruption issue, so you should restore from a snapshot made prior to this occurring:

fly pg create --initial-cluster-size 1 --snapshot-id <SNAPSHOT_ID> -n <NEW_PG_APP_NAME>

Then, for each Fly App that uses Fly PG on the backend, do the following (note that “DATABASE_URL” is literal; everything else in caps and angle brackets should be replaced):

# Remove the old database config from app
fly secrets unset -a <YOUR_FRONTEND_APP> --stage DATABASE_URL
# Add the config for the new database
fly pg attach -a <YOUR_FRONTEND_APP> --database-user <NEW_DB_USER_NAME> --database-name <OLD_DB_NAME> <NEW_PG_APP_NAME>
fly secrets deploy -a <YOUR_FRONTEND_APP>
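
If several apps need to be switched over, the three steps above could be wrapped in a small loop. This is only a sketch: the placeholder values, the `run` helper, and the `reattach_app` function are hypothetical names introduced for illustration, and it uses only the fly commands already shown above. By default it prints the commands instead of running them, so you can review them first.

```shell
#!/bin/sh
# Sketch only: placeholder values below are hypothetical, replace them with your own.
NEW_PG_APP="new-pg-app"      # stands in for <NEW_PG_APP_NAME>
DB_USER="new_db_user"        # stands in for <NEW_DB_USER_NAME>
DB_NAME="old_db_name"        # stands in for <OLD_DB_NAME>

# Dry run is the default: commands are printed, not executed.
# Set DRY_RUN=0 in the environment to actually run them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Apply the three steps from the post above to one app.
reattach_app() {
  app="$1"
  # Remove the old database config from the app (staged, not yet deployed)
  run fly secrets unset -a "$app" --stage DATABASE_URL
  # Add the config for the new database
  run fly pg attach -a "$app" --database-user "$DB_USER" --database-name "$DB_NAME" "$NEW_PG_APP"
  # Deploy the staged secret changes
  run fly secrets deploy -a "$app"
}

# Each script argument is treated as one app name.
for app in "$@"; do
  reattach_app "$app"
done
```

Saved as, say, `reattach.sh` (a made-up filename), `sh reattach.sh app-one app-two` would print the commands for both apps, and `DRY_RUN=0 sh reattach.sh app-one app-two` would run them.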

quisprof · June 9, 2024, 1:25pm

Yes, I have already done this, but creating a new database cluster is only possible with a snapshot from 2 days ago. The same error occurs with a more recent snapshot.

At the moment we have lost 2 days of data.
If possible, we need to know at least the reason for this failure, so we can avoid it in the future.

frsatneedle · June 10, 2024, 3:45am

This is happening to me too.

quisprof · June 10, 2024, 1:06pm

Hi, please check my case.

shaun · June 10, 2024, 2:54pm

@quisprof @frsatneedle Mind sending me the name of your PG apps?

quisprof · June 10, 2024, 6:19pm

Problems started in the “quispostgres” application

shaun · June 11, 2024, 4:09pm

I took a look at your setup and it appears your instance is in recovery mode. This state is reserved for replicas that fall out of sync with the Primary. I also noticed you have quite a few lingering Volumes tied to this App. Did you accidentally remove your Primary instance and/or attach the wrong Volume to your Machine?

quisprof · June 11, 2024, 4:21pm

Initially it was a cluster of 3 instances. But in order to optimize resources, I left only one machine and everything was fine for about a week, and then suddenly it wasn’t.

system · June 18, 2024, 4:21pm

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.
