Hi all, I've had a problem since this morning. My Postgres database's primary machine has been in read-only mode since this morning, and my client is calling me. I haven't made any changes to the database in two months. The primary is unstable; the replica serves the data, but you can't modify, create, or delete anything on it, which is expected since it's read-only.
The primary machine shows 2/3 health checks passing the whole time, and the replica shows 3/3. The machines are located in the cdg region (Paris, France). I couldn't take any action.
I tried to promote my replica to primary, but it was impossible. I increased the resources and waited about an hour. Nothing, and my client is losing time and money…
Hi… Sorry to hear you're having trouble with this… Most of us here in the community forum cannot poke around in your app settings, logs, etc.; we can only go by what you yourself post in the thread.
It would help to know the output of the following commands, for example: fly status -a db-app-name, fly m list -a db-app-name, fly checks list -a db-app-name.
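For instance, pasted as a block (with db-app-name replaced by the actual name of your Postgres app):

fly status -a db-app-name
fly m list -a db-app-name
fly checks list -a db-app-name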
(It sounds in particular like you might have one of the “doubly deprecated” Stolon-based clusters, since the PG Flex ones should have 3 or more Machines in the cluster.)
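One quick way to check which flavor you have, if I'm not mistaken, is to look at the image the Machines are running: the older Stolon-based clusters use the flyio/postgres image, while PG Flex clusters use flyio/postgres-flex. Again, db-app-name is just a placeholder for your actual Postgres app name:

fly image show -a db-app-name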
Hello. Your case caught my attention; I looked at your database, and I think the primary is operational now.
If you have customers and a business, my recommendation is to switch to Managed Postgres.
That said, this is what I did to recover yours.
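(For anyone reading along later: the session below is from a shell inside the primary Machine. Assuming db-app-name stands in for the Postgres app name, you can get there with something like the following; the -s flag lets you pick which Machine to connect to.)

fly ssh console -a db-app-name -s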
root@683d527add7458:/# su - postgres
postgres@683d527add7458:~$ repmgr cluster show
WARNING: node "fdaa:1:c9f6:a7b:1bf:47dd:bf6e:2" not found in "pg_stat_replication"
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
------------+---------------------------------+---------+----------------------+-----------------------------------+----------+----------+----------+--------------------------------------------------------------------------------------------
520037516 | fdaa:1:c9f6:a7b:1bf:47dd:bf6e:2 | standby | running | ! fdaa:1:c9f6:a7b:1be:f615:226a:2 | cdg | 100 | 1 | host=fdaa:1:c9f6:a7b:1bf:47dd:bf6e:2 port=5433 user=repmgr dbname=repmgr connect_timeout=5
1217570335 | fdaa:1:c9f6:a7b:1be:f615:226a:2 | primary | ! running as standby | | mad | 100 | 1 | host=fdaa:1:c9f6:a7b:1be:f615:226a:2 port=5433 user=repmgr dbname=repmgr connect_timeout=5
WARNING: following issues were detected
- node "fdaa:1:c9f6:a7b:1bf:47dd:bf6e:2" (ID: 520037516) is not attached to its upstream node "fdaa:1:c9f6:a7b:1be:f615:226a:2" (ID: 1217570335)
- node "fdaa:1:c9f6:a7b:1be:f615:226a:2" (ID: 1217570335) is registered as primary but running as standby
Notice the “primary running as standby”; that's the signal to promote the node back to primary.
postgres@683d527add7458:~$ repmgr standby promote
NOTICE: promoting standby to primary
DETAIL: promoting server "fdaa:1:c9f6:a7b:1be:f615:226a:2" (ID: 1217570335) using pg_promote()
NOTICE: waiting up to 60 seconds (parameter "promote_check_timeout") for promotion to complete
NOTICE: STANDBY PROMOTE successful
DETAIL: server "fdaa:1:c9f6:a7b:1be:f615:226a:2" (ID: 1217570335) was successfully promoted to primary
Now run repmgr cluster show again to be sure it is working.
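If everything is back in order, it should show one node as primary | running and the other as standby | running, with no warnings. You can also double-check from inside the Machine that the primary accepts writes; pg_is_in_recovery() should return f on a writable primary (and t on the standby):

psql -c "SELECT pg_is_in_recovery();"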