pg-failover command not found

elliotdickison · November 21, 2023, 9:57pm

We’ve got a pg cluster on flyio/postgres-flex:15.2. Our leader crashed and we’re trying to fail over to a new region, following the instructions here: What is the correct process to change the postgres leader region? - #2 by shaun

We’ve updated the PRIMARY_REGION env variable and redeployed, but when we ssh into a host the pg-failover command does not exist. Any help would be appreciated!

roadmr · November 21, 2023, 10:02pm

Hi Elliot,

With Postgres 15.2 you should use the procedure described here; the one you linked to is for Stolon-based Fly Postgres, which is not what you’re using (and no longer what we provision for new clusters):

Daniel

elliotdickison · November 21, 2023, 10:09pm

This is helpful. Any idea what to do in the scenario where a failover fails?

Performing a failover
Connecting to fdaa:0:22f0:a7b:106:38aa:7298:2... complete
Stopping current leader...  9080e693b0d9e8
Starting new leader
Promoting new leader...  e2865642ae6d78
Connecting to fdaa:0:22f0:a7b:106:38aa:7298:2... complete
NOTICE: promoting standby to primary
DETAIL: promoting server "fdaa:0:22f0:a7b:106:38aa:7298:2" (ID: 1100988338) using pg_promote()
NOTICE: waiting up to 60 seconds (parameter "promote_check_timeout") for promotion to complete
NOTICE: STANDBY PROMOTE successful
DETAIL: server "fdaa:0:22f0:a7b:106:38aa:7298:2" (ID: 1100988338) was successfully promoted to primary
NOTICE: executing STANDBY FOLLOW on 7 of 7 siblings
Waiting 30 seconds for the old leader to stop...
INFO: STANDBY FOLLOW successfully executed on all reachable sibling nodes
Error promoting new leader, restarting existing leader
Waiting for old leader to finish stopping

elliotdickison · November 21, 2023, 10:24pm

Update: Despite the log message "Error promoting new leader, restarting existing leader", the new leader took. The old leader never came back, so I ended up having to force destroy the machine, but all seems stable again. We are very much looking forward to a managed postgres option from Fly.
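
For anyone who hits the same confusing output: below is a minimal sketch of a check that flags this exact situation, where the log reports a successful promotion and the wrapper still prints an error afterwards. It only does plain string matching on the two messages quoted in this thread. The function name `summarize_failover` and the `SAMPLE_LOG` constant are made up for illustration and are not part of any Fly tooling. A flagged result only means the cluster state should be verified by hand; it does not prove which machine is the leader.

```python
# Hypothetical helper, not part of flyctl or postgres-flex.
# It scans failover output for the two messages seen in this thread.

PROMOTE_OK = "STANDBY PROMOTE successful"
WRAPPER_ERROR = "Error promoting new leader"

# Trimmed copy of the output posted above, used as sample input.
SAMPLE_LOG = """\
Performing a failover
Stopping current leader...  9080e693b0d9e8
Promoting new leader...  e2865642ae6d78
NOTICE: promoting standby to primary
NOTICE: STANDBY PROMOTE successful
NOTICE: executing STANDBY FOLLOW on 7 of 7 siblings
INFO: STANDBY FOLLOW successfully executed on all reachable sibling nodes
Error promoting new leader, restarting existing leader
Waiting for old leader to finish stopping
"""


def summarize_failover(log_text: str) -> dict:
    """Report whether the log contains a promotion success, a wrapper error, or both."""
    lines = log_text.splitlines()
    promoted = any(PROMOTE_OK in line for line in lines)
    wrapper_error = any(WRAPPER_ERROR in line for line in lines)
    return {
        "promoted": promoted,
        "wrapper_error": wrapper_error,
        # Both messages together are contradictory, so check the cluster manually.
        "needs_manual_check": promoted and wrapper_error,
    }


if __name__ == "__main__":
    print(summarize_failover(SAMPLE_LOG))
```

If `needs_manual_check` comes back true, treat the error line with suspicion and confirm which machine is actually primary before restarting or destroying anything.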

system · November 28, 2023, 10:24pm

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.
