Barman Failing Checks

A machine in our Postgres cluster went down, and the cluster automatically failed over to a new primary node. Since then, health checks have been failing for our barman machine, and nothing I have tried gets it working properly again.

The connection health check is failing: 500 Internal Server Error [✗] connection: failed running `barman check pg` (694ns)

The output of `barman check pg` is:

	PostgreSQL: OK
	superuser or standard user with backup privileges: OK
	PostgreSQL streaming: OK
	wal_level: OK
	replication slot: FAILED (slot 'barman' not initialised: is 'receive-wal' running?)
	directories: OK
	retention policy settings: OK
	backup maximum age: OK (no last_backup_maximum_age provided)
	backup minimum size: OK (745.0 MiB)
	wal maximum age: OK (no last_wal_maximum_age provided)
	wal size: OK (0 B)
	compression settings: OK
	failed backups: OK (there are 0 failed backups)
	minimum redundancy requirements: OK (have 1 backups, expected at least 0)
	pg_basebackup: OK
	pg_basebackup compatible: OK
	pg_basebackup supports tablespaces mapping: OK
	systemid coherence: OK
	pg_receivexlog: OK
	pg_receivexlog compatible: OK
	receive-wal running: FAILED (See the Barman log file for more details)
	archiver errors: OK
Error: ssh shell: Process exited with status 1
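
To separate the failing checks from the passing ones, the output above can be filtered with a few lines of Python. This is only a sketch: the function name `failed_checks` and the regular expression are my own, and it assumes the `name: STATUS (detail)` layout shown in this post rather than any documented Barman output format.

```python
import re

# Matches lines such as:
#   "replication slot: FAILED (slot 'barman' not initialised: ...)"
# The check name is everything before the first colon; the detail in
# parentheses is optional.
CHECK_LINE = re.compile(
    r"^\s*(?P<name>[^:]+):\s+(?P<status>OK|FAILED)(?:\s+\((?P<detail>.*)\))?\s*$"
)


def failed_checks(check_output):
    """Return {check name: detail} for every FAILED line in the given text."""
    failures = {}
    for line in check_output.splitlines():
        match = CHECK_LINE.match(line)
        if match and match.group("status") == "FAILED":
            failures[match.group("name").strip()] = match.group("detail") or ""
    return failures


# A few lines copied from the output in this post.
sample = """\
	PostgreSQL: OK
	replication slot: FAILED (slot 'barman' not initialised: is 'receive-wal' running?)
	directories: OK
	receive-wal running: FAILED (See the Barman log file for more details)
	archiver errors: OK
Error: ssh shell: Process exited with status 1
"""

for name, detail in failed_checks(sample).items():
    print(f"{name}: {detail}")
```

Only two checks fail here, `replication slot` and `receive-wal running`, and both concern WAL streaming. One plausible reading, which is an assumption and not something confirmed in this thread, is that the `barman` replication slot existed only on the old primary; Barman's documented command for creating the slot is `barman receive-wal --create-slot pg`.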

The following shows up repeatedly in the logs on the barman machine:


2024-05-18T00:00:02.171 app[2866e1dc723718] dfw [info] barman | 2024-05-18 00:00:02,169 [691] barman.postgres WARNING: Error retrieving PostgreSQL status: connection to server at "ts-postgres.internal" (fdaa:9:af2:a7b:253:2488:9ae9:2), port 5432 failed: Connection refused

2024-05-18T00:00:02.276 app[2866e1dc723718] dfw [info] barman | 2024-05-18 00:00:02,276 [691] barman.server ERROR: Check 'no access to monitoring functions' failed for server 'pg'

2024-05-18T00:00:02.276 app[2866e1dc723718] dfw [info] barman | 2024-05-18 00:00:02,276 [691] barman.server ERROR: Check 'PostgreSQL streaming (WAL streaming)' failed for server 'pg'

2024-05-18T00:00:02.278 app[2866e1dc723718] dfw [info] barman | 2024-05-18 00:00:02,277 [691] barman.server ERROR: Impossible to start WAL streaming. Check the log for more details, or run 'barman check pg'

2024-05-18T00:00:15.475 app[2866e1dc723718] dfw [info] barman | 2024-05-18 00:00:15,474 [694] barman.config WARNING: Discarding configuration file: .barman.auto.conf (not a file)

2024-05-18T00:00:15.658 app[2866e1dc723718] dfw [info] barman | 2024-05-18 00:00:15,658 [694] barman.server ERROR: Check 'replication slot' failed for server 'pg'

2024-05-18T00:00:15.667 app[2866e1dc723718] dfw [info] barman | 2024-05-18 00:00:15,666 [694] barman.server ERROR: Check 'receive-wal running' failed for server 'pg'

2024-05-18T00:00:30.823 app[2866e1dc723718] dfw [info] barman | 2024-05-18 00:00:30,823 [697] barman.config WARNING: Discarding configuration file: .barman.auto.conf (not a file)

2024-05-18T00:00:31.006 app[2866e1dc723718] dfw [info] barman | 2024-05-18 00:00:31,006 [697] barman.server ERROR: Check 'replication slot' failed for server 'pg'

2024-05-18T00:00:31.014 app[2866e1dc723718] dfw [info] barman | 2024-05-18 00:00:31,014 [697] barman.server ERROR: Check 'receive-wal running' failed for server 'pg'

2024-05-18T00:00:46.178 app[2866e1dc723718] dfw [info] barman | 2024-05-18 00:00:46,178 [700] barman.config WARNING: Discarding configuration file: .barman.auto.conf (not a file)

2024-05-18T00:00:46.363 app[2866e1dc723718] dfw [info] barman | 2024-05-18 00:00:46,363 [700] barman.server ERROR: Check 'replication slot' failed for server 'pg'

2024-05-18T00:00:46.371 app[2866e1dc723718] dfw [info] barman | 2024-05-18 00:00:46,371 [700] barman.server ERROR: Check 'receive-wal running' failed for server 'pg'

2024-05-18T00:01:01.520 app[2866e1dc723718] dfw [info] barman | 2024-05-18 00:01:01,519 [703] barman.config WARNING: Discarding configuration file: .barman.auto.conf (not a file)

2024-05-18T00:01:01.694 app[2866e1dc723718] dfw [info] barman | 2024-05-18 00:01:01,694 [703] barman.server ERROR: Check 'replication slot' failed for server 'pg'

2024-05-18T00:01:01.703 app[2866e1dc723718] dfw [info] barman | 2024-05-18 00:01:01,703 [703] barman.server ERROR: Check 'receive-wal running' failed for server 'pg'

2024-05-18T00:01:02.208 app[2866e1dc723718] dfw [info] barman | 2024-05-18 00:01:02,208 [710] barman.config WARNING: Discarding configuration file: .barman.auto.conf (not a file)

2024-05-18T00:01:02.256 app[2866e1dc723718] dfw [info] barman | 2024-05-18 00:01:02,255 [710] barman.utils INFO: Cleaning up lockfiles directory.

2024-05-18T00:01:02.518 app[2866e1dc723718] dfw [info] barman | 2024-05-18 00:01:02,517 [712] barman.config WARNING: Discarding configuration file: .barman.auto.conf (not a file)

2024-05-18T00:01:02.521 app[2866e1dc723718] dfw [info] barman | 2024-05-18 00:01:02,521 [713] barman.config WARNING: Discarding configuration file: .barman.auto.conf (not a file)

2024-05-18T00:01:02.541 app[2866e1dc723718] dfw [info] barman | 2024-05-18 00:01:02,541 [712] barman.wal_archiver INFO: No xlog segments found from streaming for pg.
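
Because the same few messages repeat roughly every 15 seconds, grouping them makes the one-off entries easier to spot. The sketch below counts distinct WARNING and ERROR messages; the `summarize` helper and its regular expression are hypothetical and assume the `[pid] barman.module LEVEL: message` layout seen in these lines.

```python
import re
from collections import Counter

# Matches the Barman part of each line, e.g.
#   "[694] barman.server ERROR: Check 'replication slot' failed for server 'pg'"
# "[info]" and "app[2866e1dc723718]" are skipped because \d+ needs digits only.
LOG_LINE = re.compile(
    r"\[\d+\]\s+(?P<logger>barman\.\w+)\s+(?P<level>[A-Z]+):\s+(?P<message>.*)$"
)


def summarize(log_text, levels=("WARNING", "ERROR")):
    """Count distinct (level, message) pairs at the given log levels."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LOG_LINE.search(line)
        if match and match.group("level") in levels:
            counts[(match.group("level"), match.group("message").strip())] += 1
    return counts


# Lines copied from the log excerpt in this post.
sample = """\
2024-05-18T00:00:02.171 app[2866e1dc723718] dfw [info] barman | 2024-05-18 00:00:02,169 [691] barman.postgres WARNING: Error retrieving PostgreSQL status: connection to server at "ts-postgres.internal" (fdaa:9:af2:a7b:253:2488:9ae9:2), port 5432 failed: Connection refused
2024-05-18T00:00:15.658 app[2866e1dc723718] dfw [info] barman | 2024-05-18 00:00:15,658 [694] barman.server ERROR: Check 'replication slot' failed for server 'pg'
2024-05-18T00:00:31.006 app[2866e1dc723718] dfw [info] barman | 2024-05-18 00:00:31,006 [697] barman.server ERROR: Check 'replication slot' failed for server 'pg'
2024-05-18T00:01:02.256 app[2866e1dc723718] dfw [info] barman | 2024-05-18 00:01:02,255 [710] barman.utils INFO: Cleaning up lockfiles directory.
"""

for (level, message), count in summarize(sample).most_common():
    print(f"{count}x {level}: {message}")
```

Run over the full excerpt, this kind of grouping separates the recurring `replication slot` and `receive-wal running` failures from the single `Connection refused` warning for `ts-postgres.internal` at 00:00:02.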

Any ideas on how to fix barman?

Solved this issue just by destroying the barman machine and recreating it.
