The Postgres app had been running fine for months, and all of a sudden has stopped and now failing to start… I have tried restarting, upgrading image, scaling, adding regions, all are producing the same result:
2022-10-10T02:34:52.593 app[99f02c4f] sjc [info] panic: error checking stolon status: cannot get cluster data: invalid character ‘\x02’ in string literal
Full log:
2022-10-10T02:34:34.201 runner[99f02c4f] sjc [info] Starting instance
2022-10-10T02:34:35.294 runner[99f02c4f] sjc [info] Configuring virtual machine
2022-10-10T02:34:35.295 runner[99f02c4f] sjc [info] Pulling container image
2022-10-10T02:34:41.466 runner[99f02c4f] sjc [info] Unpacking image
2022-10-10T02:34:49.302 runner[99f02c4f] sjc [info] Preparing kernel init
2022-10-10T02:34:49.475 runner[99f02c4f] sjc [info] Setting up volume 'pg_data'
2022-10-10T02:34:49.614 runner[99f02c4f] sjc [info] Configuring firecracker
2022-10-10T02:34:49.695 runner[99f02c4f] sjc [info] Starting virtual machine
2022-10-10T02:34:49.933 app[99f02c4f] sjc [info] Starting init (commit: bc249e4)...
2022-10-10T02:34:49.953 app[99f02c4f] sjc [info] Mounting /dev/vdc at /data w/ uid: 0, gid: 0 and chmod 0755
2022-10-10T02:34:49.956 app[99f02c4f] sjc [info] Preparing to run: `docker-entrypoint.sh start` as root
2022-10-10T02:34:49.972 app[99f02c4f] sjc [info] 2022/10/10 02:34:49 listening on [fdaa:0:xxx:a7b:x:2:xx:2]:22 (DNS: [fdaa::3]:53)
2022-10-10T02:34:50.019 app[99f02c4f] sjc [info] cluster spec filename /fly/cluster-spec.json
2022-10-10T02:34:50.019 app[99f02c4f] sjc [info] system memory: 256mb vcpu count: 1
2022-10-10T02:34:50.019 app[99f02c4f] sjc [info] {
2022-10-10T02:34:50.019 app[99f02c4f] sjc [info] "initMode": "new",
2022-10-10T02:34:50.019 app[99f02c4f] sjc [info] "existingConfig": {},
2022-10-10T02:34:50.019 app[99f02c4f] sjc [info] "pgParameters": {
2022-10-10T02:34:50.019 app[99f02c4f] sjc [info] "archive_command": "if [ $ENABLE_WALG ]; then /usr/local/bin/wal-g wal-push \"%p\"; fi",
2022-10-10T02:34:50.019 app[99f02c4f] sjc [info] "archive_mode": "on",
2022-10-10T02:34:50.019 app[99f02c4f] sjc [info] "archive_timeout": "60",
2022-10-10T02:34:50.019 app[99f02c4f] sjc [info] "effective_cache_size": "192MB",
2022-10-10T02:34:50.019 app[99f02c4f] sjc [info] "effective_io_concurrency": "200",
2022-10-10T02:34:50.019 app[99f02c4f] sjc [info] "maintenance_work_mem": "64MB",
2022-10-10T02:34:50.019 app[99f02c4f] sjc [info] "max_connections": "300",
2022-10-10T02:34:50.019 app[99f02c4f] sjc [info] "max_parallel_workers": "8",
2022-10-10T02:34:50.019 app[99f02c4f] sjc [info] "max_parallel_workers_per_gather": "2",
2022-10-10T02:34:50.019 app[99f02c4f] sjc [info] "max_wal_size": "1001MB",
2022-10-10T02:34:50.019 app[99f02c4f] sjc [info] "max_worker_processes": "8",
2022-10-10T02:34:50.019 app[99f02c4f] sjc [info] "min_wal_size": "250MB",
2022-10-10T02:34:50.019 app[99f02c4f] sjc [info] "random_page_cost": "1.1",
2022-10-10T02:34:50.019 app[99f02c4f] sjc [info] "shared_buffers": "64MB",
2022-10-10T02:34:50.019 app[99f02c4f] sjc [info] "wal_compression": "on",
2022-10-10T02:34:50.019 app[99f02c4f] sjc [info] "wal_keep_segments": "0",
2022-10-10T02:34:50.019 app[99f02c4f] sjc [info] "work_mem": "4MB"
2022-10-10T02:34:50.019 app[99f02c4f] sjc [info] },
2022-10-10T02:34:50.019 app[99f02c4f] sjc [info] "maxStandbysPerSender": 50,
2022-10-10T02:34:50.019 app[99f02c4f] sjc [info] "deadKeeperRemovalInterval": "1h"
2022-10-10T02:34:50.019 app[99f02c4f] sjc [info] }
2022-10-10T02:34:50.019 app[99f02c4f] sjc [info] generated new config
2022-10-10T02:34:50.020 app[99f02c4f] sjc [info] keeper | Running...
2022-10-10T02:34:50.041 app[99f02c4f] sjc [info] sentinel | Running...
2022-10-10T02:34:50.042 app[99f02c4f] sjc [info] proxy | Running...
2022-10-10T02:34:50.045 app[99f02c4f] sjc [info] exporter | Running...
2022-10-10T02:34:50.134 app[99f02c4f] sjc [info] exporter | INFO[0000] Starting Server: :9187 source="postgres_exporter.go:1837"
2022-10-10T02:34:50.151 app[99f02c4f] sjc [info] proxy | [WARNING] 282/023450 (553) : parsing [/fly/haproxy.cfg:38]: Missing LF on last line, file might have been truncated at position 96. This will become a hard error in HAProxy 2.3.
2022-10-10T02:34:50.168 app[99f02c4f] sjc [info] proxy | [NOTICE] 282/023450 (553) : New worker #1 (569) forked
2022-10-10T02:34:51.020 app[99f02c4f] sjc [info] checking stolon status
2022-10-10T02:34:52.238 app[99f02c4f] sjc [info] proxy | [WARNING] 282/023452 (569) : bk_db/pg1 changed its IP from (none) to fdaa:0:xx:a7b:xx:0:b268:2 by flydns/dns1.
2022-10-10T02:34:52.238 app[99f02c4f] sjc [info] proxy | [WARNING] 282/023452 (569) : Server bk_db/pg1 ('lax.xxxxx.internal') is UP/READY (resolves again).
2022-10-10T02:34:52.238 app[99f02c4f] sjc [info] proxy | [WARNING] 282/023452 (569) : Server bk_db/pg1 administratively READY thanks to valid DNS answer.
2022-10-10T02:34:52.593 app[99f02c4f] sjc [info] panic: error checking stolon status: cannot get cluster data: invalid character '\x02' in string literal
2022-10-10T02:34:52.593 app[99f02c4f] sjc [info] : exit status 1
2022-10-10T02:34:52.594 app[99f02c4f] sjc [info] goroutine 9 [running]:
2022-10-10T02:34:52.594 app[99f02c4f] sjc [info] main.main.func2(0xc0000ce000, 0xc000082a80)
2022-10-10T02:34:52.594 app[99f02c4f] sjc [info] /go/src/github.com/fly-examples/postgres-ha/cmd/start/main.go:81 +0x72c
2022-10-10T02:34:52.594 app[99f02c4f] sjc [info] created by main.main
2022-10-10T02:34:52.594 app[99f02c4f] sjc [info] /go/src/github.com/fly-examples/postgres-ha/cmd/start/main.go:72 +0x43b