PG primary machine keeps turning into a zombie in SIN

Hi Folks!

It has been 2 months of peaceful and stable operation in the SIN region since my PG machine last turned into a pumpkin and my app failed to load. Here we go again.

Please have a look at the recent logs.

2025-03-04T18:20:15.670 app[178103db9120d8] sin [info] 2025-03-04T18:20:15.670637105 [01JM1DW6T6M85M0FAQR0EYJEVH:main] Running Firecracker v1.7.0
2025-03-04T18:20:16.598 app[178103db9120d8] sin [info] INFO Starting init (commit: 67f51b8b)...
2025-03-04T18:20:16.600 app[178103db9120d8] sin [info] [ 0.836254] I/O error, dev vda, sector 2 op 0x0:(READ) flags 0x1000 phys_seg 1 prio class 0
2025-03-04T18:20:16.600 app[178103db9120d8] sin [info] [ 0.837132] EXT4-fs (vda): unable to read superblock
2025-03-04T18:20:16.601 app[178103db9120d8] sin [info] ERROR Error: couldn't mount /dev/vda onto /lower/dev/vda, because: EIO: I/O error
2025-03-04T18:20:16.602 app[178103db9120d8] sin [info] [ 0.838866] reboot: Restarting system
2025-03-04T18:20:16.677 app[178103db9120d8] sin [warn] Virtual machine exited abruptly
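
For anyone triaging a similar boot loop, the lines that matter in the excerpt above are the block-device read failure, the superblock failure, and the mount error. A minimal Python sketch for picking those out of saved log output, assuming the logs have been saved as plain text; the function name and pattern list are illustrative and not part of any Fly.io tooling:

```python
import re

# Patterns modelled on the log excerpt above; extend as needed.
DISK_FAILURE_PATTERNS = [
    re.compile(r"I/O error, dev \w+, sector \d+"),
    re.compile(r"EXT4-fs \(\w+\): unable to read superblock"),
    re.compile(r"couldn't mount \S+ onto"),
]


def find_disk_failures(log_text):
    """Return (line_number, line) pairs for lines matching a disk-failure pattern."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in DISK_FAILURE_PATTERNS):
            hits.append((number, line.strip()))
    return hits
```

A non-empty result means the machine failed while reading or mounting its volume, before Postgres itself ever started.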

I previously fixed exactly the same problem by following the instructions someone provided on the forum.

Please help me understand why my PG keeps failing over and over again, and how I can make my DB more robust. I have 3 VMs in total, in 3 different regions. Unfortunately, as I understood later, the other machines are of no use as long as their role is “Replica”…

This sounds like a filesystem error. Do you get this on all three of your volumes?

I am wondering if you might need to do some FS repair, with fsck or the like.

No. This problem is only on the primary machine in SIN; the other 2 replicas, in different regions, have no problems.

Well, that is good news. Are you taking off-site backups, in case it goes awry?
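
On the off-site backup point: one common approach is a scheduled `pg_dump` run from a machine outside the cluster, with the dump file then copied somewhere independent of the volume. A minimal Python sketch, assuming `pg_dump` is installed locally and the database is reachable through a connection URL; the URL, output directory, and function names are placeholders, not anything taken from this thread:

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def build_dump_command(database_url, output_dir, now=None):
    """Build a pg_dump command that writes a timestamped custom-format dump."""
    now = now or datetime.now(timezone.utc)
    target = Path(output_dir) / f"backup-{now:%Y%m%dT%H%M%SZ}.dump"
    command = [
        "pg_dump",
        "--format=custom",  # archive format, restorable with pg_restore
        f"--file={target}",
        f"--dbname={database_url}",  # placeholder connection URL
    ]
    return command, target


def run_backup(database_url, output_dir):
    """Run pg_dump and return the dump file path; raises if pg_dump fails."""
    command, target = build_dump_command(database_url, output_dir)
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(command, check=True)
    return target
```

The resulting file can be restored with `pg_restore`. Where to copy it afterwards (object storage, another provider) is a separate decision.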

Ah, I just spotted this is probably a duplicate report:

No. I had to use the image generated and stored by fly.io as my backup.

Any suggestions on how to fix or improve my DB?
