Postgres machine error: EXT4-fs (vda): unable to read superblock

Hi,
After 7 months without issues, one of my Postgres databases died today and won’t start again.

The Machine is in the AMS region:

flyio/postgres-flex:15.3@sha256:44b698752cf113110f2fa72443d7fe452b48228aafbb0d93045ef1e3282360a6

shared-cpu-1x@256MB

The error I see:

2025-01-21T17:50:57.476 app[2866ed7be01728] ams [info] 2025-01-21T17:50:57.476735740 [01J1EFCDG2747FAQEDEK31EFEK:main] Running Firecracker v1.7.0

2025-01-21T17:50:58.130 app[2866ed7be01728] ams [info] INFO Starting init (commit: ad092ccf)…

2025-01-21T17:50:58.141 app[2866ed7be01728] ams [info] [ 0.594263] blk_update_request: I/O error, dev vda, sector 2 op 0x0:(READ) flags 0x1000 phys_seg 1 prio class 0

2025-01-21T17:50:58.142 app[2866ed7be01728] ams [info] [ 0.595538] EXT4-fs (vda): unable to read superblock

2025-01-21T17:50:58.143 app[2866ed7be01728] ams [info] ERROR Error: couldn’t mount /dev/vda onto /lower/dev/vda, because: EIO: I/O error

2025-01-21T17:50:58.143 app[2866ed7be01728] ams [info] [ 0.597229] reboot: Restarting system

2025-01-21T17:50:58.202 app[2866ed7be01728] ams [warn] Virtual machine exited abruptly

2025-01-21T17:50:58.286 runner[2866ed7be01728] ams [info] machine exited with exit code 0, not restarting

Any idea what might be the issue?

Thank you

Hi… This looks like low-level disk corruption on the root partition.

Are there multiple Machines in this PG cluster?

No, only one.
Should I delete the machine and restore the db backup?

It looks like your data volume may have survived, so it’s probably worth trying the volume-forking approach first:

https://community.fly.io/t/urgency-problems-with-postgres-the-database-is-not-responding/19926/2

(Tweak --initial-cluster-size as desired, of course.)

Hope this helps!

I’ve been trying to track down these errors recently. I’m not sure if it’s possible for users to fix them directly, but I fixed this instance from the host. Your database is running again now. Sorry for the trouble.

Don’t worry, and thank you. It’s working 🙂

Same problem here. @Lillian, is there a way to self-serve this fix, or to have it applied automatically?

2025-01-22T15:35:44Z health[e2860e6be63d86] sin [error]Health check on port 3000 has failed. Your app is not responding properly. Services exposed on ports [80, 443] will have intermittent failures until the health check passes.

2025-01-22T15:35:45Z app[e2860e6be63d86] sin [info]2025-01-22T15:35:45.128641039 [01HZQMWMDB8HZ7TW56Z7PYRQZ3:main] Running Firecracker v1.7.0

2025-01-22T15:35:45Z app[e2860e6be63d86] sin [info] INFO Starting init (commit: a6222593)...

2025-01-22T15:35:45Z app[e2860e6be63d86] sin [info][ 0.346624] blk_update_request: I/O error, dev vda, sector 2 op 0x0:(READ) flags 0x1000 phys_seg 1 prio class 0

2025-01-22T15:35:45Z app[e2860e6be63d86] sin [info][ 0.347623] EXT4-fs (vda): unable to read superblock

2025-01-22T15:35:45Z app[e2860e6be63d86] sin [info]ERROR Error: couldn't mount /dev/vda onto /lower/dev/vda, because: EIO: I/O error

2025-01-22T15:35:45Z app[e2860e6be63d86] sin [info][ 0.349152] reboot: Restarting system

2025-01-22T15:35:45Z app[e2860e6be63d86] sin [warn]Virtual machine exited abruptly

2025-01-22T15:35:45Z runner[e2860e6be63d86] sin [info]machine exited with exit code 0, not restarting

2025-01-22T15:38:03Z proxy[e2860e6be63d86] sin [info]Starting machine

2025-01-22T15:38:03Z app[e2860e6be63d86] sin [info]2025-01-22T15:38:03.734337892 [01HZQMWMDB8HZ7TW56Z7PYRQZ3:main] Running Firecracker v1.7.0

2025-01-22T15:38:04Z app[e2860e6be63d86] sin [info] INFO Starting init (commit: a6222593)...

2025-01-22T15:38:04Z app[e2860e6be63d86] sin [info][ 0.366729] blk_update_request: I/O error, dev vda, sector 2 op 0x0:(READ) flags 0x1000 phys_seg 1 prio class 0

2025-01-22T15:38:04Z app[e2860e6be63d86] sin [info][ 0.367684] EXT4-fs (vda): unable to read superblock

2025-01-22T15:38:04Z app[e2860e6be63d86] sin [info]ERROR Error: couldn't mount /dev/vda onto /lower/dev/vda, because: EIO: I/O error

2025-01-22T15:38:04Z app[e2860e6be63d86] sin [info][ 0.369257] reboot: Restarting system

2025-01-22T15:38:04Z app[e2860e6be63d86] sin [warn]Virtual machine exited abruptly

2025-01-22T15:38:04Z runner[e2860e6be63d86] sin [info]machine exited with exit code 0, not restarting

2025-01-22T15:38:08Z proxy[e2860e6be63d86] sin [error][PM03] could not wake up machine due to a timeout requesting from the machines API

2025-01-22T15:38:08Z proxy[e2860e6be63d86] sin [info]Starting machine

2025-01-22T15:38:08Z proxy[e2860e6be63d86] sin [error][PM01] machines API returned an error: "machine still attempting to start"

2025-01-22T15:38:09Z proxy[e2860e6be63d86] sin [info]Starting machine

2025-01-22T15:38:09Z app[e2860e6be63d86] sin [info]2025-01-22T15:38:09.172783540 [01HZQMWMDB8HZ7TW56Z7PYRQZ3:main] Running Firecracker v1.7.0

2025-01-22T15:38:09Z app[e2860e6be63d86] sin [info] INFO Starting init (commit: a6222593)...

2025-01-22T15:38:09Z app[e2860e6be63d86] sin [info][ 0.362223] blk_update_request: I/O error, dev vda, sector 2 op 0x0:(READ) flags 0x1000 phys_seg 1 prio class 0

2025-01-22T15:38:09Z app[e2860e6be63d86] sin [info][ 0.373796] EXT4-fs (vda): unable to read superblock

2025-01-22T15:38:09Z app[e2860e6be63d86] sin [info]ERROR Error: couldn't mount /dev/vda onto /lower/dev/vda, because: EIO: I/O error

2025-01-22T15:38:09Z app[e2860e6be63d86] sin [info][ 0.376588] reboot: Restarting system

2025-01-22T15:38:09Z app[e2860e6be63d86] sin [warn]Virtual machine exited abruptly

2025-01-22T15:38:14Z proxy[e2860e6be63d86] sin [error][PM03] could not wake up machine due to a timeout requesting from the machines API

2025-01-22T15:38:15Z proxy[e2860e6be63d86] sin [info]Starting machine

2025-01-22T15:38:15Z proxy[e2860e6be63d86] sin [error][PM01] machines API returned an error: "machine still attempting to start"

2025-01-22T15:38:17Z proxy[e2860e6be63d86] sin [info]Starting machine

2025-01-22T15:38:17Z proxy[e2860e6be63d86] sin [error][PM01] machines API returned an error: "machine still attempting to start"

2025-01-22T15:38:19Z proxy[e2860e6be63d86] sin [info]Starting machine

2025-01-22T15:38:19Z proxy[e2860e6be63d86] sin [error][PM01] machines API returned an error: "machine still attempting to start"

2025-01-22T15:38:21Z proxy[e2860e6be63d86] sin [info]Starting machine

2025-01-22T15:38:21Z proxy[e2860e6be63d86] sin [error][PM01] machines API returned an error: "machine still attempting to start"

2025-01-22T15:38:23Z proxy[e2860e6be63d86] sin [info]Starting machine

2025-01-22T15:38:23Z proxy[e2860e6be63d86] sin [error][PM01] machines API returned an error: "machine still attempting to start"

2025-01-22T15:38:25Z proxy[e2860e6be63d86] sin [info]Starting machine

2025-01-22T15:38:25Z proxy[e2860e6be63d86] sin [error][PM01] machines API returned an error: "machine still attempting to start"

2025-01-22T15:38:27Z proxy[e2860e6be63d86] sin [info]Starting machine

2025-01-22T15:38:27Z proxy[e2860e6be63d86] sin [error][PM01] machines API returned an error: "machine still attempting to start"

2025-01-22T15:38:29Z proxy[e2860e6be63d86] bom [error][PR04] could not find a good candidate within 21 attempts at load balancing

Same problem here in ams:

2025-01-22T21:35:58Z app[7815de4b279d28] ams [info]2025-01-22T21:35:58.723124036 [01JCX21J4AR5FG9HA9B2VZGR2N:main] Running Firecracker v1.7.0
2025-01-22T21:35:59Z app[7815de4b279d28] ams [info] INFO Starting init (commit: 74e923d)...
2025-01-22T21:35:59Z app[7815de4b279d28] ams [info][    0.635262] blk_update_request: I/O error, dev vda, sector 2 op 0x0:(READ) flags 0x1000 phys_seg 1 prio class 0
2025-01-22T21:35:59Z app[7815de4b279d28] ams [info][    0.636392] EXT4-fs (vda): unable to read superblock
2025-01-22T21:35:59Z app[7815de4b279d28] ams [info]ERROR Error: couldn't mount /dev/vda onto /lower/dev/vda, because: EIO: I/O error
2025-01-22T21:35:59Z app[7815de4b279d28] ams [info][    0.637896] reboot: Restarting system
2025-01-22T21:35:59Z app[7815de4b279d28] ams [warn]Virtual machine exited abruptly

Edit: fixed by someone from support. Thx!

It happened again this evening:

2025-01-24T20:39:02.684 app[2866ed7be01728] ams [info] 2025-01-24T20:39:02.684838956 [01JJ5CWF7YCD6DQF8PNGMNVMFD:main] Running Firecracker v1.7.0

2025-01-24T20:39:03.393 app[2866ed7be01728] ams [info] INFO Starting init (commit: 9551f38)…

2025-01-24T20:39:03.395 app[2866ed7be01728] ams [info] [ 0.631174] blk_update_request: I/O error, dev vda, sector 2 op 0x0:(READ) flags 0x1000 phys_seg 1 prio class 0

2025-01-24T20:39:03.395 app[2866ed7be01728] ams [info] [ 0.632115] EXT4-fs (vda): unable to read superblock

2025-01-24T20:39:03.396 app[2866ed7be01728] ams [info] ERROR Error: couldn’t mount /dev/vda onto /lower/dev/vda, because: EIO: I/O error

2025-01-24T20:39:03.396 app[2866ed7be01728] ams [info] [ 0.633363] reboot: Restarting system

2025-01-24T20:39:03.453 app[2866ed7be01728] ams [warn] Virtual machine exited abruptly

2025-01-24T20:39:03.525 runner[2866ed7be01728] ams [info] machine exited with exit code 0, not restarting

The issue is occurring again. Is anyone from support able to fix this? I have also emailed my support address.

I am currently experiencing the same issue. Is there any way to fix it without help from support?

I had to recover from backups that I store outside of Fly. Recovering from the Fly volume didn’t work either.

Following these steps, I was able to recover from a backup on fly.io and take a dump on my own machine.
I wasn’t able to reattach the application, but it seems doable.

Hi folks,

We are still investigating this issue, which affects a small number of Machines whose rootfs (/dev/vda) is unavailable on a few hosts. It seems that the manual fix we applied last week was reverted on at least the ams host. I applied another fix, which should once again unblock those instances (a few dozen in total).

We’re hopeful the fix will be permanent this time, but if you continue to have trouble, you should be able to destroy the misbehaving Machine and launch a new one as well.
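
For anyone trying to tell whether a Machine is hitting this same failure rather than an unrelated boot problem, the three messages that recur in every log excerpt in this thread can be matched mechanically. The following is only a rough sketch: the function name and the regular expressions are illustrative assumptions derived from the log lines quoted above, not an official Fly.io tool or log format guarantee.

```python
import re
from collections import defaultdict

# The three messages that appear together in every failing boot in this thread.
# These patterns are assumptions based on the quoted logs, not a stable format.
SIGNATURE = {
    "io_error": re.compile(r"blk_update_request: I/O error, dev vda, sector \d+"),
    "superblock": re.compile(r"EXT4-fs \(vda\): unable to read superblock"),
    # Matched without the apostrophe in "couldn't", which varies between pastes.
    "mount_failed": re.compile(r"mount /dev/vda onto /lower/dev/vda, because: EIO"),
}

# Machine ID as it appears in the log prefix, e.g. "app[2866ed7be01728]".
MACHINE_ID = re.compile(r"\bapp\[([0-9a-f]+)\]")


def machines_with_unreadable_rootfs(log_text: str) -> list[str]:
    """Return sorted IDs of Machines whose logs contain all three messages."""
    seen = defaultdict(set)
    for line in log_text.splitlines():
        match = MACHINE_ID.search(line)
        if match is None:
            continue  # not an app log line (e.g. proxy or runner output)
        for name, pattern in SIGNATURE.items():
            if pattern.search(line):
                seen[match.group(1)].add(name)
    return sorted(mid for mid, names in seen.items() if names == set(SIGNATURE))
```

Passing saved log output to `machines_with_unreadable_rootfs` returns the IDs of Machines that show the full pattern. A match only confirms the symptom; as noted above, the underlying fix was applied on the host side.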
