LHR volume unmountable

Hi, I’ve been getting the following error for the past ~6 hours:

2024-04-17T20:10:17.021 runner[6e82e5efe74568] lhr [info] Pulling container image registry.fly.io/evolutionracingacademy:deployment-01HVPTRBNXCJBARFG2YJX7TD17
2024-04-17T20:10:49.429 runner[6e82e5efe74568] lhr [info] Successfully prepared image registry.fly.io/evolutionracingacademy:deployment-01HVPTRBNXCJBARFG2YJX7TD17 (32.408301292s)
2024-04-17T20:10:49.448 runner[6e82e5efe74568] lhr [info] Setting up volume 'data'
2024-04-17T20:10:49.448 runner[6e82e5efe74568] lhr [info] Opening encrypted volume
2024-04-17T20:10:51.224 runner[6e82e5efe74568] lhr [info] Configuring firecracker
2024-04-17T20:10:51.477 app[6e82e5efe74568] lhr [info] [ 0.037295] PCI: Fatal: No config space access function found
2024-04-17T20:10:51.653 health[6e82e5efe74568] lhr [warn] Health check on port 8080 is in a 'warning' state. Your app may not be responding properly. Services exposed on ports [80, 443] may have intermittent failures until the health check passes.
2024-04-17T20:10:51.653 health[6e82e5efe74568] lhr [warn] Health check on port 8080 is in a 'warning' state. Your app may not be responding properly. Services exposed on ports [80, 443] may have intermittent failures until the health check passes.
2024-04-17T20:10:51.752 app[6e82e5efe74568] lhr [info] INFO Starting init (commit: 65db7f7)...
2024-04-17T20:10:51.770 app[6e82e5efe74568] lhr [info] INFO Mounting /dev/vdd at /data w/ uid: 0, gid: 0 and chmod 0755
2024-04-17T20:10:51.772 app[6e82e5efe74568] lhr [info] [ 0.328546] EXT4-fs (vdd): VFS: Found ext4 filesystem with invalid superblock checksum. Run e2fsck?
2024-04-17T20:10:51.773 app[6e82e5efe74568] lhr [info] ERROR Error: couldn't mount /dev/vdd onto /data, because: EBADMSG: Not a data message
2024-04-17T20:10:51.774 app[6e82e5efe74568] lhr [info] [ 0.331646] reboot: Restarting system
2024-04-17T20:10:51.837 app[6e82e5efe74568] lhr [warn] Virtual machine exited abruptly
2024-04-17T20:10:53.416 runner[6e82e5efe74568] lhr [info] machine exited with exit code 0, not restarting

The issue persists if I scale down to 0 machines and back up to 1. There are currently no status notifications on my dashboard, but there was the following notification (now resolved) from about 6 hours ago:

We are performing emergency maintenance on a host some of your apps instances are running on. Apps may be unavailable until the maintenance is completed.

Can someone help me get my service back online? Thanks.

I think I managed to work around this by:

  1. Creating a new volume from a snapshot of the “broken” one (which confusingly then shows size 0 until it’s attached to a machine).
  2. Scaling the number of app instances to 2, so the new machine picks up the second volume (why can’t I specify which volume a machine should use?).
  3. Pausing the machine with the broken volume.
  4. Scaling back down to 1 app instance (destroying the now paused machine).

I should presumably also destroy the broken volume once I’ve confirmed the new volume is creating snapshots.
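For anyone following along, here is a rough sketch of the four steps above as flyctl commands, built by a small Python helper that only prints them (a dry run). The helper name `workaround_commands` and the placeholder snapshot ID are made up for illustration. I'm assuming "pausing" maps to `fly machine stop`, and that the snapshot ID comes from `fly volumes snapshots list <volume-id>`. Flags can differ between flyctl versions, so check `fly help` before running anything.

```python
import shlex


def workaround_commands(app, region, volume_name, snapshot_id, broken_machine_id):
    """Return the flyctl commands for the four workaround steps, in order.

    Hypothetical helper: it only builds argv lists and never runs anything.
    """
    return [
        # 1. Create a new volume restored from a snapshot of the broken one.
        ["fly", "volumes", "create", volume_name,
         "--snapshot-id", snapshot_id, "--region", region, "--app", app],
        # 2. Scale to 2 so a new machine is created and picks up the new volume.
        ["fly", "scale", "count", "2", "--app", app],
        # 3. Stop the machine attached to the broken volume
        #    (assumption: this is what "pausing" meant).
        ["fly", "machine", "stop", broken_machine_id, "--app", app],
        # 4. Scale back to 1. Which machine gets removed is up to flyctl,
        #    so confirm afterwards with `fly machine list`.
        ["fly", "scale", "count", "1", "--app", app],
    ]


if __name__ == "__main__":
    # The snapshot ID is a placeholder; the other values come from the logs above.
    for argv in workaround_commands(
        app="evolutionracingacademy",
        region="lhr",
        volume_name="data",
        snapshot_id="vs_PLACEHOLDER",
        broken_machine_id="6e82e5efe74568",
    ):
        print(shlex.join(argv))
```

Printing the commands rather than executing them makes it possible to review each one and run it by hand, which seems safer when the only copy of the data lives in a snapshot.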
