Emergency maintenance / corrupt volumes

After the “emergency maintenance”, none of my machines is able to restart, apparently due to corrupted volumes. What’s recommended in terms of getting these back online without losing data?

They are all single-machine deployments, each with one persistent volume.

2024-09-28T13:38:53.586 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:38:54.187 app[6830139f723248] lhr [info] [ 0.268739] PCI: Fatal: No config space access function found

2024-09-28T13:38:54.530 app[6830139f723248] lhr [info] INFO Starting init (commit: 20f21dc5f)…

2024-09-28T13:38:54.555 app[6830139f723248] lhr [info] INFO Mounting /dev/vdd at /app/data w/ uid: 0, gid: 0 and chmod 0755

2024-09-28T13:38:54.562 app[6830139f723248] lhr [info] [ 0.639724] EXT4-fs (vdd): VFS: Found ext4 filesystem with invalid superblock checksum. Run e2fsck?

2024-09-28T13:38:54.563 app[6830139f723248] lhr [info] ERROR Error: couldn’t mount /dev/vdd onto /app/data, because: EBADMSG: Not a data message

2024-09-28T13:38:54.564 app[6830139f723248] lhr [info] [ 0.642225] reboot: Restarting system

2024-09-28T13:38:54.632 app[6830139f723248] lhr [warn] Virtual machine exited abruptly

2024-09-28T13:40:13.124 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:13.127 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:13.129 proxy[6830139f723248] lhr [error] [PM01] machines API returned an error: “machine still attempting to start”

2024-09-28T13:40:13.193 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:13.196 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:13.196 proxy[6830139f723248] lhr [error] [PM01] machines API returned an error: “rate limit exceeded”

2024-09-28T13:40:13.453 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:13.454 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:13.454 proxy[6830139f723248] lhr [error] [PM01] machines API returned an error: “rate limit exceeded”

2024-09-28T13:40:14.404 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:14.405 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:14.406 proxy[6830139f723248] lhr [error] [PM01] machines API returned an error: “rate limit exceeded”

2024-09-28T13:40:16.651 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:16.653 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:16.654 proxy[6830139f723248] lhr [error] [PM01] machines API returned an error: “machine still attempting to start”

2024-09-28T13:40:18.663 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:18.664 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:18.665 proxy[6830139f723248] lhr [error] [PM01] machines API returned an error: “machine still attempting to start”

2024-09-28T13:40:20.675 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:20.676 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:20.678 proxy[6830139f723248] lhr [error] [PM01] machines API returned an error: “machine still attempting to start”

2024-09-28T13:40:22.687 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:22.689 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:22.690 proxy[6830139f723248] lhr [error] [PM01] machines API returned an error: “machine still attempting to start”

2024-09-28T13:40:25.361 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:25.363 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:25.364 proxy[6830139f723248] lhr [error] [PM01] machines API returned an error: “machine still attempting to start”

2024-09-28T13:40:25.423 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:25.425 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:25.425 proxy[6830139f723248] lhr [error] [PM01] machines API returned an error: “rate limit exceeded”

2024-09-28T13:40:25.648 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:25.648 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:25.648 proxy[6830139f723248] lhr [error] [PM01] machines API returned an error: “rate limit exceeded”

2024-09-28T13:40:26.826 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:26.827 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:26.828 proxy[6830139f723248] lhr [error] [PM01] machines API returned an error: “rate limit exceeded”

2024-09-28T13:40:28.836 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:28.838 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:28.839 proxy[6830139f723248] lhr [error] [PM01] machines API returned an error: “machine still attempting to start”

2024-09-28T13:40:30.849 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:30.851 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:30.852 proxy[6830139f723248] lhr [error] [PM01] machines API returned an error: “machine still attempting to start”

2024-09-28T13:40:32.862 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:32.864 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:32.865 proxy[6830139f723248] lhr [error] [PM01] machines API returned an error: “machine still attempting to start”

2024-09-28T13:40:34.874 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:34.876 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:40:34.878 proxy[6830139f723248] lhr [error] [PM01] machines API returned an error: “machine still attempting to start”

2024-09-28T13:43:58.859 runner[6830139f723248] lhr [info] machine exited with exit code 0, not restarting

2024-09-28T13:44:11.380 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:44:11.996 app[6830139f723248] lhr [info] [ 0.264862] PCI: Fatal: No config space access function found

2024-09-28T13:44:12.343 app[6830139f723248] lhr [info] INFO Starting init (commit: 20f21dc5f)…

2024-09-28T13:44:12.373 app[6830139f723248] lhr [info] INFO Mounting /dev/vdd at /app/data w/ uid: 0, gid: 0 and chmod 0755

2024-09-28T13:44:12.380 app[6830139f723248] lhr [info] [ 0.645511] EXT4-fs (vdd): VFS: Found ext4 filesystem with invalid superblock checksum. Run e2fsck?

2024-09-28T13:44:12.382 app[6830139f723248] lhr [info] ERROR Error: couldn’t mount /dev/vdd onto /app/data, because: EBADMSG: Not a data message

2024-09-28T13:44:12.382 app[6830139f723248] lhr [info] [ 0.648171] reboot: Restarting system

2024-09-28T13:44:12.460 app[6830139f723248] lhr [warn] Virtual machine exited abruptly

2024-09-28T13:44:14.438 runner[6830139f723248] lhr [info] machine exited with exit code 0, not restarting

2024-09-28T13:44:16.380 proxy[6830139f723248] lhr [error] [PM03] could not wake up machine due to a timeout requesting from the machines API

2024-09-28T13:44:16.454 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:44:16.456 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:44:16.457 proxy[6830139f723248] lhr [error] [PM01] machines API returned an error: “machine still attempting to start”

2024-09-28T13:44:16.664 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:44:16.666 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:44:16.666 proxy[6830139f723248] lhr [error] [PM01] machines API returned an error: “rate limit exceeded”

2024-09-28T13:44:17.677 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:44:18.323 app[6830139f723248] lhr [info] [ 0.267904] PCI: Fatal: No config space access function found

2024-09-28T13:44:18.659 app[6830139f723248] lhr [info] INFO Starting init (commit: 20f21dc5f)…

2024-09-28T13:44:18.683 app[6830139f723248] lhr [info] INFO Mounting /dev/vdd at /app/data w/ uid: 0, gid: 0 and chmod 0755

2024-09-28T13:44:18.690 app[6830139f723248] lhr [info] [ 0.631214] EXT4-fs (vdd): VFS: Found ext4 filesystem with invalid superblock checksum. Run e2fsck?

2024-09-28T13:44:18.691 app[6830139f723248] lhr [info] ERROR Error: couldn’t mount /dev/vdd onto /app/data, because: EBADMSG: Not a data message

2024-09-28T13:44:18.692 app[6830139f723248] lhr [info] [ 0.633659] reboot: Restarting system

2024-09-28T13:44:18.755 app[6830139f723248] lhr [warn] Virtual machine exited abruptly

2024-09-28T13:44:20.335 runner[6830139f723248] lhr [info] machine exited with exit code 0, not restarting

2024-09-28T13:45:21.719 proxy[6830139f723248] lhr [info] Starting machine

2024-09-28T13:45:22.331 app[6830139f723248] lhr [info] [ 0.267546] PCI: Fatal: No config space access function found

2024-09-28T13:45:22.675 app[6830139f723248] lhr [info] INFO Starting init (commit: 20f21dc5f)…

2024-09-28T13:45:22.699 app[6830139f723248] lhr [info] INFO Mounting /dev/vdd at /app/data w/ uid: 0, gid: 0 and chmod 0755

2024-09-28T13:45:22.706 app[6830139f723248] lhr [info] [ 0.638241] EXT4-fs (vdd): VFS: Found ext4 filesystem with invalid superblock checksum. Run e2fsck?

2024-09-28T13:45:22.707 app[6830139f723248] lhr [info] ERROR Error: couldn’t mount /dev/vdd onto /app/data, because: EBADMSG: Not a data message

2024-09-28T13:45:22.708 app[6830139f723248] lhr [info] [ 0.640875] reboot: Restarting system

2024-09-28T13:45:22.771 app[6830139f723248] lhr [warn] Virtual machine exited abruptly

2024-09-28T13:45:24.271 runner[6830139f723248] lhr [info] machine exited with exit code 0, not restarting

Can I leverage the recently announced automatic repair feature for this?

Hi… You would likely need a fresh copy of the init system to get that new feature, and it’s not clear what exactly causes it to get updated. Personally, I would start with a simple fly secrets set EXTRA=1 (which should trigger a redeploy), and see if that changes the commit shown in the Starting init line.
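If it helps, here is a small sketch for checking whether the init commit actually changed after the redeploy. It only parses log text that you have already saved (for example, by copying the output before and after); the function names are made up for illustration, and the pattern assumes the banner keeps the same shape as the Starting init (commit: 20f21dc5f) lines above.

```python
import re

# Matches the init banner as it appears in the logs above, e.g.
# "INFO Starting init (commit: 20f21dc5f)…"
# Assumption: the banner format stays the same across init versions.
INIT_COMMIT = re.compile(r"Starting init \(commit: ([0-9a-f]+)\)")


def init_commits(log_text):
    """Return the distinct init commit hashes, in order of first appearance."""
    seen = []
    for match in INIT_COMMIT.finditer(log_text):
        commit = match.group(1)
        if commit not in seen:
            seen.append(commit)
    return seen


def init_changed(before_logs, after_logs):
    """True if the most recent commit in after_logs never appears in before_logs."""
    before = init_commits(before_logs)
    after = init_commits(after_logs)
    return bool(after) and after[-1] not in before
```

If init_changed returns False after the redeploy, the machine is presumably still booting with the old init, so the repair feature would not have had a chance to run.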

Also, it would probably be prudent to copy the last known-good snapshot onto a volume of its own, since those age-off (i.e., auto-delete) after a while.
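A rough sketch of what that could look like, written as a helper that only builds the flyctl command lines rather than running them. The subcommands and flags (fly volumes snapshots list, fly volumes create with --snapshot-id, --region, --size, --app) are as I remember them, so please confirm with fly volumes create --help first. Every ID, name, and size below is a placeholder, not something taken from your account.

```python
import shlex


def snapshot_restore_commands(app, volume_id, snapshot_id, new_name, region, size_gb):
    """Build (but do not run) the flyctl commands for copying a snapshot
    onto a new volume. Flag names are an assumption; verify with --help."""
    # Step 1: list the snapshots of the damaged volume to find a known-good one.
    list_cmd = ["fly", "volumes", "snapshots", "list", volume_id]
    # Step 2: create a separate volume seeded from the chosen snapshot.
    create_cmd = [
        "fly", "volumes", "create", new_name,
        "--snapshot-id", snapshot_id,
        "--region", region,
        "--size", str(size_gb),
        "--app", app,
    ]
    return [list_cmd, create_cmd]


# Placeholder values only; substitute your own app, volume, and snapshot IDs.
for cmd in snapshot_restore_commands(
    app="my-app",
    volume_id="vol_PLACEHOLDER",
    snapshot_id="vs_PLACEHOLDER",
    new_name="data_restored",
    region="lhr",
    size_gb=1,
):
    print(shlex.join(cmd))
```

Printing the commands instead of executing them keeps this safe to run anywhere; the original volume is left untouched either way, which matters if you still want to try e2fsck on it later.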

Hope this helps a little!
