Hey, this morning it seems most of the filesystem in one of my sprites is broken.
I can no longer run sprite console or most other commands:
sprite console
mkdir: Input/output error
sprite x ls
/usr/bin/ls: general io error: Input/output error (os error 5)
sprite x pwd
/home/sprite
What does work is, for example:
sprite exec -- sh -c 'for f in /.sprite/checkpoints/
… but I can’t access the “active” checkpoint nor the / path.
Sorry for the AI message here, but I tried to let it help summarize the issue:
Sandbox has a broken overlay. /mnt is an empty tmpfs — the block devices backing the overlay’s lower and upper layers aren’t mounted, so all /usr/bin binaries and my user files fail with Input/output error (os error 5).
Shell builtins still work and old checkpoints are readable, but my live data is in the upper layer (/mnt/user-data/root-upper/upper) which is inaccessible.
Does anyone know a way to recover? I unfortunately don’t have any sprite checkpoints recent enough to help.
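For anyone wanting to double-check the mount diagnosis above while only shell builtins work, here is a minimal sketch. It assumes /proc/mounts is readable inside the sprite (not confirmed in this thread), and the function name mounts_under is made up for illustration. It uses only builtins (read, case, echo), so it does not depend on anything in /usr/bin.

```shell
# Hypothetical helper: list mount entries at or below a path prefix.
# Uses only shell builtins, so it can run when /usr/bin is unreadable.
# Assumes the standard /proc/mounts layout: device, mount point, type, options, ...
mounts_under() {
  prefix=$1
  file=${2:-/proc/mounts}   # second argument lets you point at a saved copy
  while read -r dev mnt fstype opts rest; do
    case $mnt in
      "$prefix"|"$prefix"/*) echo "$mnt $fstype $dev" ;;
    esac
  done < "$file"
}
```

Running mounts_under /mnt and getting back only a tmpfs line for /mnt itself would be consistent with the summary above, where the block devices behind the overlay layers are not mounted.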
Hey, it definitely sounds like something is borked. Do you mind providing the sprite name that is having this issue? That way we can take a closer look at what is going on!
As an aside, I’m curious to know if this is some sort of auth issue just not being surfaced properly… if you run fly auth whoami, does it come back with the correct user? It might even help to do a fly auth logout and fly auth login, then retry the sprite console command.
It’s named hermes. I don’t believe it to be an auth issue, I’ve reauthenticated both CLIs and can also work fine in other sprites. Many thanks for looking into it!
Thank you for sharing that. I took a deeper look at your sprite. I upgraded/restarted it (let me know if this helps at all). Looking at the logs, when your sprite went to sleep around 11:52 UTC, a write to your underlying storage was mid-flight, and it looks like the freeze caught it at a really bad moment. So when you woke the sprite back up, the storage layer couldn’t read that chunk from either place and started throwing I/O errors on basically anything that touched it. I know checkpoints aren’t a great option for you, but if I’m reading things correctly, an auto checkpoint got created at 17:52 UTC after the corruption had already happened, so it might be tainted.
By any chance are you able to cat any of the files you care about to see if they come up? Anything that reads without an error lives on a different chunk and should still be accessible.
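Since cat itself lives in /usr/bin and may fail with the same I/O error, a builtin-only probe is one way to try this. The sketch below is an illustration, not an official recovery procedure: probe_paths is a made-up name, and it only checks that a file can be opened or a directory entered, not that every chunk of the file's data reads back cleanly. Touching a damaged path may also hang the session, so it is probably worth probing a few paths at a time.

```shell
# Hypothetical helper: report which paths can be opened, using shell builtins only.
# Prints "OK <path>" or "FAIL <path>" for each argument.
probe_paths() {
  for f in "$@"; do
    if [ -d "$f" ]; then
      # Entering the directory in a subshell leaves the caller's cwd untouched.
      if ( cd "$f" ) 2>/dev/null; then
        echo "OK $f"
      else
        echo "FAIL $f"
      fi
    # The subshell keeps a failed redirection from aborting the calling shell.
    elif ( : < "$f" ) 2>/dev/null; then
      echo "OK $f"
    else
      echo "FAIL $f"
    fi
  done
}
```

The function definition and a call such as probe_paths followed by the paths you care about could be passed together inside the sprite exec -- sh -c '...' form used earlier in the thread.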
Thanks, but I’m afraid I still haven’t been able to read my files.
Claude said:
Top-level /home/sprite is readable and some regular files are readable, but stat/ls on /home/sprite/notes and /home/sprite/sites returns EIO. After several read-only probes, sprite exec now hangs even for /usr/bin/true, so the environment likely needs another restart or lower-level recovery before we can continue trying individual file reads.
It looks like the filesystem corruption is not limited to a few files: the overlay/storage layer returns EIO for specific directory metadata chunks like /home/sprite/notes and /home/sprite/sites, and after touching those paths the sprite’s exec/session agent appears to wedge entirely.