I have a paid account and three running instances. One instance is stuck and I can’t get to it via console or any other means. I can’t restore a checkpoint or reach it at all.
I can ping the instance, but I can’t reach anything on any port.
I believe an API timeout on a single sprite while the others keep working is a known Fly platform-side issue, not something I can fix from the CLI? The control plane can’t seem to reach that specific sprite’s VM, which is why even reading metadata (checkpoints, sessions) hangs.
What can I do? I need access to this sprite urgently.
Thanks for your reply, I didn’t think to use the sprite session list command!
I am able to list the sessions on my sprite.
However, when I try to kill a session, it doesn’t seem to go away. For example, pid 15 on my sprite won’t die:
$ sprite session -s hermes kill --signal SIGKILL 15
{"type":"signal","message":"Signaling killed to foreground process group 405","signal":"SIGKILL","pid":15}
{"type":"signal","message":"Sending SIGHUP to process group 363","signal":"SIGHUP","pid":15}
{"type":"timeout","message":"Timeout after 10s, sending SIGKILL"}
{"type":"error","message":"Warning: process may not have terminated"}
{"type":"signal","message":"Closing PTY master","signal":"SIGKILL","pid":15}
{"type":"complete","exit_code":-1}
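In case it helps anyone reading similar output, here is a minimal Python sketch that summarizes the JSON lines above. It assumes the events are newline-delimited JSON with the `type`, `signal`, `message`, and `exit_code` fields exactly as pasted here; I don’t know whether that format is documented or stable. The function names are mine and are not part of the sprite CLI.

```python
import json

# Output pasted from the kill attempt above (assumed to be newline-delimited JSON).
SAMPLE = """\
{"type":"signal","message":"Signaling killed to foreground process group 405","signal":"SIGKILL","pid":15}
{"type":"signal","message":"Sending SIGHUP to process group 363","signal":"SIGHUP","pid":15}
{"type":"timeout","message":"Timeout after 10s, sending SIGKILL"}
{"type":"error","message":"Warning: process may not have terminated"}
{"type":"signal","message":"Closing PTY master","signal":"SIGKILL","pid":15}
{"type":"complete","exit_code":-1}
"""


def summarize_kill_events(output: str) -> dict:
    """Collect signals, timeouts, warnings, and the exit code from the event lines."""
    summary = {"signals": [], "timed_out": False, "warnings": [], "exit_code": None}
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not JSON, such as a shell prompt line
        if not isinstance(event, dict):
            continue
        kind = event.get("type")
        if kind == "signal":
            summary["signals"].append(event.get("signal"))
        elif kind == "timeout":
            summary["timed_out"] = True
        elif kind == "error":
            summary["warnings"].append(event.get("message"))
        elif kind == "complete":
            summary["exit_code"] = event.get("exit_code")
    return summary


def kill_probably_failed(summary: dict) -> bool:
    """Heuristic (my assumption): a timeout or any warning means the process may still be alive."""
    return summary["timed_out"] or bool(summary["warnings"])


if __name__ == "__main__":
    result = summarize_kill_events(SAMPLE)
    print(result)
    print("kill probably failed:", kill_probably_failed(result))
```

This only reads the output; it does nothing to the stuck process itself.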
I can attach to some sessions, but none of the shell sessions seem to accept commands. I can type into the session (and even reattach from a different terminal and see my typing from the original attachment) but after hitting “enter” the commands won’t run.
I had to learn the hard way that sessions even exist. Now I keep everything much more under control, and I haven’t had an issue like the one you’re reporting here.
Just the odd disconnect here and there, but those were recoverable on my own.
Hope we get a reboot command in the CLI soon, though.