VMs booting with outdated system time after suspension - best practices?

I’ve observed that when my Fly.io virtual machines resume after being suspended, the system clock is initially still set to the time at which they were suspended, not the current time. The clock eventually corrects itself after the VM has handled a few requests, but those first requests still see the stale time.

Questions:

  • Is this expected behavior for suspended VMs on Fly.io?
  • What’s the recommended approach to ensure accurate time immediately after a VM resumes from suspension?
  • Are there any built-in mechanisms I should be using to handle this situation?

Has anyone else encountered and solved this issue?
Any recommendations would be greatly appreciated!

:eyes: Subscribed.

We observed the same and had to switch back to "stop" because the clock skew caused crypto signature failures for all requests serviced before the system clock resynced.

Anyone have any recommendations?
We are running a Django app on Ubuntu focal.

We were thinking of having our machine listen for the suspend/resume signal on Fly and trigger a clock update before it handles any requests, though we’re not sure what that signal would be.
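Without knowing the signal, one way to approximate the "check the clock before handling requests" idea is to compare the system clock against an NTP server. This is only a minimal sketch, assuming outbound UDP to an NTP server is allowed from the machine; the server name `pool.ntp.org`, the function names, and the 1-second threshold are our own illustrative choices, not anything Fly.io documents.

```python
import socket
import struct
import time

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_EPOCH_OFFSET = 2208988800


def parse_transmit_time(packet: bytes) -> float:
    """Return the server transmit timestamp (Unix seconds) from an NTP reply."""
    if len(packet) < 48:
        raise ValueError("NTP reply is shorter than 48 bytes")
    # The transmit timestamp is the last 8 bytes of the 48-byte header:
    # 32-bit seconds followed by a 32-bit binary fraction.
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32


def clock_offset(server: str = "pool.ntp.org", timeout: float = 2.0) -> float:
    """Approximate (server time - local time) in seconds with one SNTP query.

    The default server is an assumption; use whatever your setup prefers.
    """
    request = b"\x1b" + 47 * b"\0"  # LI=0, version 3, mode 3 (client)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sent = time.time()
        sock.sendto(request, (server, 123))
        packet, _ = sock.recvfrom(512)
        received = time.time()
    # Compare the server time with the midpoint of our send/receive times.
    return parse_transmit_time(packet) - (sent + received) / 2


def clock_is_stale(offset: float, threshold: float = 1.0) -> bool:
    """True when the local clock differs from the server by more than threshold."""
    return abs(offset) > threshold
```

Something like `clock_is_stale(clock_offset())` could then gate a readiness check or a piece of middleware. Note that this only detects the skew; actually stepping the clock needs a privileged process (CAP_SYS_TIME) or a time-sync daemon.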

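For comparison, Docker uses the cgroup freezer for pause. Unlike SIGSTOP/SIGCONT, the freezer is not visible to the paused processes, so there does not seem to be a signal to catch there, see: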

https://www.kernel.org/doc/Documentation/cgroup-v1/freezer-subsystem.txt

Is there some resume notification in Firecracker(?) that someone could point to?
