2025-04-18T18:20:32Z [info] Virtual machine has been suspended
2025-04-18T18:54:07Z [info] | DEBUG | Server current UTC time: 2025-04-18T18:20:26.366985+00:00
At 18:20:32Z the machine was suspended, and at 18:54:07Z I logged a system time of 18:20:26, i.e. a few seconds before the suspension.
Repeating the log 2 and 4 seconds later still showed the wrong system time, just advanced by 2 and 4 seconds respectively:
2025-04-18T18:54:09Z [info] | DEBUG | Server current UTC time: 2025-04-18T18:20:28.368798+00:00
2025-04-18T18:54:11Z [info] | DEBUG | Server current UTC time: 2025-04-18T18:20:30.369548+00:00
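To put a number on the drift, here is a rough Python sketch that subtracts the guest's reported time from the log line's own timestamp. It assumes the log timestamps come from the platform and are correct; `parse_utc` and `clock_lag` are illustrative names only, not part of any Fly.io tooling.

```python
from datetime import datetime, timedelta

# (timestamp of the log line, "Server current UTC time" reported by the guest)
SAMPLES = [
    ("2025-04-18T18:54:07Z", "2025-04-18T18:20:26.366985+00:00"),
    ("2025-04-18T18:54:09Z", "2025-04-18T18:20:28.368798+00:00"),
    ("2025-04-18T18:54:11Z", "2025-04-18T18:20:30.369548+00:00"),
]


def parse_utc(stamp: str) -> datetime:
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11 on,
    # so normalise it to an explicit UTC offset first.
    return datetime.fromisoformat(stamp.replace("Z", "+00:00"))


def clock_lag(log_stamp: str, guest_stamp: str) -> timedelta:
    # How far the guest clock is behind the log timestamp.
    return parse_utc(log_stamp) - parse_utc(guest_stamp)


lags = [clock_lag(log, guest) for log, guest in SAMPLES]
for lag in lags:
    print(f"guest clock is behind by {lag.total_seconds():.3f} s")
```

All three samples come out at roughly 2020.6 seconds (about 33 minutes 40 seconds), which is close to the length of the suspension. That suggests the guest clock simply resumed from where it stopped and kept ticking at the normal rate.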
If anyone could help me figure out what can be done to get the right time (waiting N seconds, or some other solution), that would be awesome. Also, I would really like to keep the option
auto_stop_machines = "suspend"
This came up in the original Suspense thread. It’s a known issue, but I’m not sure where Fly is in terms of addressing it. For clock-sensitive systems, I don’t think you can properly use “suspend” here.
In your JWT decoder, perhaps you can make an NTP call to an external time service. This will add latency, but you may decide that is a reasonable trade-off for keeping this machine suspendable.
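For what it's worth, a minimal sketch of that idea in Python, using only the standard library. Everything here is an assumption for illustration: the server name `pool.ntp.org`, the function names, and the 30-second leeway. It does a bare-bones SNTP query, ignores network round-trip delay, and does not handle the NTP timestamp rollover in 2036.

```python
import socket
import struct

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_EPOCH_DELTA = 2_208_988_800


def build_sntp_request() -> bytes:
    # First byte 0x1B = leap indicator 0, version 3, mode 3 (client).
    # The remaining 47 bytes of the 48-byte header stay zero.
    return b"\x1b" + 47 * b"\x00"


def parse_transmit_time(packet: bytes) -> float:
    # The server's transmit timestamp sits in bytes 40..47:
    # 32 bits of seconds followed by 32 bits of fractional seconds.
    if len(packet) < 48:
        raise ValueError("NTP response shorter than 48 bytes")
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_UNIX_EPOCH_DELTA + fraction / 2**32


def fetch_ntp_time(server: str = "pool.ntp.org", timeout: float = 2.0) -> float:
    # Hypothetical helper: one UDP round trip to an external time server.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_sntp_request(), (server, 123))
        packet, _ = sock.recvfrom(512)
    return parse_transmit_time(packet)


def is_token_expired(exp: float, now: float, leeway: float = 30.0) -> bool:
    # Compare the JWT "exp" claim against a trusted "now" instead of the
    # guest's own clock; the leeway value is an arbitrary example.
    return now > exp + leeway
```

In the decoder you would call something like `is_token_expired(claims["exp"], fetch_ntp_time())` and decide what to do if the NTP request times out, for example rejecting the token rather than falling back to the local clock.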
[…] the guests will constantly update time
to stay in sync with host wall-clock. They do so using cheap para-virtualized
calls into kvm ptp instead of actual network NTP traffic.
In my experiments, `phc_ctl /dev/ptp0 get` is pretty accurate even when `date` (the Linux guest’s internal time) is 5+ minutes off.
Sun Apr 20 03:45:46 UTC 2025 # output of `date`
phc_ctl[521.947]: clock time is 1745121052.124684276 or Sun Apr 20 03:50:52 2025
(I didn’t test really long suspends, however.)
Glancing at the Linux kernel docs, it looks like there might be a POSIX API call a person could make, instead of shelling out to a subprocess. (Admittedly that’s not going to be convenient from Node, and the like.)
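As a rough illustration of what that could look like without a subprocess, here is a Python sketch. It assumes the kernel's dynamic POSIX clock convention (a clock id derived from an open file descriptor, as in the `FD_TO_CLOCKID` macro from the kernel's PTP test code), that `/dev/ptp0` exists and is readable by the process, and that read-only access is enough to query the time. The function names are made up for this example.

```python
import os
import time

# Tag used by the kernel to mark a clock id as "derived from a file descriptor".
CLOCKFD = 3


def fd_to_clockid(fd: int) -> int:
    # Python equivalent of the C macro: ((~(clockid_t)(fd)) << 3) | CLOCKFD
    return (~fd << 3) | CLOCKFD


def read_ptp_time(device: str = "/dev/ptp0") -> float:
    # Open the PTP hardware clock device and read it via clock_gettime(2).
    # Returns seconds since the Unix epoch as a float.
    fd = os.open(device, os.O_RDONLY)
    try:
        return time.clock_gettime(fd_to_clockid(fd))
    finally:
        os.close(fd)
```

Comparing `read_ptp_time()` with `time.time()` right after a resume would give an estimate of how far the guest clock has drifted. I have only reasoned about this from the docs, so treat it as untested on a suspended Machine.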
Having said that, I agree with @khuezy overall, that this all still seems a bit fragile for security-sensitive tasks… (It will be super-nice once Fly.io does declare suspend ready for production, though.)
Thank you @khuezy, @halfer and @mayailurus very much for your helpful remarks and suggestions!! Yesterday I implemented and tested the NTP call in the decoding process and I am quite happy with how it turned out to work.
But also your suggestion @mayailurus is definitely something I will have a closer look at.
So thank you all very much once again and have a great day