We have our application packaged in a discoverable GPT partition image and typically boot it using systemd-nspawn. We’re now exploring how to deploy multiple instances of this application in the cloud and are looking for orchestration options. Note that our application does not fit the usual Docker model of one process per container: it is a multi-process application.
I came across Fly.io and found some similarities with what we’re trying to achieve. I’m wondering if your platform could support our use case.
From what I understand, Fly.io extracts the root filesystem from a Docker image and boots it using Firecracker. Would it be possible to start our custom root filesystem in this way?
We’re open to packaging our setup as a Docker container if Fly.io can then extract the root filesystem. But we’re specifically wondering:
Can Fly.io boot a root filesystem using systemd as the init process?
Is there support for running systemd-nspawn within that environment?
Thank you for your support—looking forward to hearing your thoughts.
What you need might be possible but it’s not necessarily supported - it’s uncharted territory and you’d be basically on your own.
You cannot override the init that Fly machines boot with; we inject a custom init process that then spawns a few other processes, notably the Dockerfile’s ENTRYPOINT and CMD entries (though both can be overridden). So if you find a way to make systemd happy as not-pid-1, you may be able to get it running. I can’t tell you if systemd-nspawn would work once you get systemd itself working; I think everything needed should be there as it’s really a VM running on top of a real kernel, not something already-confined like Docker proper.
So if you can package your system/application as a Docker image and ensure the ENTRYPOINT or CMD run systemd, it might just work - but you’ll have to experiment on your side to see if it’s feasible.
For completeness, the way most folks run multiple processes in a single machine is by using something like supervisord.
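To make the multi-process idea concrete, here is a minimal Python sketch of what a supervisor does: start several commands in a fixed order and stop the rest as soon as one of them exits. This is not how supervisord itself is implemented or configured (it also handles restarts, logging and much more); the function name `run_supervised` and the "all stop together" policy are assumptions chosen for illustration.

```python
import subprocess
import time


def run_supervised(commands, poll_interval=0.1):
    """Start each (name, argv) pair in list order and wait.

    As soon as any child exits, the remaining children are terminated.
    Returns a dict mapping each name to that child's exit code.
    """
    procs = {}
    try:
        # List order is the startup order.
        for name, argv in commands:
            procs[name] = subprocess.Popen(argv)
        # Poll until the first child exits.
        while procs and all(p.poll() is None for p in procs.values()):
            time.sleep(poll_interval)
    finally:
        # Stop whatever is still running, then reap every child.
        for p in procs.values():
            if p.poll() is None:
                p.terminate()
        for p in procs.values():
            p.wait()
    return {name: p.returncode for name, p in procs.items()}
```

A script like this would be the single ENTRYPOINT or CMD of the image, with `commands` listing the real services. It does not give you systemd features such as dependency ordering, socket activation or restart policies.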
However, I’m not sure how reliable your application would be using this approach. As @roadmr said, it needs to be tested.
I’d appreciate your feedback because I’m in a similar situation to yours. If you find a working solution, please share your experience as it would be very helpful for others facing the same issue.
Thank you both for your time and response. I think using a tool in a way it was not meant to be used is going to come back to bite me in the future. However, let me take this opportunity to discuss a possible business opportunity with you (Fly), together with a technical proposal.
The reason we use nspawn is that we run the same software stack in the cloud as we deploy to our embedded devices. The cloud version mocks hardware features. Our stack relies on a startup order among its services. I think this kind of testing is quite new in the systemd/embedded community, but I am hoping it will pick up.
From a technical point of view, systemd has a concept of systemd extensions. Currently nspawn doesn’t support systemd extensions, but it has been on the upstream wish list. I don’t believe it is complicated to implement the missing piece. One thought would be that Fly’s initialization and communication processes would be packed in an extension which gets added on top of the customer’s bootable root fs.
Would Fly like to help me go through the process (requesting access to Pilot-based machines, APIs, etc.) so we can give this setup a whirl and hopefully write a blog post on fly.io called “how to start root fs packed as Discoverable GPT” describing its current state?
My understanding is that we have quietly enabled Pilot for everyone. If you use the Machines API and you include containers in your request per the blog post I linked, you should be able to run systemd as pid 1. If this is not the case, let me know and I’ll find out more.
As to the rest, I’ll be honest and say that I don’t know what Discoverable GPT is, why it would be useful, or if it could work on Fly; I’m just responding to say that if running systemd as pid 1 was the roadblock, there is now a way to do exactly that.
Creating an app using the Machines API is kinda complicated. I opened the Fly Machines API docs and felt perplexed.
Is there a simple tutorial or an example of how to get started with Machines API?
I also don’t understand whether we need to do some extra steps for systemd to work as PID 1, or whether it works automatically once we specify containers.
systemd should work as PID 1 automatically if you specify containers.
I plan to look into some usability enhancements for working with containers this week - if you can tell me a bit more about what you are trying to accomplish, that would both be helpful to me and would enable me to provide more helpful answers to your question.
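As a rough illustration of "specifying containers", the sketch below builds a JSON body for a create-machine request (`POST https://api.machines.dev/v1/apps/{app_name}/machines`). It only constructs the payload and sends nothing. The field names under `config.containers` (`name`, `image`, `cmd`) are my reading of the Machines docs and should be checked against the API reference. The helper name `build_machine_request`, the container name, the region and the guest size are placeholders.

```python
def build_machine_request(image, region="ams"):
    """Return a create-machine request body with one container that runs systemd.

    Assumption: containers are listed under config["containers"], each with
    "name", "image" and "cmd" keys. Verify against the Machines API reference.
    """
    return {
        "region": region,  # placeholder region
        "config": {
            "containers": [
                {
                    "name": "app",  # hypothetical container name
                    "image": image,  # must be an image that actually ships systemd
                    # Path taken from the log in this thread; it varies by distro.
                    "cmd": ["/lib/systemd/systemd"],
                }
            ],
            # Placeholder sizing.
            "guest": {"cpu_kind": "shared", "cpus": 1, "memory_mb": 512},
        },
    }
```

Serialize the result with `json.dumps` and send it with an `Authorization: Bearer <token>` header using any HTTP client.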
thanks, I appreciate your help.
Eventually I figured out how to create an app using the API.
I looked here: Machines · Fly Docs and created a simple example app:
For completeness, I tried the below to run systemd as PID 1 but could not get into the machine. Eventually the machine got killed, which I took to be the OOM killer, although the reported status says otherwise (exit_code=0,oom_killed=false,requested_stop=false)
2025-04-23 12:52:42.359 ubuntu container exited with code 1
2025-04-23 12:52:42.359 INFO Scheduling start of ubuntu in 32.512s
2025-04-23 12:52:42.358 INFO restart policy is 'on failure', restarting if possible. current restart count is 8/10
2025-04-23 12:52:42.357 INFO container exited exit status: 1, determining restart based on policy 'OnFailure { count: 10 }' name=ubuntu
2025-04-23 12:52:42.355 executable file `/lib/systemd/systemd` not found in $PATH: No such file or directory
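The last log line says the configured executable does not exist inside the container. One possible cause, and this is an assumption rather than something confirmed in this thread, is that the image does not ship systemd at that path; minimal distro base images often leave it out. A small Python sketch to check an unpacked root filesystem before deploying it (the candidate list and the helper name `find_init` are illustrative):

```python
from pathlib import Path

# Common locations for the systemd binary or an init symlink pointing to it.
CANDIDATES = (
    "lib/systemd/systemd",
    "usr/lib/systemd/systemd",
    "sbin/init",
    "usr/sbin/init",
)


def find_init(rootfs):
    """Return the first candidate path found under rootfs, or None.

    The result is an absolute path as seen from inside the image.
    """
    root = Path(rootfs)
    for rel in CANDIDATES:
        candidate = root / rel
        # is_symlink() also catches absolute symlinks that only resolve
        # correctly inside the image, not on the host.
        if candidate.exists() or candidate.is_symlink():
            return "/" + rel
    return None
```

Run it against a directory produced by, for example, `docker export` piped into `tar -x`. If it returns `None`, the image needs systemd installed before it can be used as the container command.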
It worked for me too, thanks. Looking at the systemd-journald logs, I believe the container section is through OCR. Is the software stack of a machine something like this: Firecracker → Fly host software → Containers?
Bonus question: I am struggling to keep the machines alive. They time out after 5 minutes. I tried setting autostop in the JSON config for the Fly Machines API, but no luck.
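For anyone trying the same thing, here is a minimal sketch of where I would expect the setting to go, assuming `autostop` is a per-service field under `config.services` and that `"off"` is an accepted value (older docs show a boolean). Both are my reading of the Machines docs, not something confirmed in this thread. If the machine config has no services, this does nothing, and it is not established that autostop is what causes the 5-minute timeout.

```python
import copy


def disable_autostop(config):
    """Return a copy of a machine config with autostop turned off on every service.

    Assumption: autostop is set per entry of config["services"].
    """
    updated = copy.deepcopy(config)
    for service in updated.get("services", []):
        service["autostop"] = "off"  # assumed value; a boolean false may also be accepted
    return updated
```

The original dict is left untouched, so the two configs can be diffed before the update is sent.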